TECH 2026-03-08 11 min

From a Single Server to the Cloud: How We Scale legal.org.ua on Google Cloud

Name: LEX
Author: SecondLayer

Cloud Run with autoscaling to zero. Cloud SQL with automatic backups. Qdrant on a dedicated VM. All infrastructure at $280-430/mo with the ability to scale from 10 to 10,000 users without architecture changes.

How we migrated a legal AI platform from Docker Compose on a single server to full-fledged cloud infrastructure with automatic scaling.

Why Migration Became Necessary

legal.org.ua is a platform for lawyers with AI analysis of court decisions, semantic search across legislation, and registries. Under the hood: 3 microservices, PostgreSQL, Redis, Qdrant (vector DB), MinIO, and a React frontend.

The initial infrastructure was a single VPS with Docker Compose. It worked for the MVP but created risks.

We needed infrastructure that scales automatically, has automatic backups, and is affordable for a startup.

Choosing a Cloud: Why Google Cloud

We considered AWS, GCP, and Hetzner Cloud. We chose GCP for several reasons:

Cloud Run was the main argument. It offers serverless containers with pay-per-use pricing and the ability to scale to zero. For a legal platform with daytime traffic (lawyers work 9 to 6), this means we pay almost nothing at night and on weekends.
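To put a rough number on the daytime-traffic argument, here is a minimal sketch that computes what share of the week falls inside office hours. It assumes a 9:00 to 18:00 window on five workdays and is not a billing model: real Cloud Run charges also depend on min instances and request volume.

```python
# Rough share of the week covered by office hours (a sketch, not a billing model).
HOURS_PER_WEEK = 7 * 24


def active_share(start_hour: int, end_hour: int, workdays: int) -> float:
    """Fraction of the week that falls inside the given working window."""
    active_hours = (end_hour - start_hour) * workdays
    return active_hours / HOURS_PER_WEEK


# Assumption: lawyers work 9:00 to 18:00, Monday to Friday.
share = active_share(start_hour=9, end_hour=18, workdays=5)
print(f"Office hours cover {share:.0%} of the week")
```

Under these assumptions, services that scale to zero sit idle for roughly three quarters of the week.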

Cloud SQL — managed PostgreSQL with automatic backups, point-in-time recovery, and one-click vertical scaling.

Region europe-west1 (Belgium): reasonably close to Ukraine, with some of the lowest prices among European GCP regions.

Architecture: Hybrid Approach

The key decision: not everything goes serverless. We split services by their nature:

              Cloudflare (DNS + CDN + WAF)
                        |
              Cloud Load Balancer (HTTPS)
             +----------+----------+
        Cloud Run    Cloud Run    Cloud Run
      (mcp_backend) (mcp_rada) (openreyestr)
             +----------+----------+
        +-------+-------+-------+--------+
     Cloud SQL  Memorystore   GCE VM    GCS
     (PG 15)    (Redis 7)   (Qdrant) (files)

Stateless Services → Cloud Run

Our 4 backend services do not maintain state between requests — ideal candidates for Cloud Run:

Note the min instances: the main backend always has at least 1 instance (cold start is unacceptable for AI chat with SSE streaming), while auxiliary services scale to zero when nobody is using them.

Stateful Services → Managed or VM

PostgreSQL → Cloud SQL (managed, automatic backups, point-in-time recovery)
Redis → Memorystore (managed, sub-millisecond latency)
Qdrant → GCE VM (no managed option, needs persistent storage)
MinIO → GCS (Google Cloud Storage with S3-compatible API)

Networking: Security by Default

All infrastructure lives in a private VPC network. No service has a public IP except the Load Balancer.

VPC: secondlayer-vpc
+-- services-subnet   10.0.0.0/20    (Cloud Run VPC Connector)
+-- data-subnet       10.0.16.0/20   (Cloud SQL, Qdrant VM)
+-- VPC Connector     10.8.0.0/28    (Cloud Run → private network)
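
A quick way to sanity-check an address plan like this is to confirm that the ranges do not overlap and to see how many addresses each provides. The sketch below uses Python's standard ipaddress module; the CIDR blocks come from the layout above, while the dictionary keys are just illustrative labels.

```python
import ipaddress
from itertools import combinations

# CIDR ranges from the VPC layout above; the keys are illustrative labels.
RANGES = {
    "services-subnet": "10.0.0.0/20",
    "data-subnet": "10.0.16.0/20",
    "vpc-connector": "10.8.0.0/28",
}

networks = {name: ipaddress.ip_network(cidr) for name, cidr in RANGES.items()}


def overlapping_pairs(nets):
    """Return pairs of range names whose address space overlaps."""
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(nets.items(), 2)
        if net_a.overlaps(net_b)
    ]


for name, net in networks.items():
    print(f"{name}: {net} -> {net.num_addresses} addresses")
print("overlaps:", overlapping_pairs(networks))
```

The two /20 subnets hold 4,096 addresses each, the /28 connector range holds 16, and none of them overlap.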

Cloud NAT provides outbound internet access for VMs without public IPs. IAP (Identity-Aware Proxy) provides SSH access to VMs via Google authentication instead of an open port 22.

Firewall rules are simple: only internal traffic between subnets, SSH via IAP, and health checks from Google Load Balancer are allowed.

Cloud SQL: Two Instances

We deliberately split PostgreSQL into two instances:

secondlayer-main (db-custom-2-8192) — main backend and parliament data:

Database secondlayer_prod: court decisions, documents, AI analytics, users
Database rada_prod: deputies, bills, voting

openreyestr-db (db-custom-1-4096) — State Register of legal entities:

Pre-imported database with millions of records
Read-heavy workload, rarely written
Separate instance prevents lock contention with the main database

Both instances have:

Private IP only (not accessible from the internet)
Automatic nightly backups at 3:00
Point-in-time recovery
max_connections=500 (sufficient for Cloud Run with connection pooling)
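
Whether 500 connections is enough depends on how many Cloud Run instances can exist at once and how large each instance's pool is. The sketch below estimates the worst case for the main instance. The max-instance counts and pool sizes are hypothetical values chosen for illustration, not figures from our configuration.

```python
from dataclasses import dataclass

MAX_CONNECTIONS = 500  # max_connections from the instance settings above


@dataclass
class Service:
    name: str
    max_instances: int  # hypothetical Cloud Run max instances
    pool_size: int      # hypothetical connections per instance pool


def worst_case_connections(services):
    """Connections used if every instance of every service fills its pool."""
    return sum(s.max_instances * s.pool_size for s in services)


# Illustrative numbers only; substitute the real limits of each service.
main_instance_clients = [
    Service("mcp_backend", max_instances=4, pool_size=20),
    Service("mcp_rada", max_instances=2, pool_size=10),
]

used = worst_case_connections(main_instance_clients)
print(f"worst case: {used} of {MAX_CONNECTIONS} connections")
assert used < MAX_CONNECTIONS, "pool sizes exceed the database connection limit"
```

Raising max instances later changes this budget linearly, so the check is worth rerunning whenever autoscaling limits change.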

Qdrant on a Dedicated VM

Qdrant is the vector database for semantic search. GCP has no managed option, so we deployed it on a separate VM:

e2-standard-4 (4 vCPU, 16 GiB RAM) — sufficient for millions of vectors
100 GB persistent disk (pd-balanced) — data survives VM deletion
Docker container with --restart=always

Persistent disk is the key detail. Even if the VM crashes or needs an upgrade, data stays on the disk. We can change the VM type in 5 minutes without losing indexes.

GCS Instead of MinIO: Zero Code Changes

One of the most elegant decisions: Google Cloud Storage has an S3-compatible API. Our code uses the AWS S3 SDK to work with MinIO. For migration, we only changed the endpoint:

# Before (MinIO)
MINIO_ENDPOINT=minio-stage
MINIO_PORT=9000

# After (GCS)
MINIO_ENDPOINT=storage.googleapis.com
MINIO_PORT=443
MINIO_USE_SSL=true

Not a single line of code was changed. The same upload pipeline, the same presigned URLs, the same logic.
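
To illustrate why an endpoint swap can be enough, here is a minimal sketch of how an S3 client endpoint URL might be derived from the MINIO_* variables. The helper name and the default of MINIO_USE_SSL=false when the variable is absent are assumptions for this example, not our actual code.

```python
def storage_endpoint_url(env: dict) -> str:
    """Build an S3 endpoint URL from MINIO_* settings (hypothetical helper)."""
    # Assumption: SSL is off unless MINIO_USE_SSL is explicitly "true".
    use_ssl = env.get("MINIO_USE_SSL", "false").lower() == "true"
    scheme = "https" if use_ssl else "http"
    host = env["MINIO_ENDPOINT"]
    port = int(env["MINIO_PORT"])
    default_port = 443 if use_ssl else 80
    # Omit the port when it is the default for the scheme.
    if port == default_port:
        return f"{scheme}://{host}"
    return f"{scheme}://{host}:{port}"


before = {"MINIO_ENDPOINT": "minio-stage", "MINIO_PORT": "9000"}
after = {
    "MINIO_ENDPOINT": "storage.googleapis.com",
    "MINIO_PORT": "443",
    "MINIO_USE_SSL": "true",
}

print(storage_endpoint_url(before))  # http://minio-stage:9000
print(storage_endpoint_url(after))   # https://storage.googleapis.com
```

The S3 client itself stays the same; only the URL it talks to changes. The credentials are not shown here: GCS accepts S3-style requests when they are signed with GCS HMAC keys, so the access key and secret would need to be swapped as well.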

Secrets: Secret Manager Instead of .env Files

On the VPS, secrets lived in .env files. It works, but:

The file could end up in git
No audit trail of who accessed what and when
Key rotation = manual update on the server

GCP Secret Manager solves all three problems. Every secret is versioned, has access auditing, and integrates directly with Cloud Run via --set-secrets.
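
The --set-secrets flag takes comma-separated ENV_VAR=secret-name:version pairs. As a rough sketch, a deploy script could assemble it like this. The environment variable and secret names below are hypothetical, and pinning everything to the latest version is an assumption made for brevity.

```python
def set_secrets_flag(mapping: dict, version: str = "latest") -> str:
    """Format ENV_VAR -> secret-name pairs as a gcloud --set-secrets flag."""
    pairs = [
        f"{env_var}={secret}:{version}"
        for env_var, secret in sorted(mapping.items())
    ]
    return "--set-secrets=" + ",".join(pairs)


# Hypothetical names; the real secret IDs live in Secret Manager.
secrets = {
    "OPENAI_API_KEY": "openai-api-key",
    "JWT_SECRET": "jwt-secret",
}

flag = set_secrets_flag(secrets)
print(flag)
```

Pinning an explicit version number instead of latest makes a deployment reproducible, at the cost of a redeploy on every rotation.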

We created 12 secrets: OpenAI API keys, ZakonOnline tokens, JWT secret, database passwords, and others.

Cost: $280 to $430/mo

Full breakdown:

| Component | Specification | $/mo |
|-----------|---------------|------|
| Cloud Run (4 services) | Autoscaling | $76 |
| Cloud SQL (2 instances) | PG 15, SSD, auto backups | $150 |
| Memorystore Redis | 2 GiB, Basic | $50 |
| GCE VM (Qdrant) | e2-standard-4, 100 GB disk | $105 |
| GCS + CDN | ~50 GB of files | $8 |
| Networking (LB, NAT, VPC) | | $33 |
| Artifact Registry | Docker images | $3 |
| Total | | ~$430 |

Optimization to $280/mo

Consolidate Cloud SQL — openreyestr as a separate database in the main instance: -$55
1-year commitment on Cloud SQL: -$37
Spot VM for Qdrant (if restart is acceptable): -$60
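
The arithmetic behind both figures can be checked with a short sketch. The amounts are taken from the breakdown and the optimization list; the dictionary keys are shortened labels.

```python
# Monthly line items in USD, from the cost breakdown.
MONTHLY_COST_USD = {
    "Cloud Run (4 services)": 76,
    "Cloud SQL (2 instances)": 150,
    "Memorystore Redis": 50,
    "GCE VM (Qdrant)": 105,
    "GCS + CDN": 8,
    "Networking (LB, NAT, VPC)": 33,
    "Artifact Registry": 3,
}

# Monthly savings in USD, from the optimization list.
SAVINGS_USD = {
    "Consolidate Cloud SQL": 55,
    "1-year Cloud SQL commitment": 37,
    "Spot VM for Qdrant": 60,
}

baseline = sum(MONTHLY_COST_USD.values())
optimized = baseline - sum(SAVINGS_USD.values())

print(f"baseline:  ${baseline}/mo")
print(f"optimized: ${optimized}/mo")
```

The line items add up to $425, which rounds to the ~$430 total, and applying all three savings gives $273, in line with the roughly $280 target.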

Scaling Strategy

Horizontal (Automatic)

Cloud Run scales automatically by concurrency. When load increases, instances are added; when it drops, excess instances are shut down.

08:00  mcp-backend: 1 instance  (quiet morning)
10:00  mcp-backend: 2 instances (workday)
14:00  mcp-backend: 4 instances (peak activity)
22:00  mcp-backend: 1 instance  (evening)
02:00  mcp-rada: 0 instances    (nobody searches for deputies at night)
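
A simplified model of this behavior: the number of instances is the number of in-flight requests divided by the per-instance concurrency, rounded up and clamped between the min and max instance settings. The real Cloud Run autoscaler also considers CPU utilization and smooths over time, so treat this as an approximation. The concurrency of 80 is the Cloud Run default; the max of 4 is a hypothetical limit.

```python
import math


def desired_instances(
    concurrent_requests: int,
    concurrency: int = 80,   # Cloud Run default per-instance concurrency
    min_instances: int = 0,
    max_instances: int = 4,  # hypothetical limit for this sketch
) -> int:
    """Approximate instance count for a given number of in-flight requests."""
    needed = math.ceil(concurrent_requests / concurrency)
    return max(min_instances, min(needed, max_instances))


# The main backend keeps one warm instance; auxiliary services may drop to zero.
print(desired_instances(0, min_instances=1))    # idle main backend
print(desired_instances(0, min_instances=0))    # idle auxiliary service
print(desired_instances(100, min_instances=1))  # moderate load
print(desired_instances(1000, min_instances=1)) # load above the max limit
```

With these parameters the four calls yield 1, 0, 2, and 4 instances respectively, matching the shape of the daily pattern above.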

Vertical (Manual, As Needed)

What Changes as You Grow

10 → 100 users: current architecture handles it without changes.

100 → 1,000 users: add Cloud SQL read replica ($95/mo), increase Cloud Run max instances to 8.

1,000+ users: migrate to GKE Autopilot for more granular control, Qdrant cluster (3 nodes), Cloud SQL HA.

Frontend: GCS + Cloud CDN

The React SPA (Vite build) is just static files. Instead of running a Cloud Run container, we host it on GCS with Cloud CDN:

Cost: ~$1/mo (instead of ~$15 for a Cloud Run container)
Latency: files served from the nearest edge to the user
Cache hit ratio: >95% for JS/CSS bundles

Cloudflare Stays

We did not replace Cloudflare with GCP Cloud Armor. Cloudflare remains the first layer of protection:

Free WAF — protection from SQL injection, XSS
DDoS protection — automatic attack absorption
Edge caching — static assets served from the Kyiv PoP
Origin CA — SSL certificate already configured

The Cloudflare DNS A record points to the Google Cloud Load Balancer IP. Traffic flows: user → Cloudflare edge → GCP LB → Cloud Run.

CI/CD: Automated Deployment

GitHub Actions workflow on merge to main:

Build packages/shared (shared types)
In parallel: build 4 Docker images → push to Artifact Registry
Deploy each service to Cloud Run
gsutil rsync the frontend to GCS

Rollback is one command: Cloud Run lets you switch traffic to a previous revision in seconds.
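
As a sketch of what that rollback looks like, the helper below assembles a gcloud run services update-traffic invocation that sends 100% of traffic to a chosen revision. The revision name is hypothetical; real names can be listed with gcloud run revisions list.

```python
import shlex


def rollback_command(service: str, revision: str, region: str = "europe-west1"):
    """Build the gcloud argv that routes all traffic to one revision."""
    return [
        "gcloud", "run", "services", "update-traffic", service,
        f"--region={region}",
        f"--to-revisions={revision}=100",
    ]


# Hypothetical revision name, for illustration only.
cmd = rollback_command("mcp-backend", "mcp-backend-00041-abc")
print(shlex.join(cmd))
```

Because a traffic switch does not rebuild or redeploy anything, the previous revision only has to still exist for the rollback to succeed.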

What Is Next

This architecture is the foundation we build on. Next steps:

Cloud Scheduler — automatically reduce min-instances at night
Cloud SQL Insights — slow query monitoring
Prometheus + Grafana on the Qdrant VM — custom metrics
Workload Identity Federation — GitHub Actions without service account keys

The goal is infrastructure that scales with the product rather than becoming its limitation.

If you are building a legal or any other SaaS on microservices, Cloud Run + Cloud SQL is an excellent start. You pay for what you actually use, not for idle servers.