03 β Database Topology & Data Isolation
How tenant data is physically and logically separated, how databases are created and migrated, and the connection/queue concerns that matter when scaling toward instance-per-tenant.
Three-tier topology
| Tier | Holds | Cardinality |
|---|---|---|
Core DB (CORE_DB_*) | The control plane: tenants, regions, licenses, deployments, domains, branding. | One, shared by all. |
| Region shared DB | Business rows for shared tenants, every row scoped by tenant_id. Name comes from Region.shared_database_name (e.g. ip4cms_shared). | One per region. |
| Dedicated tenant DB | One tenant's entire dataset, physically separate. Named tenant_<uuid> (uuid with -β_). | One per dedicated tenant. |
The switch between the last two is Tenant.has_dedicated_database (src/core-api/src/database/schema.ts:178 β "true = own DB (tenant_xxx), false = shared DB on region server", default true).
This gives two grades of data isolation:
- Logical (shared DB): cheap; isolation is enforced by
tenant_idfilters and per-tenant uniqueness. E.g.account_codeis unique per tenant via a partial indexux_account_tenant_account_code, not globally β because many tenants coexist in one shared DB. - Physical (dedicated DB): a separate database per tenant β the right grade for compliance/residency.
In production the Postgres engine is Azure Cosmos DB for PostgreSQL (Citus) β a distributed Postgres. Citus is designed for distributed shared tables (distribute by tenant_id), which fits the shared-DB model well. It is not ideal for hundreds of CREATE DATABASE tenant_x databases on one coordinator; for residency-driven dedicated tenants, per-region flexible-server Postgres is the better substrate than DB-per-tenant on a single Citus coordinator.
How the tenant-api resolves a database
The data plane reads connection settings from env (src/tenant-api/src/config/index.ts):
- Core DB:
CORE_DB_HOST/PORT/NAME/USER/PASSWORD/SSL, poolCORE_DB_MAX_CONNECTIONS(default 8). - Tenant DB host:
TENANT_DB_HOST/PORT/USER/PASSWORD/SSLβ the host/cluster; the database name is resolved per tenant. PoolTENANT_DB_MAX_CONNECTIONS(default 10). - Single-tenant pin (for
container_per_tenant):TENANT_ID(config/index.ts:19β "For container-per-tenant deployment") andTENANT_DB_NAME(:20) lock the instance to one tenant + DB.
getDatabase(tenantId) (src/tenant-api/src/database/connection.ts) returns a Drizzle client for a tenant, caching one connection pool per tenant in a Map (connection.ts:16). The driver is postgres.js.
Connection-count implication. In the shared model a single tenant-api process holds one pool (up to 10) per tenant it serves, and the prod process runs in pm2 cluster mode (multiple workers), so live usage βworkers Γ active_tenants Γ 10. In an instance-per-tenant model each instance holds ~10 (tenant) + 8 (core) β 18connections, so N tenants βN Γ 18connections against the cluster. Citus coordinators capmax_connectionsin the hundredsβlow thousands and amplify it with coordinatorβworker fan-out β so a connection pooler (PgBouncer, transaction mode) is required to scale instance-per-tenant. Azure Cosmos for PostgreSQL ships a managed PgBouncer on port 6432 (the app currently connects direct on 5432). Transaction-mode pooling requires disabling postgres.js prepared statements (prepare: false), and per-instance pools should be shrunk (2β4) for dedicated tenants.
Migrations β owned by the data plane
Drizzle migrations live at src/tenant-api/drizzle/ as hand-authored, idempotent SQL (every statement uses IF NOT EXISTS / IF EXISTS, partial unique indexes with WHERE, etc.). They are applied by the tenant-api's own migrator:
- On provisioning, core-api calls
POST /internal/database/initializeand the tenant-api runsinit.ts initialize()βCREATE DATABASE tenant_<uuid>(for dedicated) + all migrations. - On deploy,
src/deploy-ip4cms.shruns the compiled migrator (node dist/tenant-api/src/database/migrate.js, βnpm run db:migrate:prod) automatically after syncing the build (skippable with--no-migrate). The shared region DB and every applicable dedicated tenant DB are migrated; it is idempotent. init.ts initialize()scoped toconfig.TENANT_IDmigrates only that tenant β which is exactly what acontainer_per_tenantinstance does on boot.
Operational note: the migrator applies entries by a when ordering recorded in the Drizzle journal; a migration whose when is earlier than the last-applied is silently skipped β so new migrations must carry a strictly-later when.
Queue isolation {#queue-isolation}
BullMQ queues live in Redis, which is shared infrastructure. Queues are keyed by name, so isolation between tenant-api instances depends on a per-instance key prefix:
REDIS_PREFIX(config/index.ts:62β "Prefix for BullMQ keys to isolate from other BullMQ instances") namespaces all queue keys.- In the shared model, one process owns the queues for all its tenants β fine.
- In the instance-per-tenant model, every instance must set a distinct
REDIS_PREFIX(e.g. the tenant slug). Without it, multiple instances bind identical queue names on the one Redis and steal each other's jobs β silent cross-tenant job theft.
β οΈ This is the open gap from 02: thecontainer_per_tenantenv built byprovisionDeploymentomitsREDIS_PREFIX. It must be added before running more than one dedicated instance.
The same prefixing applies to the Redis-backed RedisLockService (used by document numbering and settlement matching).
What stays shared vs per-tenant
| Concern | Shared singleton | Per-tenant (dedicated) |
|---|---|---|
| Core DB, object storage, core-api | β always shared | β |
| Business data | region shared DB (tenant_id-scoped) | tenant_<uuid> DB |
| Compute (tenant-api) | pooled instance | dedicated instance |
| BullMQ queues | shared Redis, one namespace | shared Redis, distinct REDIS_PREFIX |
| For strict residency | β | in-region DB host + in-region cluster (seam to wire) |
Key source paths: src/tenant-api/src/database/connection.ts, src/tenant-api/src/database/init.ts, src/tenant-api/src/config/index.ts, src/tenant-api/drizzle/, src/core-api/src/database/schema.ts.