Virtualize
Create isolated VDBs from production snapshots — no full copy, no copy tickets, no waiting.
SOFI creates masked, production-like databases inside your own network for dev, QA, analytics and compliance, without copying sensitive data into an external SaaS. It runs in your VPC or on-premises.
// trusted lower-environment data
Private deploy · VPC / on-prem
Governance · RBAC + masking
Audit · Every refresh logged
Automation · CI/CD + API
// For teams who build //
Connect a source, profile the PII, apply masking, and provision isolated test databases on demand.
Connect · prod.customers
Profile · pii.detected
Mask · pii.default
Provision · customer_360_qa
// provisioning flow
Every VDB keeps its source, snapshot, masking policy and owner. The same definition feeds psql/pgwire, JDBC/ODBC, REST and your CI pipelines.
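As a rough illustration of the profile step in the flow above, the sketch below flags columns that look like PII by checking column names first and sampled values second. It is not SOFI's profiler: `NAME_HINTS`, `VALUE_PATTERNS` and `profile_columns` are hypothetical names, and the rules are deliberately simplistic.

```python
import re

# Hypothetical hints and patterns; a real profiler would use many more signals.
NAME_HINTS = {"email": "email", "phone": "phone", "cpf": "national_id", "ssn": "national_id"}
VALUE_PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "phone": re.compile(r"^\+?\d[\d\s().-]{7,}$"),
}


def profile_columns(rows):
    """Return {column: pii_kind} for columns that look like PII in the sample."""
    findings = {}
    if not rows:
        return findings
    for column in rows[0]:
        lowered = column.lower()
        # 1) Cheap check: does the column name contain a known hint?
        for hint, kind in NAME_HINTS.items():
            if hint in lowered:
                findings[column] = kind
                break
        if column in findings:
            continue
        # 2) Fallback: do all sampled non-empty values match one pattern?
        values = [str(r[column]) for r in rows if r.get(column) not in (None, "")]
        for kind, pattern in VALUE_PATTERNS.items():
            if values and all(pattern.match(v) for v in values):
                findings[column] = kind
                break
    return findings


sample = [
    {"id": 1, "contact": "ana@example.com", "mobile_phone": "+55 11 91234-5678", "plan": "gold"},
    {"id": 2, "contact": "bo@example.org", "mobile_phone": "+1 415 555 0100", "plan": "free"},
]
print(profile_columns(sample))
```

Here `contact` is caught by its values and `mobile_phone` by its name, which is why both checks are worth having even in a toy version.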
// schema preview
from sofi import Sofi

app = Sofi(api_key="YOUR_KEY")
vdb = app.provision(
    source="prod.customers",
    masking="pii_default",
    engine="postgres",
)
print(vdb.connection_uri)
// what SOFI does //
Virtualize, mask, refresh, connect, snapshot and operate — one platform inside your network.
Virtualize
Create isolated VDBs from production snapshots: no full copy, no copy tickets, no waiting.
Mask
Replace PII with realistic values while keeping joins, constraints, formats and totals intact.
Refresh
Keep dev, QA, demo and analytics current without re-running slow export-and-sanitize jobs.
Connect
Postgres, MySQL, SQL Server, Oracle, MongoDB, ClickHouse, Snowflake, BigQuery and more.
Snapshot
Copy-on-write snapshots with block dedup: lock, rewind, undo and time-travel any VDB.
Operate
Run inside your VPC or on bare metal with SSO, RBAC, audit logs and private routing.
// CI/CD ready //
The same masked VDB from the API, the CLI or a GitHub Action — RBAC, masking and audit on every call.
$ sofi login --host private.sofi.local
$ sofi vdb create \
    --source prod.customers \
    --masking pii_default \
    --name customer_360_qa
✓ snapshot ready     1.2M rows
✓ masking applied    3 PII columns
✓ vdb provisioned    customer_360_qa
→ psql "host=private.sofi.local dbname=customer_360_qa"
// every environment
REST, CLI and CI share one plan: every environment inherits the same policy, with no duplicated logic.
JWT or API key resolves the requester and tenant.
Policy by role, scope, purpose and environment.
PII masked per column before the snapshot clones.
Requester, source and rows recorded on every refresh.
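The four steps above (resolve the requester, check policy, mask, record) can be pictured as one function that every entry point calls. The sketch below is an assumption-heavy toy, not SOFI's implementation: `CREDENTIALS`, `ALLOWED`, `ProvisionRequest` and `provision` are hypothetical, and the masking is a constant placeholder.

```python
from dataclasses import dataclass

# Hypothetical in-memory stand-ins for an identity provider and a policy store.
CREDENTIALS = {"key-qa-1": {"requester": "maria", "tenant": "acme", "role": "qa"}}
ALLOWED = {("qa", "qa", "regression-testing")}  # (role, environment, purpose)
PII_COLUMNS = {"email", "phone"}


@dataclass
class ProvisionRequest:
    credential: str
    source: str
    environment: str
    purpose: str


def provision(request, rows, audit_log):
    """Authenticate, authorize, mask, then record the event, in that order."""
    # 1) Resolve requester and tenant from the credential.
    identity = CREDENTIALS.get(request.credential)
    if identity is None:
        raise PermissionError("unknown credential")
    # 2) Policy by role, environment and purpose.
    if (identity["role"], request.environment, request.purpose) not in ALLOWED:
        raise PermissionError("policy does not allow this request")
    # 3) Placeholder masking; real masking would be deterministic, not a constant.
    masked = [
        {col: ("<masked>" if col in PII_COLUMNS else val) for col, val in row.items()}
        for row in rows
    ]
    # 4) Record requester, source and row count.
    audit_log.append({
        "requester": identity["requester"],
        "tenant": identity["tenant"],
        "source": request.source,
        "environment": request.environment,
        "rows": len(masked),
    })
    return masked


audit_log = []
rows = [{"id": 1, "email": "ana@corp.com", "plan": "gold"}]
request = ProvisionRequest("key-qa-1", "prod.customers", "qa", "regression-testing")
print(provision(request, rows, audit_log))
print(audit_log)
```

Because REST, CLI and CI would all go through the same function in this picture, no caller can skip the policy check or the audit record.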
avg provision · <90 s per environment
// production-ready //
Copy-on-write snapshots, block dedup and hyperscale masking that handles 100M+ rows without OOM.
new environment · < 90 seconds
storage per VDB · ~2% of source
data movement · 0 bytes leave network
audit + lineage · 100% coverage
// benchmark
Standing up a masked dev database from prod.customers with PII rules applied.
// why it's fast
Virtual clones share blocks; only changed data costs storage.
SHA-256, 4MB blocks, zstd — snapshots stay small and fast.
Tables split into chunks, fanned out via Celery for 100M+ rows.
RBAC, masking and audit live in the provisioning path, not a wrapper.
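To make the block-dedup point concrete, here is a minimal sketch of a content-addressed block store: data is cut into 4 MB blocks, each block is keyed by its SHA-256 digest, and a block is stored only the first time it is seen. `BlockStore` is a hypothetical name, and `zlib` from the standard library stands in for zstd, so this shows the idea rather than SOFI's storage format.

```python
import hashlib
import zlib

BLOCK_SIZE = 4 * 1024 * 1024  # 4 MB blocks, as stated above


class BlockStore:
    def __init__(self):
        self.blocks = {}  # SHA-256 hex digest -> compressed block bytes

    def put(self, data: bytes) -> list:
        """Split data into blocks, store unseen ones, return the list of digests."""
        manifest = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            if digest not in self.blocks:
                # zlib is a stand-in here; the text above names zstd.
                self.blocks[digest] = zlib.compress(block)
            manifest.append(digest)
        return manifest

    def get(self, manifest) -> bytes:
        """Rebuild the original bytes from a manifest of digests."""
        return b"".join(zlib.decompress(self.blocks[d]) for d in manifest)


store = BlockStore()
base = bytes(BLOCK_SIZE) * 3      # three identical zero-filled blocks
changed = base[:-1] + b"\x01"     # same data except the final byte
m1 = store.put(base)
m2 = store.put(changed)
print(len(m1), len(m2), len(store.blocks))  # 3 3 2
```

Two snapshots with six logical blocks end up as two stored blocks, which is the effect behind "only changed data costs storage".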
// governed by default //
Masking, access, lineage, audit and lifecycle in a single operational layer.
// Sources to VDBs
Virtual clones of a snapshot let many environments share one masked footprint instead of keeping full duplicates.
// Masked at the source
Deterministic, format-preserving masking applied to the snapshot before any team touches it.
// Cross-source
The same email maps to the same fake value across every database, so joins keep working.
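One common way to get this cross-source consistency is a keyed hash: the same normalized input and the same secret always give the same output. The sketch below uses HMAC-SHA-256 from the Python standard library. It is only an illustration of the principle, not SOFI's masking algorithm; `SECRET_KEY` and `mask_email` are hypothetical, and the result keeps an email-like shape without being formal format-preserving encryption.

```python
import hashlib
import hmac

# Hypothetical key; it must be identical for every source that should stay joinable.
SECRET_KEY = b"replace-with-a-managed-secret"


def mask_email(email: str, key: bytes = SECRET_KEY) -> str:
    """Map an email to a stable fake address that still looks like an email."""
    normalized = email.strip().lower()
    digest = hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"user_{digest[:12]}@example.com"


# The same customer appears in two sources with slightly different formatting.
crm_row = {"email": "Ana@Corp.com", "plan": "gold"}
billing_row = {"email": "ana@corp.com ", "invoice": 42}

masked_crm = {**crm_row, "email": mask_email(crm_row["email"])}
masked_billing = {**billing_row, "email": mask_email(billing_row["email"])}

# The masked values match, so a join on email still lines up.
print(masked_crm["email"] == masked_billing["email"])  # True
```

Normalizing before hashing matters: without the `strip().lower()` step, the two rows above would mask to different values and the join would silently break.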
// RBAC by default
Role, tenant and purpose decide who can provision, refresh or export which environment.
// Every refresh
Each provision, refresh and access becomes a structured event with role, scope and rows.
// One surface
Lock a snapshot, rewind a VDB, undo a refresh or revoke an environment without redeploys.
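The shared footprint and the rewind and undo operations described above follow naturally from a copy-on-write design. In the sketch below, each clone reads from a shared snapshot and keeps only its own changes in an overlay. `Snapshot` and `VirtualClone` are hypothetical names, and blocks are plain dictionary entries, so this models the behaviour rather than any real storage engine.

```python
_MISSING = object()  # marks "this block had no overlay entry before the write"


class Snapshot:
    """Read-only base data shared by every clone (block id -> bytes)."""

    def __init__(self, blocks):
        self._blocks = dict(blocks)

    def read(self, block_id):
        return self._blocks[block_id]


class VirtualClone:
    def __init__(self, snapshot):
        self.snapshot = snapshot
        self.overlay = {}   # only the blocks this clone has changed
        self.history = []   # undo log: (block_id, previous overlay value)

    def read(self, block_id):
        if block_id in self.overlay:
            return self.overlay[block_id]
        return self.snapshot.read(block_id)

    def write(self, block_id, data):
        self.history.append((block_id, self.overlay.get(block_id, _MISSING)))
        self.overlay[block_id] = data

    def undo(self):
        """Revert the most recent write."""
        block_id, previous = self.history.pop()
        if previous is _MISSING:
            del self.overlay[block_id]
        else:
            self.overlay[block_id] = previous

    def rewind(self):
        """Drop every local change and return to the snapshot state."""
        self.overlay.clear()
        self.history.clear()


snapshot = Snapshot({0: b"customers-v1", 1: b"orders-v1"})
dev, qa = VirtualClone(snapshot), VirtualClone(snapshot)
dev.write(0, b"customers-dev-edit")
print(dev.read(0), qa.read(0), len(dev.overlay), len(qa.overlay))
dev.rewind()
print(dev.read(0))
```

A clone that changes nothing stores nothing, and rewinding is just discarding the overlay, which is why these operations need no redeploy in this model.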
// use cases //
Ready-made patterns to deliver realistic data with control, reuse and traceability.
Give every developer and tester an isolated, production-like database without copy tickets or exposed PII.
postgres · read-only
CoW · masked
34 · isolated
// shared snapshot
One masked snapshot, many virtual clone environments for dev, QA and demos.
// status
Production-like, masked copies for every engineer and tester, on demand.
1 snapshot · N VDBs
Ephemeral per-PR databases that spin up fast and tear down on merge.
< 90 s · per-PR
Refresh sandbox and demo data on a schedule without ETL round-trips.
nightly · scheduled
Mask PII, audit every refresh and prove lineage for lower environments.
100% audit coverage
Validate Oracle → Postgres or version upgrades against masked clones first.
0 downtime
Give partners and support realistic data that never exposes real customers.
safe by default
// connect everything //
38+ connectors with read-only source access, plus CI/CD, identity and observability.
Relational · 12
Distributed · 8
Warehouses · 8
NoSQL, Graph & Search · 6
Streaming & cache
Files, CI & automation
npx @sofi/cli init
sofi vdb create --source prod.customers
uses: sofi/sofi-provision@v1
// data teams //
What engineering, QA and governance gain after the first SOFI deployment.
“We replaced a three-day copy-and-sanitize job with a masked snapshot that clones in seconds. Same governance, no nightly batch.”
“Every pull request now gets its own masked database. QA stopped sharing one stale environment overnight.”
“Our LGPD audit went from a quarterly spreadsheet to a SQL query with evidence ready.”
| // capability | SOFI | Legacy TDM | DIY scripts |
|---|---|---|---|
| Masked, production-like data in lower environments | ✓ | ✓ | Custom |
| Virtual clone VDBs instead of full copies | ✓ | Partial | — |
| Deterministic, cross-source masking | ✓ | Limited | — |
| Self-hosted private deployment | ✓ | Add-on | ✓ |
| Built-in LGPD / RBAC / audit trail | ✓ | Partial | — |
| Ephemeral DBs in CI (REST + CLI + Action) | ✓ | — | Custom |
| Time-to-first-environment | < 2 weeks | 3-6 months | 6-12 months |
| Storage footprint per environment | ~2% | 100% | 100% |
// enterprise only //
The pricing conversation is about deployment, risk, governance and success criteria — not picking between smaller SaaS tiers.
// enterprise
Annual contract for organizations that need masked, governed test data in regulated, private or large-scale environments.
Talk to sales →
Virtual clone VDBs from snapshots, no full copies to another store.
Deterministic PII masking, RBAC, audit and LGPD flows in the provisioning path.
Copy-on-write snapshots, lock, rewind, undo and subset.
REST, psql/pgwire, JDBC/ODBC, CLI and the CI GitHub Action.
SSO, SAML/OIDC, SCIM, custom domains and tenant controls.
Private VPC, on-prem, air-gapped or managed enterprise cloud.
// start now
Stand up a masked, production-like database over your own data in under two weeks. Nothing leaves your environment and every refresh is auditable by design.
No credit card · Private deployment available
// FAQ //
The essentials for evaluating the SOFI deployment model.
No data leaves your network. SOFI runs inside your VPC, private cloud or on bare metal. Source access is read-only, and masked data stays in your environment.
Masking is deterministic and format-preserving: the same input always maps to the same realistic fake value, so joins, constraints, totals and formats stay intact across every database.
Most VDBs provision in under two minutes from an existing masked snapshot, using copy-on-write virtual clones that take about 2% of the source footprint.
PostgreSQL, MySQL, MariaDB, SQL Server, Oracle, DB2, MongoDB, Cassandra, ClickHouse, Snowflake, BigQuery, Redshift, Databricks and more — plus CSV, JSON and Parquet files.
Use the REST API, the sofi CLI or the GitHub Action to provision an ephemeral masked database per pull request and tear it down automatically on merge.
Column masking, row-level controls, RBAC, encrypted credentials and a queryable audit trail are applied before any environment is handed to a team.
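For the per-pull-request flow mentioned in this FAQ, a CI job mostly needs a predictable database name and the create command. The sketch below only builds that command, using the `sofi vdb create` flags shown in the CLI example earlier on this page. The naming scheme and the helper functions `pr_vdb_name` and `create_command` are assumptions for illustration, and teardown is left out because its command is not shown here.

```python
import re
import shlex


def pr_vdb_name(base: str, pr_number: int) -> str:
    """Derive a predictable, identifier-safe VDB name for a pull request."""
    name = f"{base}_pr_{pr_number}".lower()
    return re.sub(r"[^a-z0-9_]", "_", name)


def create_command(source: str, masking: str, base_name: str, pr_number: int) -> list:
    """Build the argument list for the create command shown earlier on this page."""
    return [
        "sofi", "vdb", "create",
        "--source", source,
        "--masking", masking,
        "--name", pr_vdb_name(base_name, pr_number),
    ]


cmd = create_command("prod.customers", "pii_default", "customer_360", 128)
print(shlex.join(cmd))
```

Passing the list to a process runner rather than a shell string avoids quoting problems if a branch or base name contains unusual characters.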