Skip to content

shard

Extract data for specific tenants from multi-tenant SQL dumps.

Alias: sh (e.g., sql-splitter sh dump.sql --tenant-value 123 -o tenant.sql)

  • Tenant data export - Extract a single customer’s data for migration or analysis
  • Development environments - Create small, focused dev databases from a multi-tenant production dump
  • Data isolation - Separate tenant data for compliance, backup, or legal requirements
  • Debugging - Reproduce issues with a specific tenant’s data without the full database

Use sample instead if you want a random subset of data across all tenants.

Sharding follows your database’s foreign key relationships to extract complete, consistent data:

  1. Identifies tenant column - Auto-detects tenant_id, company_id, org_id, etc., or uses your explicit --tenant-column
  2. Starts from root tables - Tables with the tenant column that have no incoming foreign keys
  3. Follows FK relationships - Recursively includes all related rows that belong to the tenant
  4. Includes lookup tables - Optionally includes global/shared tables (countries, currencies, etc.)

The result is a self-contained dump that maintains referential integrity.

Terminal window
sql-splitter shard <INPUT> --tenant-value <VALUE> [OPTIONS]
Terminal window
# Extract all data for tenant 123
sql-splitter shard dump.sql --tenant-value 123 -o tenant_123.sql
# Specify the tenant column explicitly
sql-splitter shard dump.sql --tenant-column account_id --tenant-value 42 -o out.sql

Extract multiple tenants to separate files in one pass:

Terminal window
sql-splitter shard dump.sql --tenant-values "1,2,3" -o shards/

This creates:

shards/
├── tenant_1.sql
├── tenant_2.sql
└── tenant_3.sql
Terminal window
# Include all shared/lookup tables (countries, currencies, config)
sql-splitter shard dump.sql --tenant-value 123 --include-global all -o tenant.sql
# Skip global tables entirely
sql-splitter shard dump.sql --tenant-value 123 --include-global none -o tenant.sql
# Only include small lookup tables (default behavior)
sql-splitter shard dump.sql --tenant-value 123 --include-global lookups -o tenant.sql
Terminal window
# See what would be extracted without writing files
sql-splitter shard dump.sql --tenant-value 123 --dry-run
FlagShortDescriptionDefault
--output-oOutput SQL file or directorystdout
--dialect-dSQL dialectauto-detect
--tenant-columnColumn name for tenant identificationauto-detect
--tenant-valueSingle tenant value to extract-
--tenant-valuesMultiple tenant values (comma-separated)-
--root-tablesExplicit root tables with tenant columnauto
--include-globalGlobal table handling: none, lookups, alllookups
--config-cYAML config file for table classification-
--max-selected-rowsMaximum rows to selectunlimited
--no-limitDisable row limitfalse
--strict-fkFail if any FK integrity issuesfalse
--no-schemaExclude CREATE TABLE statementsfalse
--progress-pShow progress barfalse
--dry-runPreview without writingfalse
--jsonOutput as JSONfalse

sql-splitter auto-detects common tenant columns:

  • tenant_id
  • company_id
  • account_id
  • org_id
  • organization_id
  • workspace_id

Override with --tenant-column:

Terminal window
sql-splitter shard dump.sql --tenant-column customer_id --tenant-value 42

Extract multiple tenants to separate files:

Terminal window
sql-splitter shard dump.sql --tenant-values "1,2,3" -o shards/

Creates:

shards/
├── tenant_1.sql
├── tenant_2.sql
└── tenant_3.sql

The --include-global option controls shared/lookup tables:

  • none: Skip global tables
  • lookups: Include small lookup tables (default)
  • all: Include all non-tenant tables

Sharding follows foreign key relationships:

  1. Finds rows matching tenant value in root tables
  2. Follows FKs to include related rows
  3. Ensures complete, consistent tenant slice

Auto-detection looks for common column names. Specify yours explicitly:

Terminal window
sql-splitter shard dump.sql --tenant-column customer_id --tenant-value 123 -o tenant.sql

Foreign key relationships might not be detected if:

  • They’re not defined in the schema (implicit FKs)
  • The FK references a table that’s excluded

Use --root-tables to explicitly specify which tables have the tenant column:

Terminal window
sql-splitter shard dump.sql --tenant-value 123 --root-tables users,projects,teams -o tenant.sql

Check which global tables are being included:

Terminal window
# See what would be extracted
sql-splitter shard dump.sql --tenant-value 123 --dry-run --json | jq '.tables'
# Exclude large lookup tables
sql-splitter shard dump.sql --tenant-value 123 --include-global none -o tenant.sql

Some rows might reference data outside the tenant’s scope. Options:

  1. --strict-fk: Fail immediately on any integrity issue
  2. Review warnings: Some cross-tenant references might be intentional (shared resources)
  3. Include global tables: --include-global all includes all referenced lookup data