
Troubleshooting

Solutions to common issues when using sql-splitter.

sql-splitter: command not found

The binary isn’t in your PATH.

Cargo installs binaries to ~/.cargo/bin; make sure that directory is in your PATH:

# Add to your shell profile (~/.bashrc, ~/.zshrc, etc.)
export PATH="$HOME/.cargo/bin:$PATH"
# Then reload
source ~/.bashrc # or ~/.zshrc

Verify installation:

which sql-splitter
# Or invoke it directly by full path
"$HOME/.cargo/bin/sql-splitter" --version
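If you put the PATH fix in a shell profile, guarding against duplicate entries keeps the profile idempotent. A minimal sketch; the `path_contains` helper is hypothetical, not part of sql-splitter:

```shell
# Hypothetical helper: succeed only if the given directory is already on PATH
path_contains() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Append ~/.cargo/bin only when it is missing
path_contains "$HOME/.cargo/bin" || export PATH="$HOME/.cargo/bin:$PATH"
```

This keeps repeated `source ~/.bashrc` runs from growing PATH indefinitely.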

Build fails with “linker cc not found”


Missing C compiler on Linux.

# Ubuntu/Debian
sudo apt-get install build-essential
# Fedora/RHEL
sudo dnf install gcc
# Then retry
cargo install sql-splitter

On macOS, ensure you have the Xcode Command Line Tools:

xcode-select --install

Wrong dialect detected

sql-splitter analyzes the first ~1000 lines to detect the dialect. If your dump starts with generic SQL, detection may fail.

Solution: Explicitly specify the dialect:

sql-splitter split dump.sql -o output/ --dialect mysql
sql-splitter split dump.sql -o output/ --dialect postgres

If auto-detection picks the wrong dialect:

# Force specific dialect
sql-splitter analyze dump.sql --dialect postgres
# Check what was detected
sql-splitter analyze dump.sql --json | jq '.dialect'
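To see roughly what detection has to work with, you can grep the head of the dump for dialect markers yourself. This `guess_dialect` function is an illustrative sketch, not sql-splitter’s actual detector:

```shell
# Rough dialect guess from the first 1000 lines (illustrative heuristic only)
guess_dialect() {
  if head -n 1000 "$1" | grep -Eq 'ENGINE=|AUTO_INCREMENT'; then
    echo mysql
  elif head -n 1000 "$1" | grep -Eq 'FROM stdin|search_path|pg_catalog'; then
    echo postgres
  else
    echo unknown
  fi
}
```

A dump that triggers neither branch is exactly the “generic SQL” case where an explicit --dialect flag is needed.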

“memory allocation failed” or OOM killed


This shouldn’t happen with normal sql-splitter commands (they use ~50MB constant memory). If it does:

  1. Check for memory-intensive commands: validate and diff with FK checks can use more memory on very large files.

  2. Disable FK checks for validation:

    sql-splitter validate huge.sql --no-fk-checks
  3. Use disk mode for query:

    sql-splitter query huge.sql "SELECT ..." --disk
  4. Limit rows per table:

    sql-splitter validate huge.sql --max-rows-per-table 100000

File not found

# Check file exists
ls -la dump.sql
# Use absolute path
sql-splitter split /full/path/to/dump.sql -o output/
# Check permissions
chmod +r dump.sql
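The three checks above can be bundled into a small pre-flight function; `check_input` here is a hypothetical helper, not a sql-splitter feature:

```shell
# Hypothetical pre-flight check: file exists, is readable, and is non-empty
check_input() {
  [ -e "$1" ] || { echo "no such file: $1" >&2; return 1; }
  [ -r "$1" ] || { echo "not readable: $1" >&2; return 1; }
  [ -s "$1" ] || { echo "file is empty: $1" >&2; return 1; }
}

# Usage: check_input dump.sql && sql-splitter split dump.sql -o output/
```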

Permission denied on output

# Check output directory is writable
mkdir -p output/
chmod +w output/
# Or write to a different location
sql-splitter split dump.sql -o ~/output/

Compressed file not recognized

Ensure the file extension matches the compression format:

Format      Extension
Gzip        .gz
Bzip2       .bz2
XZ          .xz
Zstandard   .zst

# Rename if needed
mv dump.sql.gzip dump.sql.gz
# Then process
sql-splitter analyze dump.sql.gz
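If renaming doesn’t help, the actual format can be read from the file’s magic bytes instead of trusting the extension. `compression_of` is a hypothetical helper built from standard tools:

```shell
# Hypothetical helper: identify the compression format from leading magic bytes
compression_of() {
  case "$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n\t')" in
    1f8b*)     echo gzip ;;   # .gz
    425a68*)   echo bzip2 ;;  # .bz2
    fd377a58*) echo xz ;;     # .xz
    28b52ffd*) echo zstd ;;   # .zst
    *)         echo none ;;
  esac
}
```

`file dump.sql.gz` reports similar information where available.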

“No tables found”

The dump file might not contain recognizable SQL statements.

Check file contents:

head -100 dump.sql

Common causes:

  • Binary format (use pg_dump -Fp for plain text)
  • Non-SQL file
  • Empty file
  • Encoding issues (see below)
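The first two causes can be screened for with standard tools: NUL bytes indicate a binary dump, and a plain-text dump should show an SQL keyword near the top. `looks_like_sql` is a hypothetical sketch:

```shell
# Hypothetical check: reject binary data, then look for SQL keywords up front
looks_like_sql() {
  # NUL bytes suggest a binary dump (e.g. pg_dump custom format; re-dump with -Fp)
  if head -c 8192 "$1" | od -An -tx1 | grep -q ' 00'; then
    echo "binary data: not a plain-text dump" >&2
    return 1
  fi
  # A plain-text dump normally starts a statement within the first 100 lines
  head -n 100 "$1" | grep -Eqi '^[[:space:]]*(CREATE|INSERT|COPY|DROP|ALTER|SET)[[:space:]]' || {
    echo "no SQL statements found near the top" >&2
    return 1
  }
}
```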

Encoding errors

sql-splitter expects UTF-8 encoding.

Convert encoding:

# Check current encoding
file dump.sql
# Convert from Latin-1 to UTF-8
iconv -f ISO-8859-1 -t UTF-8 dump.sql > dump-utf8.sql
# Convert from Windows-1252
iconv -f CP1252 -t UTF-8 dump.sql > dump-utf8.sql
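To check whether conversion is needed at all, valid UTF-8 can be verified by asking iconv to round-trip the file; invalid byte sequences make it exit non-zero. A small sketch (assumes iconv is installed, as on most Unix systems):

```shell
# Valid UTF-8 passes through iconv unchanged; invalid sequences make it fail
is_utf8() {
  iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Usage: is_utf8 dump.sql || iconv -f ISO-8859-1 -t UTF-8 dump.sql > dump-utf8.sql
```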

Parse errors

Multi-line strings or unusual quoting can cause issues.

Workaround: Try a different dialect:

# PostgreSQL uses different escaping than MySQL
sql-splitter split dump.sql --dialect postgres

“Duplicate primary key” false positives


If you’re validating a dump with intentional duplicates (e.g., for testing):

# Skip PK/FK checks
sql-splitter validate dump.sql --no-fk-checks

Validation is slow

PK/FK validation reads all data. Speed it up:

# Limit rows checked per table
sql-splitter validate dump.sql --max-rows-per-table 10000
# Skip FK checks entirely
sql-splitter validate dump.sql --no-fk-checks

Query errors

Common causes:

  1. SQL syntax: DuckDB uses standard SQL, not MySQL/PostgreSQL extensions

    -- MySQL LIMIT with offset
    SELECT * FROM users LIMIT 10, 5 -- Won't work
    -- Standard SQL
    SELECT * FROM users LIMIT 5 OFFSET 10 -- Works
  2. Column name conflicts: Use quotes for reserved words

    SELECT "order", "user" FROM orders
  3. Large file: Use disk mode

    sql-splitter query huge.sql "SELECT ..." --disk

Query cache issues

# Clear corrupted cache
sql-splitter query --clear-cache
# List cached databases
sql-splitter query --list-cache

Unsupported features during conversion

Some SQL features don’t have direct equivalents across dialects:

  • ENUM types → Converted to VARCHAR with CHECK constraint
  • AUTO_INCREMENT → Converted to SERIAL (PostgreSQL) or IDENTITY (MSSQL)
  • Stored procedures → Skipped (out of scope)

Use --strict to fail on unsupported features instead of warning:

sql-splitter convert dump.sql --to postgres --strict

COPY blocks in converted output

PostgreSQL COPY FROM stdin is converted to INSERT statements:

# This works
sql-splitter convert pg_dump.sql --to mysql -o mysql.sql

If you see raw COPY blocks in output, ensure dialect was detected correctly:

sql-splitter convert pg_dump.sql --from postgres --to mysql -o mysql.sql

Sample or shard output is empty

Check that tables exist:

sql-splitter analyze dump.sql --json | jq '.tables[].name'

Ensure required flags are set:

# sample requires --percent OR --rows
sql-splitter sample dump.sql --percent 10 -o sample.sql
# shard requires --tenant-value OR --tenant-values
sql-splitter shard dump.sql --tenant-value 123 -o tenant.sql

Missing related rows in samples

If related rows are missing:

# Enable FK preservation
sql-splitter sample dump.sql --percent 10 --preserve-relations -o sample.sql
# Use strict mode to catch issues
sql-splitter sample dump.sql --percent 10 --preserve-relations --strict-fk -o sample.sql

Getting more information

Most commands support --progress for visibility:

sql-splitter split dump.sql -o output/ --progress

Use --json for machine-readable output you can inspect:

sql-splitter analyze dump.sql --json | jq '.'

Reporting bugs

If you’ve found a bug:

  1. Check existing issues
  2. Create a new issue with:
    • sql-splitter version (sql-splitter --version)
    • OS and architecture
    • Minimal reproduction steps
    • Sample SQL (anonymized if needed)