S-57 Data Import Workflow Guide¶
Overview¶
The S-57 Data Import Tool (scripts/import_s57.py) is a production-grade command-line utility for converting S-57 Electronic Navigational Chart (ENC) data into GIS-ready formats. It supports three distinct conversion modes and multiple output backends (PostGIS, GeoPackage, SpatiaLite) with comprehensive validation, benchmarking, and update capabilities.
What It Does¶
The tool performs automated S-57 conversion with the following features:
- Three Conversion Modes
  - Base Mode: One-to-one bulk conversion (each ENC → separate output file/schema)
  - Advanced Mode: Layer-centric conversion (all ENCs merged by layer with ENC source tracking)
  - Update Mode: Incremental or force updates to existing datasets
- Multiple Output Formats
  - PostGIS: Server-based database with schema management
  - GeoPackage: SQLite-based portable format (.gpkg)
  - SpatiaLite: SQLite-based spatial format (.sqlite)
- Quality Assurance
  - Pre-flight validation (environment, paths, database connectivity)
  - Post-conversion verification (layer sampling, feature count validation)
  - DSID stamping verification (Advanced mode only)
  - Performance benchmarking
- Advanced Features
  - Parallel file processing for faster conversions
  - Automatic batch size tuning based on available memory
  - Incremental update tracking with change summaries
  - Comprehensive logging and error reporting
Prerequisites¶
Required Software¶
- Python 3.11+
- GDAL 3.10.3
- PostgreSQL with PostGIS extension (for PostGIS output)
- All dependencies listed in environment.yml and requirements.in
Required Data¶
- S-57 ENC files (.000 base files, scanned recursively)
- For update mode: additional ENC directory with updated charts
Database Setup (PostGIS)¶
Ensure PostgreSQL is running:
# Check PostgreSQL service
sudo systemctl status postgresql
# Test connection
psql -h localhost -U postgres -d postgres -c "SELECT version();"
# Create database if needed
createdb -h localhost -U postgres enc_db
psql -h localhost -U postgres -d enc_db -c "CREATE EXTENSION IF NOT EXISTS postgis;"
Installation & Setup¶
1. Clone/Download the Project¶
2. Install Dependencies¶
mamba env update -f environment.yml --prune
pip install uv
uv pip compile requirements.in -o requirements.txt # Optional: skip to use tested snapshot
uv pip install --no-deps -r requirements.txt
uv pip install -e .
3. Configure Database Credentials¶
Create or edit .env file:
# .env
DB_NAME="enc_db"
DB_USER="postgres"
DB_PASSWORD="your_secure_password"
DB_HOST="127.0.0.1"
DB_PORT="5432"
Or pass credentials via command-line arguments:
--db-name enc_db --db-user postgres --db-password your_secure_password --db-host 127.0.0.1 --db-port 5432
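How the two credential sources interact is not spelled out here. A minimal sketch of one plausible precedence (command-line values first, then the environment variables from the .env example) is shown below. The helper name `resolve_db_settings` and the precedence order are assumptions for illustration, not the script's confirmed behavior.

```python
import os

# Maps setting names to the environment variables from the .env example above.
ENV_KEYS = {
    "name": "DB_NAME",
    "user": "DB_USER",
    "password": "DB_PASSWORD",
    "host": "DB_HOST",
    "port": "DB_PORT",
}


def resolve_db_settings(cli_args, environ=None):
    """Merge CLI overrides with environment values (hypothetical helper).

    cli_args: dict such as {"host": "127.0.0.1"}; None values are ignored.
    environ: mapping to read from; defaults to os.environ.
    """
    environ = os.environ if environ is None else environ
    settings = {}
    for key, env_name in ENV_KEYS.items():
        value = cli_args.get(key)
        if value is None:
            # Assumed fallback: use the value loaded from .env.
            value = environ.get(env_name)
        settings[key] = value
    missing = [k for k, v in settings.items() if v in (None, "")]
    if missing:
        raise ValueError(f"missing database settings: {', '.join(missing)}")
    return settings
```

Failing early on a missing value keeps a half-configured connection from surfacing later as a less readable database error.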
Usage Guide¶
Basic Command Structure¶
python scripts/import_s57.py \
--mode {base|advanced|update} \
--input-path /path/to/enc/data \
--output-format {postgis|gpkg|spatialite} \
[additional options]
Mode 1: Base Conversion (One-to-One)¶
Convert each S-57 file to a separate output (simplest mode):
To GeoPackage¶
python scripts/import_s57.py \
--mode base \
--input-path data/ENC_ROOT \
--output-format gpkg \
--output-dir output/by_enc_gpkg
To PostGIS¶
python scripts/import_s57.py \
--mode base \
--input-path data/ENC_ROOT \
--output-format postgis \
--db-host 127.0.0.1 \
--db-user postgres \
--db-password your_secure_password
Output: Separate schema/file for each ENC (US1WC01M, US1EEZ1M, etc.)
Mode 2: Advanced Conversion (Layer-Centric)¶
Merge all ENCs by layer with source tracking (recommended for analysis):
To PostGIS with Verification¶
python scripts/import_s57.py \
--mode advanced \
--input-path data/ENC_ROOT \
--output-format postgis \
--schema enc_west \
--verify \
--db-host 127.0.0.1 \
--db-user postgres
To GeoPackage with Parallel Processing¶
python scripts/import_s57.py \
--mode advanced \
--input-path data/ENC_ROOT \
--output-format gpkg \
--schema enc_west \
--output-dir output \
--enable-parallel \
--max-workers 4 \
--verify
With Custom Batch Size¶
python scripts/import_s57.py \
--mode advanced \
--input-path data/ENC_ROOT \
--output-format postgis \
--schema enc_west \
--batch-size 500 \
--memory-limit-mb 2048
Output: Single merged dataset with dsid_dsnm column tracking source ENC
Mode 3: Update Existing Dataset¶
Update charts in an existing database:
Incremental Update¶
python scripts/import_s57.py \
--mode update \
--update-source data/ENC_ROOT_UPDATE \
--output-format postgis \
--schema enc_west \
--db-host 127.0.0.1 \
--db-user postgres
Force Update (Clean Install)¶
python scripts/import_s57.py \
--mode update \
--update-source data/ENC_ROOT_UPDATE \
--output-format postgis \
--schema enc_west \
--force-update
Force Update Specific ENCs¶
python scripts/import_s57.py \
--mode update \
--update-source data/ENC_ROOT_UPDATE \
--output-format postgis \
--schema enc_west \
--force-update \
--enc-filter US3CA52M US1GC09M US1PO02M
Advanced Options¶
Performance Tuning¶
Parallel Processing¶
--enable-parallel # Enable multi-worker processing
--max-workers 4 # Number of parallel workers (default: 2)
Best for: Multiple ENCs, sufficient RAM (>2GB)
Memory Management¶
--memory-limit-mb 2048 # Total memory limit (default: 1024)
--target-memory-mb 512 # Target per-batch usage (default: 512)
--no-auto-tune # Disable automatic batch size tuning
--batch-size 500 # Manual batch size override
When to use:
- High memory systems: increase --memory-limit-mb
- Limited memory: decrease batch size or reduce workers
- Disable auto-tune only if you know optimal settings
Validation & Reporting¶
--verify # Run post-conversion verification
--skip-validation # Skip pre-flight validation checks
--benchmark-output results.csv # Save performance metrics
--dry-run # Validate config without executing
Behavior Control¶
--overwrite # Overwrite existing outputs
--verbose # Enable debug logging (very detailed)
--quiet # Minimal output (warnings/errors only)
Log File¶
All runs automatically create s57_import.log with full details.
Complete Examples¶
Example 1: Quick Start (Base Mode to GeoPackage)¶
python scripts/import_s57.py \
--mode base \
--input-path data/ENC_ROOT \
--output-format gpkg \
--output-dir output/encs
Expected output:
- Multiple .gpkg files (one per ENC)
- Quick conversion (< 10 minutes for typical dataset)
- No database required
Example 2: Production Advanced Conversion¶
python scripts/import_s57.py \
--mode advanced \
--input-path data/ENC_ROOT \
--output-format postgis \
--schema enc_west \
--enable-parallel \
--max-workers 4 \
--memory-limit-mb 4096 \
--verify \
--benchmark-output benchmarks.csv \
--db-host 127.0.0.1 \
--db-user postgres \
--db-password your_secure_password
Expected results:
- Single PostGIS schema enc_west (the name passed to --schema) with merged layers
- All features stamped with source ENC (dsid_dsnm column)
- Performance metrics saved to benchmarks.csv
- Post-conversion verification report
Example 3: Update Workflow¶
# Initial import
python scripts/import_s57.py \
--mode advanced \
--input-path data/ENC_ROOT \
--output-format postgis \
--schema us_enc_latest \
--verify
# Later: update with new charts
python scripts/import_s57.py \
--mode update \
--update-source data/ENC_UPDATES_2025 \
--output-format postgis \
--schema us_enc_latest \
--db-host 127.0.0.1 \
--db-user postgres
Example 4: Testing with Dry-Run¶
# Validate configuration before execution
python scripts/import_s57.py \
--mode advanced \
--input-path data/ENC_ROOT \
--output-format postgis \
--schema test_import \
--dry-run
# If successful, run actual conversion
python scripts/import_s57.py \
--mode advanced \
--input-path data/ENC_ROOT \
--output-format postgis \
--schema test_import
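The validate-then-run pattern above can also be scripted. The sketch below builds both commands from the flags shown in this guide and only starts the real conversion when the dry run succeeds. It assumes that a failed dry run exits with a non-zero status, which this guide does not confirm; the function names are hypothetical.

```python
import subprocess
import sys


def build_import_command(mode, input_path, output_format, schema, dry_run=False):
    """Assemble the argument list using only flags shown in this guide."""
    cmd = [
        sys.executable, "scripts/import_s57.py",
        "--mode", mode,
        "--input-path", input_path,
        "--output-format", output_format,
        "--schema", schema,
    ]
    if dry_run:
        cmd.append("--dry-run")
    return cmd


def validate_then_run(run=subprocess.run, **kwargs):
    """Run a dry run first; start the real conversion only if it exits with 0.

    Assumption: the script reports validation failure via its exit status.
    """
    check = run(build_import_command(dry_run=True, **kwargs))
    if check.returncode != 0:
        return check.returncode
    return run(build_import_command(**kwargs)).returncode
```

A call such as `validate_then_run(mode="advanced", input_path="data/ENC_ROOT", output_format="postgis", schema="test_import")` mirrors the two commands in this example. The `run` parameter exists so the wrapper can be exercised with a stub instead of a real subprocess.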
Example 5: High-Performance Parallel Conversion¶
python scripts/import_s57.py \
--mode advanced \
--input-path data/ENC_ROOT \
--output-format postgis \
--schema enc_west \
--enable-parallel \
--max-workers 8 \
--memory-limit-mb 8192 \
--batch-size 1000 \
--verbose \
--benchmark-output perf_results.csv
Output Overview¶
Base Mode Outputs¶
GeoPackage: One file per ENC
output/by_enc_gpkg/
├── US1WC01M.gpkg # ~150 MB
├── US1EEZ1M.gpkg # ~80 MB
├── US1GC09M.gpkg # ~120 MB
└── ...
PostGIS: One schema per ENC
Advanced Mode Outputs¶
Single Merged Dataset
PostGIS Schema: enc_west
- lndmrk (landmarks)
- seaare (sea areas)
- soundg (soundings/depth)
- boyspp (buoys)
- lndare (land areas)
- ... (all S-57 layers)
All features include:
- dsid_dsnm: Source ENC name
- All original S-57 attributes
GeoPackage/SpatiaLite:
Verification Output¶
When --verify is used, you see:
POST-CONVERSION VERIFICATION
Testing key layers:
✓ 'lndmrk': 2,451 features
✓ 'seaare': 8,923 features
✓ 'soundg': 156,789 features
✓ 'boyspp': 1,234 features
Verifying feature update status (DSID stamping)...
✓ Feature update status verified
Benchmark Output (CSV)¶
timestamp,mode,output_format,input_path,duration_sec,schema
2025-10-31T14:23:10.123456,advanced,postgis,data/ENC_ROOT,245.67,us_enc_all
2025-10-31T15:45:32.654321,advanced,postgis,data/ENC_ROOT,238.92,us_enc_all
Use for performance tracking across runs.
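To compare runs, the benchmark file can be summarized with the standard library alone. The sketch below averages `duration_sec` per mode, output format, and schema, using the column names from the sample above; the function name is illustrative.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def mean_duration_by_schema(csv_text):
    """Return {(mode, output_format, schema): mean duration in seconds}."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["mode"], row["output_format"], row["schema"])
        groups[key].append(float(row["duration_sec"]))
    return {key: mean(values) for key, values in groups.items()}


# The two sample rows shown above.
SAMPLE = """timestamp,mode,output_format,input_path,duration_sec,schema
2025-10-31T14:23:10.123456,advanced,postgis,data/ENC_ROOT,245.67,us_enc_all
2025-10-31T15:45:32.654321,advanced,postgis,data/ENC_ROOT,238.92,us_enc_all
"""

for key, seconds in mean_duration_by_schema(SAMPLE).items():
    print(key, f"{seconds:.1f} s")
```

For a real file, pass the text of the path given to --benchmark-output, for example `Path("results.csv").read_text()`.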
Performance Expectations¶
Typical Execution Times¶
| Conversion | Dataset Size | Mode | Time | Notes |
|---|---|---|---|---|
| Base | 5 ENCs | base | 5-10 min | One-to-one, simple |
| Advanced | 5 ENCs | advanced | 15-25 min | Layer merge, indexing |
| Advanced | 5 ENCs | advanced + parallel | 10-15 min | 4 workers, merged |
| Update | 2 ENCs | update | 3-5 min | Incremental only |
Factors Affecting Speed¶
- Number of ENCs: More files = longer processing
- Output Format:
- PostGIS: Medium (database I/O)
- GeoPackage: Faster (file-based)
- SpatiaLite: Fastest (simple SQLite)
- Parallel Workers: 2-4 optimal; 8+ may reduce efficiency
- Memory: More memory allows larger batches
- Verification: Adds ~5-10 minutes for detailed checks
Performance Tips¶
- First run: Base mode fastest for testing
- Parallel processing: Enable for 5+ ENCs
- Batch size: Let auto-tuning handle it (remove --batch-size)
- Memory: Use 70-80% of available RAM for --memory-limit-mb
- PostGIS: Ensure database is on local/fast network
Troubleshooting¶
Issue 1: Database Connection Error¶
Solution:
# Check PostgreSQL is running
sudo systemctl status postgresql
# Test connection manually
psql -h 127.0.0.1 -U postgres -d postgres
# Verify credentials in .env
grep '^DB_' .env
Issue 2: GDAL Not Found¶
Solution:
# Reinstall GDAL (exact version required - see WORKFLOW_QUICKSTART.md)
# Using conda for better compatibility; pin the version listed under Prerequisites
mamba install gdal=3.10.3
# Verify installation
python -c "from osgeo import gdal; print(gdal.__version__)"
Issue 3: No S-57 Files Found¶
Solution:
- S-57 files must have the .000 extension (base file)
- Check that the directory exists and contains ENCs
- Ensure read permissions:
ls -la /path/to/data
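The same check can be done from Python. The sketch below lists every .000 base file under a directory, matching the recursive scan described under Required Data. The function name `find_base_cells` is illustrative; the tool's own discovery logic may differ, for example in how it treats upper-case extensions.

```python
from pathlib import Path


def find_base_cells(root):
    """Recursively list S-57 base files (*.000) under root, sorted by path."""
    root = Path(root)
    if not root.is_dir():
        raise NotADirectoryError(f"input path is not a directory: {root}")
    # Update files (.001, .002, ...) are deliberately not matched.
    return sorted(p for p in root.rglob("*.000") if p.is_file())
```

An empty result for a directory that visibly contains charts usually points at the extension or at permissions, which is what the two checks above cover.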
Issue 4: Out of Memory Error¶
Solution:
# Reduce batch size
python scripts/import_s57.py ... --batch-size 250
# Reduce workers
python scripts/import_s57.py ... --enable-parallel --max-workers 2
# Increase available memory (system dependent)
# Or reduce --memory-limit-mb if auto-tune is causing issues
Issue 5: Schema Already Exists¶
Solution:
# Option 1: Use --overwrite to replace
python scripts/import_s57.py ... --overwrite
# Option 2: Use different schema name
python scripts/import_s57.py ... --schema enc_west_v2
# Option 3: Drop existing schema (CAUTION!)
# psql -h 127.0.0.1 -U postgres -d enc_db -c "DROP SCHEMA enc_west CASCADE;"
Issue 6: Verification Shows Missing Layers¶
Solution:
- Not all ENCs contain all layers (normal)
- Check that the source data contains those layer types
- If truly missing, verify ENC files aren't corrupted
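For GeoPackage output, one way to see which layers actually exist is to read the standard `gpkg_contents` table with `sqlite3`. The sketch below reports which expected layers are absent. It assumes layer tables are named after the S-57 object classes shown in the Output Overview (soundg, lndmrk, and so on); the function name is hypothetical.

```python
import sqlite3


def missing_layers(gpkg_path, expected):
    """Return the expected layer names that are absent from a GeoPackage."""
    con = sqlite3.connect(gpkg_path)
    try:
        rows = con.execute(
            "SELECT table_name FROM gpkg_contents WHERE data_type = 'features'"
        ).fetchall()
    finally:
        con.close()
    # Compare case-insensitively, since table name casing may vary.
    present = {name.lower() for (name,) in rows}
    return [layer for layer in expected if layer.lower() not in present]
```

A layer reported as missing here is only a problem if the source ENCs contain that object class in the first place.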
Debugging Steps¶
- Enable verbose logging with --verbose
- Check the generated log file (s57_import.log)
- Validate the configuration with --dry-run
- Verify the PostGIS schema after import
- Check feature counts
Advanced Topics¶
Custom GDAL Configuration¶
The tool automatically configures GDAL S-57 settings:
- RETURN_PRIMITIVES=OFF (return geometries, not primitives)
- SPLIT_MULTIPOINT=ON (separate multipoint features)
- ADD_SOUNDG_DEPTH=ON (extract depth from soundings)
- UPDATES=APPLY (apply update records)
- LNAM_REFS=ON (maintain spatial references)
- RETURN_LINKAGES=ON (return spatial linkages)
- RECODE_BY_DSSI=ON (recode by data source)
No user configuration needed; these are set automatically.
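For reference when reading the same data outside the tool, GDAL's S-57 driver takes these settings as one comma-separated `OGR_S57_OPTIONS` configuration option. The sketch below builds that string and exports it through the process environment, which GDAL consults for configuration options. How the script itself applies the settings is not shown in this guide, so treat this as an illustration only.

```python
import os

# Option values as listed above.
S57_OPTIONS = {
    "RETURN_PRIMITIVES": "OFF",
    "SPLIT_MULTIPOINT": "ON",
    "ADD_SOUNDG_DEPTH": "ON",
    "UPDATES": "APPLY",
    "LNAM_REFS": "ON",
    "RETURN_LINKAGES": "ON",
    "RECODE_BY_DSSI": "ON",
}


def s57_options_string(options):
    """Join options into the comma-separated KEY=VALUE form."""
    return ",".join(f"{key}={value}" for key, value in options.items())


# Set before opening any S-57 dataset; with the GDAL bindings loaded,
# gdal.SetConfigOption("OGR_S57_OPTIONS", ...) is the equivalent call.
os.environ["OGR_S57_OPTIONS"] = s57_options_string(S57_OPTIONS)
```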
Batch Size Tuning¶
Auto-tuning works as follows:
- System detects available RAM
- Allocates % for batch processing
- Dynamically adjusts batch size per ENC
- Disabled if --batch-size is specified manually
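The steps above can be sketched as a small calculation: take the smallest of the per-batch target, the overall limit, and the memory actually free, then divide by an estimated per-feature footprint. The defaults for `memory_limit_mb` and `target_memory_mb` come from the Memory Management flags; `est_feature_kb`, `min_batch`, and `max_batch` are made-up illustrative values, and the tool's real heuristic is not documented here.

```python
def tune_batch_size(available_mb, memory_limit_mb=1024, target_memory_mb=512,
                    est_feature_kb=64, min_batch=50, max_batch=5000):
    """Pick a batch size so one batch fits the per-batch memory target.

    est_feature_kb, min_batch and max_batch are illustrative values, not
    figures taken from the tool.
    """
    # Never plan for more memory than the configured limit or what is free.
    budget_mb = min(target_memory_mb, memory_limit_mb, available_mb)
    batch = int(budget_mb * 1024 // est_feature_kb)
    # Clamp so tiny or huge budgets still give a workable batch size.
    return max(min_batch, min(batch, max_batch))
```

With these illustrative numbers, a machine with plenty of free memory hits the upper clamp, while a constrained one gets a proportionally smaller batch.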
Disable auto-tuning only if you have specific requirements:
Parallel Processing Safety¶
Parallel mode parallelizes reads only:
- Multiple workers read different ENCs simultaneously
- Write operations are still serialized (prevents corruption)
- Validation level is set to strict for safety
- No data loss or consistency issues
Safe to use with confidence.
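The read-in-parallel, write-serially pattern can be illustrated with `concurrent.futures`. In the sketch below, worker threads read ENCs concurrently while every write happens in the calling thread, so writes never overlap. The callables `read_enc` and `write_features` are placeholders, and the use of threads rather than processes is an assumption; the tool's internals may differ.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def convert_all(enc_paths, read_enc, write_features, max_workers=2):
    """Read ENCs concurrently; write results one at a time in the caller's thread."""
    written = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(read_enc, path): path for path in enc_paths}
        for future in as_completed(futures):
            path = futures[future]
            # Only this loop writes, so write operations stay serialized.
            write_features(path, future.result())
            written.append(path)
    return written
```

Results are written in completion order, not input order, which is why each write carries the path it belongs to.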
Incremental vs Force Updates¶
Incremental Update (--mode update without --force-update):
- Compares timestamps with existing data
- Only updates modified ENCs
- Faster for periodic updates
- Preserves unmodified data
Force Update (--mode update --force-update):
- Removes old data for specified ENCs
- Reimports from source
- Slower but ensures clean state
- Use when data corruption suspected
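The difference between the two update styles comes down to which ENCs get re-imported. A minimal sketch of that selection follows. It assumes each ENC has a comparable timestamp (dates are used here) and that `--enc-filter` narrows the candidate set; which timestamp the tool actually compares is not stated in this guide, and the function name is hypothetical.

```python
from datetime import date


def select_encs_to_update(existing, incoming, force=False, enc_filter=None):
    """Return the ENC names that should be re-imported.

    existing / incoming map ENC name -> timestamp of the stored / new chart.
    """
    names = set(incoming)
    if enc_filter is not None:
        names &= set(enc_filter)
    if force:
        # Force update: re-import everything selected, regardless of age.
        return sorted(names)
    # Incremental update: only new ENCs or ones with a newer timestamp.
    return sorted(
        name for name in names
        if name not in existing or incoming[name] > existing[name]
    )


existing = {"US1WC01M": date(2025, 1, 10), "US1GC09M": date(2025, 3, 1)}
incoming = {"US1WC01M": date(2025, 6, 2), "US1GC09M": date(2025, 3, 1)}
print(select_encs_to_update(existing, incoming))
```

In this example only US1WC01M is newer than the stored copy, so the incremental path skips US1GC09M while a forced run would re-import both.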
Source Tracking (dsid_dsnm)¶
In Advanced mode, each feature includes dsid_dsnm (data source name):
-- Find features from specific ENC
SELECT * FROM enc_west.seaare
WHERE dsid_dsnm = 'US1WC01M';
-- Count features per ENC
SELECT dsid_dsnm, COUNT(*)
FROM enc_west.soundg
GROUP BY dsid_dsnm
ORDER BY COUNT(*) DESC;
Useful for:
- Auditing which ENC contributed features
- Identifying update sources
- Validating data completeness
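The per-ENC count also works against GeoPackage or SpatiaLite output through Python's `sqlite3` module, since both are SQLite files. The sketch below runs the query on an in-memory stand-in table so it is runnable without real data; the table contents are invented, and it assumes file-based outputs name their layer tables like the PostGIS ones.

```python
import sqlite3


def features_per_enc(con, layer):
    """Count features per source ENC in one layer table, largest first."""
    # Layer names cannot be bound as parameters, so validate before quoting.
    if not layer.isidentifier():
        raise ValueError(f"unexpected layer name: {layer!r}")
    query = (
        f'SELECT dsid_dsnm, COUNT(*) FROM "{layer}" '
        "GROUP BY dsid_dsnm ORDER BY COUNT(*) DESC, dsid_dsnm"
    )
    return con.execute(query).fetchall()


# Stand-in table so the sketch runs without a real GeoPackage.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE soundg (dsid_dsnm TEXT, depth REAL)")
con.executemany(
    "INSERT INTO soundg VALUES (?, ?)",
    [("US1WC01M", 12.0), ("US1WC01M", 15.5), ("US1GC09M", 8.2)],
)
print(features_per_enc(con, "soundg"))
```

For a real file, replace the in-memory connection with `sqlite3.connect("path/to/output.gpkg")`.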
Comparison: Base vs Advanced¶
| Aspect | Base | Advanced |
|---|---|---|
| Output structure | Separate per ENC | Single merged |
| File count | Many (one/ENC) | One (or one/format) |
| Layer merge | No | Yes, with tracking |
| Query complexity | Simple (single schema) | Moderate (single table) |
| Source tracking | File name | dsid_dsnm column |
| Update capability | Manual | Automatic |
| Best for | Quick testing | Production use |
| Speed | Faster | Slower (more processing) |
Next Steps¶
After Successful Import¶
- Verify data quality
- Analyze coverage
- Create visualizations:
  - Open GeoPackage in QGIS
  - Use docs/notebooks/layers_inspect.ipynb
  - Export to web-compatible format
- Build routing graphs:
  - See WORKFLOW_POSTGIS_GUIDE.md
  - Use imported data for maritime graph creation
- Schedule updates:
  - Monitor NOAA ENC updates
  - Run update workflow periodically
  - Track changes with benchmarks
Related Documentation¶
- Script: scripts/import_s57.py
- Setup: docs/getting-started/workflow-quickstart.md
- PostGIS Workflow: docs/user-guides/workflow-postgis-guide.md
- GeoPackage Workflow: docs/user-guides/workflow-geopackage-guide.md
- Notebooks:
  - docs/notebooks/import_s57.ipynb - Detailed examples
  - docs/notebooks/layers_inspect.ipynb - Layer analysis
  - docs/notebooks/s57utils.ipynb - Utility functions
Support & Feedback¶
For issues, questions, or improvements:
- Check logs: tail s57_import.log
- Run with --verbose for debug details
- Verify environment: python scripts/import_s57.py --dry-run
- See troubleshooting section above