Troubleshooting Guide¶
This guide covers common issues you may encounter when working with the Nautical Graph Toolkit and their solutions.
Table of Contents¶
- Windows PowerShell & Mamba Issues ⚠️ Windows Users
- PyCharm Conda Integration Error
- Jupyter Kernel Issues
- SQLite RTREE Issues ⚠️ Most Common
- GeoPackage File I/O Issues
- Environment Setup Issues
- Documentation Build Issues
- GDAL/PROJ Database Warnings
- Port Selection Issues
- Database Connection Issues
- VACUUM ANALYZE fails with "No space left on device" (Docker shm)
- Buffer Zone TopologyException ⚠️ PostGIS Complex Coastlines
- Data Source Issues
- S57Updater: File-Based Backend Safety ⚠️ Important
- Graph Creation Issues
- Performance Issues
- Visualization Issues
- Pathfinding Issues
Windows PowerShell & Mamba Issues¶
Issue: Mamba/Conda works in Command Prompt but fails in PowerShell¶
Platform: Windows only
Symptoms:
- mamba commands work in Command Prompt (cmd.exe)
- PowerShell shows: The term 'mamba' is not recognized as the name of a cmdlet...
- Environment exists but mamba activate fails with prefix errors
Root Cause: The Mamba/Conda installer initialized Command Prompt but did not write PowerShell startup scripts.
Part 1: Fix "The term 'mamba' is not recognized"¶
Symptoms:
mamba : The term 'mamba' is not recognized as the name of a cmdlet, function, script file,
or operable program. Check the spelling of the name, or if a path was included, verify the
path is correct and try again.
The Fix:
- Open Command Prompt (cmd.exe) (not PowerShell)
- Run the initialization command:
:: Option A: If mamba is already working in CMD
mamba shell init --shell powershell --root-prefix "C:\Users\<YourUser>\miniforge3"
:: Option B: Using the full path (replace <YourUser> with your username)
"C:\Users\<YourUser>\miniforge3\Library\bin\mamba.exe" shell init --shell powershell --root-prefix "C:\Users\<YourUser>\miniforge3"
- Close the terminal and open a new PowerShell window
Part 2: Fix "Running scripts is disabled on this system"¶
Symptoms:
File C:\Users\<YourUser>\miniforge3\shell\condabin\conda-hook.ps1 cannot be loaded because
running scripts is disabled on this system. For more information, see about_Execution_Policies
at https:/go.microsoft.com/fwlink/?LinkID=135170.
The Fix:
- Open PowerShell as Administrator (Right-click > Run as Administrator)
- Run the following command:
- Type Y and press Enter if prompted
Why this is safe:
- RemoteSigned allows local scripts (like Mamba's) to run
- Only scripts downloaded from the internet need signing
- This is the recommended policy for development work
Part 3: Fix "Cannot activate, prefix does not exist"¶
Symptoms:
critical libmamba Cannot activate, prefix does not exist at: 'C:\Users\...\miniforge3\envs\nautical'
But mamba env list shows:
Root Cause: Mamba is looking for environments in the default folder (miniforge3\envs), but your environment is stored in a user directory (.local\share\mamba\envs).
The Fix:
Add the hidden directory to the configuration search path:
# Using conda is often more reliable for config changes than mamba
conda config --append envs_dirs C:\Users\<YourUser>\.local\share\mamba\envs
# Verify activation works
mamba activate nautical
Alternative Fix (Manual Config Edit):
If the command above fails, edit the configuration file manually:
- Navigate to C:\Users\<YourUser>\
- Open .condarc with a text editor (Notepad/VS Code)
- Add the path under envs_dirs:
- Save and restart PowerShell
Quick Verification Steps¶
After applying the fixes above, verify everything works:
# Test 1: Mamba is recognized
mamba --version
# Expected: mamba 1.x.x
# Test 2: Environment can be listed
mamba env list
# Expected: Should show 'nautical' environment
# Test 3: Environment can be activated
mamba activate nautical
# Expected: No error, prompt changes to (nautical)
# Test 4: Python is available
python --version
# Expected: Python 3.11.x
# Test 5: GDAL is installed
python -c "from osgeo import gdal; print(f'GDAL {gdal.__version__}')"
# Expected: GDAL 3.10.3
Cheat Sheet: CMD vs PowerShell¶
| Command | CMD | PowerShell |
|---|---|---|
| Activate | mamba activate nautical | mamba activate nautical |
| List envs | mamba env list | mamba env list |
| Install | mamba install package | mamba install package |
| Init shell | Already done by installer | Run: mamba shell init --shell powershell |
Common Windows-Specific Issues¶
Issue: Miniforge Prompt vs PowerShell
The Miniforge installer creates a "Miniforge Prompt" shortcut that pre-loads Conda/Mamba. However, you can use standard PowerShell with the fixes above.
Recommendation:
- Use Miniforge Prompt for quick setup (works out of the box)
- Use PowerShell for development (requires shell init, but better integration with Windows tools)
Issue: Path too long errors
Windows has a 260 character path limit. If you encounter:
Solution: Enable long path support in Windows 10/11:
- Run PowerShell as Administrator
- Execute:
New-ItemProperty -Path "HKLM:\SYSTEM\CurrentControlSet\Control\FileSystem" -Name "LongPathsEnabled" -Value 1 -PropertyType DWORD -Force
- Restart your computer
Issue: PyCharm Conda integration error (Windows)¶
Platform: Windows only
Symptoms:
- Error appears when launching PyCharm with Miniforge/Conda environment configured
- Loading personal and system profiles took 1635ms followed by error
- FileNotFoundError: conda-hook.ps1 in temp directory (_MEI160602)
- CONDA_ROOT environment variable points to temp directory instead of Miniforge installation
- Base environment path points to temp directory
Example error:
Loading personal and system profiles took 1635ms.
# >>>>>>>>>>>>>>>>>>>>>> ERROR REPORT <<<<<<<<<<<<<<<<<<<<<<
Traceback (most recent call last):
File "conda\exception_handler.py", line 18, in __call__
File "conda\cli\main.py", line 87, in main_sourced
File "conda\activate.py", line 238, in execute
File "conda\activate.py", line 220, in hook
File "pathlib.py", line 1027, in read_text
File "pathlib.py", line 1013, in open
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\<YourUser>\\AppData\\Local\\Temp\\_MEI160602\\conda\\shell\\condabin\\conda-hook.ps1'
Root Cause: PyCharm's Conda integration attempts to use Conda hooks, but the PowerShell initialization is incomplete or corrupted. The temp directory reference indicates PyCharm is using a bundled/extracted Conda executable instead of the full Miniforge installation.
The Fix:
- Open Miniforge Prompt (or Command Prompt with Miniforge in PATH)
- Run the PowerShell initialization:
- Close all terminals and PyCharm
- Restart PyCharm. The error should no longer appear.
Alternative Fix (if above doesn't work):
- Open PowerShell as Administrator
- Re-initialize Conda:
- Restart PyCharm
Prevention:
To avoid this issue in the future:
- Always use Miniforge Prompt (not standard PowerShell) when running conda init
- Ensure Miniforge is properly installed with full PATH access
- Avoid modifying Conda environment variables manually in PyCharm settings
See also:
- PyCharm documentation: Configuring Conda
- Miniforge documentation: Installation
Jupyter Kernel Issues¶
Issue: "Nautical" kernel not found in Jupyter¶
Symptoms:
- Kernel "Nautical Toolkit" or "nautical" not available when creating notebooks
- Jupyter shows only "Python 3" or "python3" kernel
- IDE shows "No Python interpreter found" or wrong version
Cause: Jupyter kernel not created or registered incorrectly.
Solution:
- Verify you're in the correct environment:
- Verify Python path:
  Expected paths:
  - Windows: C:\Users\<YourUser>\.local\share\mamba\envs\nautical\python.exe
  - Linux: /home/<user>/miniforge3/envs/nautical/bin/python
  - macOS: /Users/<user>/miniforge3/envs/nautical/bin/python
- Create the kernel:
- Verify kernel installation:
  Should show:
- If kernel still not found, remove and reinstall:
- Restart Jupyter:
Issue: IDE shows wrong Python version for kernel¶
Symptoms:
- IDE Python interpreter points to system Python instead of Conda environment
- Packages not found even though they're installed
- Import errors:
ModuleNotFoundError: No module named 'nautical_graph_toolkit'
Solution:
- Get the correct Python path:
- Configure IDE with this path:
  PyCharm:
  - File → Settings → Project → Python Interpreter
  - Click gear icon → Add
  - Select "Conda Environment" → "Existing"
  - Paste the path from step 1
  - Apply → OK
  VS Code:
  - Open Command Palette (Ctrl+Shift+P)
  - "Python: Select Interpreter"
  - "Enter interpreter path..."
  - Paste the path from step 1
  - Press Enter
- Verify in IDE:
  - Create new Python file or notebook
  - Run: import sys; print(sys.executable)
  - Should match the path from step 1
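The manual comparison in the last step can also be automated. The following is a minimal sketch, not part of the toolkit: the helper name `interpreter_in_env` is hypothetical, and it only checks that some parent directory of the interpreter is named after the environment (assumed here to be `nautical`).

```python
import sys
from pathlib import PurePosixPath, PureWindowsPath


def interpreter_in_env(executable: str, env_name: str = "nautical") -> bool:
    """Return True if `executable` sits below a directory named `env_name`."""
    # Parse with Windows rules when the path contains backslashes.
    path_cls = PureWindowsPath if "\\" in executable else PurePosixPath
    # Ignore the final component (python / python.exe itself).
    return env_name in path_cls(executable).parts[:-1]


# Run this inside the IDE or notebook whose interpreter you want to check.
print(sys.executable, interpreter_in_env(sys.executable))
```

If the second value printed is False, the IDE is most likely still pointing at a system Python rather than the Conda environment.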
Issue: Kernel dies immediately when starting¶
Symptoms:
- Kernel starts but immediately disconnects
- "Dead kernel" message in Jupyter
- Notebook cells won't execute
Possible Causes & Solutions:
- Corrupted kernel specification:
- Missing ipykernel package:
- Conflicting Jupyter installations:
- Check Jupyter logs for errors:
SQLite RTREE Issues¶
Issue: "no such module: rtree" error¶
Symptoms:
sqlite3.OperationalError: no such module: rtree
# or during enrichment:
✗ Enrichment failed: no such module: rtree
Cause:
- GeoPackage and SpatiaLite backends require SQLite with RTREE support
- Conda's sqlite package may not be installed or environment not activated
- SpatiaLite uses RTREE for spatial indexing (10-100x performance improvement)
Solution: This project uses Conda's sqlite package which provides RTREE support on all platforms (Linux, macOS ARM/Intel, Windows).
- Verify installation:
- Test RTREE availability:
- If still failing:
Why this happens:
- Python's built-in sqlite3 may not have RTREE compiled in
- Conda's sqlite package provides RTREE-enabled SQLite on all platforms
- The sqlite package is included in environment.yml for cross-platform support
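Whether the active interpreter's SQLite build includes RTREE can be probed directly from Python. This is a minimal sketch using only the standard library; the helper name `has_rtree` and the probe table name are illustrative, not part of the toolkit.

```python
import sqlite3


def has_rtree() -> bool:
    """Return True if this Python's SQLite build includes the RTREE module."""
    conn = sqlite3.connect(":memory:")
    try:
        # Creating an R*Tree virtual table only succeeds when RTREE is compiled in.
        conn.execute(
            "CREATE VIRTUAL TABLE rtree_probe USING rtree(id, minx, maxx, miny, maxy)"
        )
        return True
    except sqlite3.OperationalError:
        # Typically reported as "no such module: rtree".
        return False
    finally:
        conn.close()


print(f"SQLite {sqlite3.sqlite_version}, RTREE available: {has_rtree()}")
```

Run it with the environment activated; a False result suggests the interpreter is not using the RTREE-enabled SQLite from the Conda environment.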
Platform notes:
- Linux: Works with Conda's sqlite
- macOS (ARM & Intel): Works with Conda's sqlite (pysqlite3-binary has compatibility issues)
- Windows: Works with Conda's sqlite (pysqlite3-binary has compatibility issues)
Affected operations:
- enrich_edges_with_features_gpkg_v3()
- apply_static_weights_gpkg()
- calculate_dynamic_weights_gpkg()
- calculate_directional_weights_gpkg()
- All GeoPackage/SpatiaLite spatial queries
See also: docs/getting-started/setup.md - "SQLite RTREE Requirement" section
GeoPackage File I/O Issues¶
Issue: Pyogrio warnings during GeoPackage save/load¶
Symptoms:
RuntimeWarning: Value '(-118.212, 33.505)' of field edges.weight parsed incompletely to real 0.
# Multiple similar warnings during save_graph_to_gpkg()
Status: ✅ RESOLVED in current version
The codebase now uses the Fiona engine for GeoPackage read/write operations, which provides better fault tolerance than pyogrio:
What changed:
- All gpd.read_file() operations now use engine='fiona' for better reliability
- Initial graph writes to GeoPackage use engine='fiona'
- Append operations (mode='a') continue using pyogrio (more stable for this operation)
Technical details:
- Read operations (10 locations): gpd.read_file(path, layer=..., engine='fiona')
- Write operations (4 locations): gdf.to_file(path, layer=..., engine='fiona')
- Append operations (6 locations): gdf.to_file(path, layer=..., mode='a') (pyogrio default)
Why this helps:
- Fiona provides direct GDAL/OGR interface without intermediate layers
- Better handling of edge cases and type conversion
- More robust field parsing
If you still see warnings:
- Ensure you're using the latest version:
- Verify fiona is properly installed:
- Check GeoPackage file integrity:
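Because a GeoPackage is a SQLite database, one way to check its integrity is SQLite's own `PRAGMA integrity_check`. The sketch below is an illustration under that assumption; the helper name `gpkg_integrity` is hypothetical, and the file is opened read-only so the check cannot modify or accidentally create it.

```python
import sqlite3
from pathlib import Path


def gpkg_integrity(path: str) -> list:
    """Run PRAGMA integrity_check on a GeoPackage; a healthy file yields ['ok']."""
    # Read-only URI: raises sqlite3.OperationalError if the file does not exist.
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        return [row[0] for row in conn.execute("PRAGMA integrity_check")]
    finally:
        conn.close()
```

Any result other than `['ok']` lists the problems SQLite found, which is a strong hint that the file should be regenerated.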
Environment Setup Issues¶
Issue: ModuleNotFoundError when importing nautical_graph_toolkit¶
Symptoms:
Solutions:
- Ensure you've installed the package:
- Verify the src directory is in your Python path:
Issue: Missing environment variables¶
Symptoms:
Solutions:
- Copy .env.example to .env:
- Edit .env and fill in your actual values
- Ensure load_dotenv() is called before accessing environment variables
- For Mapbox token, get one from: https://account.mapbox.com/access-tokens/
Documentation Build Issues¶
Issue: MkDocs git-revision-date plugin errors¶
Symptoms:
Root Cause: The mkdocs-git-revision-date-localized-plugin queries Git history to show file modification dates. It fails when:
- Building in non-Git directories (ZIP downloads, CI shallow clones)
- Git history isn't available or is incomplete
- Working in detached HEAD state or fresh clones
Impact:
- mkdocs build or mkdocs serve fails completely
- Documentation cannot be previewed locally
- CI/CD pipelines may break
Solution 1: Disable the plugin (Quick Fix)¶
Set the environment variable to disable Git revision dates:
# Linux/macOS
ENABLE_GIT_REVISION=false mkdocs serve
# Windows PowerShell
$env:ENABLE_GIT_REVISION="false"; mkdocs serve
# Windows Command Prompt
set ENABLE_GIT_REVISION=false && mkdocs serve
Solution 2: Ensure Git history is available¶
If you want to keep the plugin enabled:
# Check if you're in a Git repository
git status
# If not, initialize or clone properly
git clone --depth=1 https://github.com/studentdotai/Nautical-Graph-Toolkit.git
# For full history (needed for accurate dates):
git fetch --unshallow
Solution 3: Configure MkDocs for fallback (Already Done)¶
The mkdocs.yml has been configured with graceful fallback:
- git-revision-date-localized:
enable_creation_date: true
type: date # Static dates instead of dynamic "timeago"
fallback_to_build_date: true # Uses build date when Git unavailable
enabled: !ENV [ENABLE_GIT_REVISION, true] # Can disable via env var
Benefits:
- Uses Git dates when available
- Falls back to build date when Git is unavailable
- Can be disabled entirely with ENABLE_GIT_REVISION=false
- Uses static date format instead of dynamic "timeago" (prevents unnecessary rebuilds)
Understanding the Configuration¶
Why type: date instead of type: timeago?
- timeago: Shows "2 days ago", "1 month ago" - changes on every build
- date: Shows "January 15, 2025" - only changes when file is actually modified
- date is better for documentation as it's stable and doesn't trigger unnecessary rebuilds
When does it use fallback dates?
- Non-Git directories
- Shallow clones (common in CI/CD)
- Files not tracked by Git
- Git errors or unavailable history
Quick Commands:
# Preview docs locally (with Git dates if available)
mkdocs serve
# Preview docs without Git dates (faster, no Git requirement)
ENABLE_GIT_REVISION=false mkdocs serve
# Build production docs with Git dates
mkdocs build
# Build without Git dates
ENABLE_GIT_REVISION=false mkdocs build
GDAL/PROJ Database Warnings¶
Issue: PROJ database path warning during GDAL operations¶
Symptoms:
ERROR 1: PROJ: proj_create_from_database: Open of /home/vikont/miniforge3/envs/nautical/share/proj failed
Status: ✅ NON-BLOCKING - All operations complete successfully
Cause:
- GDAL 3.10.3 has stricter PROJ database path requirements
- Conda/mamba environments may have multiple PROJ installations
- The PROJ_LIB environment variable workaround in notebooks is incomplete for some internal GDAL operations
Impact:
- No functional impact: All notebooks run successfully despite warning
- Coordinate transformations work correctly: GDAL falls back to built-in coordinate system definitions
- Warning appears repeatedly: Once per GDAL/OGR initialization in notebook cells
- Affects all notebooks: PostGIS, GeoPackage, SpatiaLite, and utility notebooks all show this warning
Solution Options:
- Ignore the warning (Recommended):
  - All operations complete successfully
  - No data corruption or incorrect coordinate transformations
  - Warning can be safely ignored for development and analysis work
- Suppress warnings in notebooks (if output noise is distracting):
- Verify PROJ installation (diagnostic):
- Reinstall GDAL/PROJ (if needed for other reasons):
Why this happens:
- Conda environments may have multiple PROJ versions installed across different packages
- GDAL's C library may link to system PROJ instead of Conda PROJ at runtime
- The Python-level os.environ['PROJ_LIB'] setting in notebooks doesn't affect C-level GDAL initialization
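As a diagnostic, you can check whether a `proj.db` file actually exists where the active environment would normally keep it. This is a sketch under stated assumptions: the helper name `find_proj_db` is hypothetical, and the two candidate locations reflect the usual Conda layouts (`share/proj` on Linux/macOS, `Library/share/proj` on Windows), which may differ in other installations.

```python
import sys
from pathlib import Path
from typing import Optional


def find_proj_db(prefix: str = sys.prefix) -> Optional[Path]:
    """Return the first proj.db found under the environment prefix, else None."""
    candidates = [
        Path(prefix) / "share" / "proj" / "proj.db",              # Linux/macOS layout
        Path(prefix) / "Library" / "share" / "proj" / "proj.db",  # Windows layout
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


print("proj.db:", find_proj_db())
```

If this prints None for the environment prefix shown in the warning, the PROJ data files are missing or installed elsewhere.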
Affected operations:
- Initial GDAL/OGR driver registration during from osgeo import ogr
- Coordinate reference system initialization in notebooks
- All notebook import cells that load GDAL/GeoPandas
Version notes:
- GDAL 3.10.3: Warning appears consistently
- GDAL 3.11.3: Warning not observed (stricter path requirements relaxed)
- Upgrading to GDAL 3.11+ may eliminate warning if desired
See also:
- docs/getting-started/setup.md - GDAL installation instructions
- GDAL Issue Tracker: https://github.com/OSGeo/gdal/issues
GDAL 3.11+ Driver Deprecations¶
Issue: Memory driver deprecated in GDAL 3.11¶
Status: ⚠️ FUTURE ISSUE - Affects GDAL 3.11.3+
Symptoms (will occur after upgrade to GDAL 3.11+):
AttributeError: 'NoneType' object has no attribute 'CreateDataSource'
# or
OSError: Cannot open Memory driver
Background:
- In GDAL 3.11+, the Memory driver is deprecated
- Its functionality has been merged into the MEM driver
- The S-57 conversion pipeline uses in-memory datasets for batch processing
Affected code locations:
- src/nautical_graph_toolkit/core/s57_data.py:908 - ogr.GetDriverByName('Memory')
Solution (when upgrading to GDAL 3.11+):
- Change all occurrences of 'Memory' to 'MEM'
- Replace: ogr.GetDriverByName('Memory') With: ogr.GetDriverByName('MEM')
Current status:
- Project currently uses GDAL 3.10.3 (Memory driver still available)
- This is marked for v0.2.0+ release cycle
- A reminder comment has been added to the code
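If the code has to run on both GDAL 3.10 and 3.11+, the driver name could be chosen from the installed version instead of being hard-coded. The following is only a sketch of that idea: `in_memory_driver_name` is a hypothetical helper, not something the toolkit provides, and it relies on the version cut-off described above.

```python
def in_memory_driver_name(gdal_version: str) -> str:
    """Pick the in-memory vector driver name for a GDAL version string."""
    major, minor = (int(part) for part in gdal_version.split(".")[:2])
    # 'Memory' is deprecated from 3.11 onward; 'MEM' takes over its role.
    return "MEM" if (major, minor) >= (3, 11) else "Memory"


# Intended use (requires GDAL, so shown as a comment only):
#   from osgeo import gdal, ogr
#   driver = ogr.GetDriverByName(in_memory_driver_name(gdal.__version__))
print(in_memory_driver_name("3.10.3"), in_memory_driver_name("3.11.3"))
```

Checking the result of `GetDriverByName` for `None` before calling `CreateDataSource` would also turn the `AttributeError` above into a clearer message.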
Port Selection Issues¶
Issue: Port not found error¶
Symptoms:
Solutions:
- List all available ports to verify the correct name:
- Check spelling and capitalization - port names are case-sensitive:
- Search for ports by partial name:
Issue: Empty port geometry¶
Symptoms:
Solutions:
- The port was found but has missing geometry data
- Try searching for an alternative nearby port
- Check the custom_ports.csv file for data integrity
Database Connection Issues¶
Requirement: PostgreSQL Version¶
The Nautical Graph Toolkit requires PostgreSQL 16+ with PostGIS extension.
Verify your PostgreSQL version:
Issue: PostgreSQL connection failed (PostGIS)¶
Symptoms:
psycopg2.OperationalError: could not connect to server
# or
sqlalchemy.exc.OperationalError: connection refused
Solutions:
- Verify .env file contains correct credentials:
- Test connection manually:
- Check PostgreSQL service is running:
- Verify PostGIS extension is installed:
- Check firewall/port accessibility:
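A plain TCP connection attempt is often enough to tell a firewall or port problem apart from a credentials problem. The sketch below uses only the standard library; the helper name `port_is_reachable` is hypothetical, and the host and port in the example are assumptions (5432 is PostgreSQL's default port).

```python
import socket


def port_is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False


# Replace with the host and port from your .env file.
print(port_is_reachable("localhost", 5432, timeout=1.0))
```

If this returns False, the server is not listening on that address or something between you and it is blocking the port; if it returns True, look at credentials and database names instead.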
Issue: Schema not found¶
Symptoms:
Solutions:
- List available schemas:
- Create the schema if it doesn't exist:
- Verify you're using the correct schema name in your code:
Issue: VACUUM ANALYZE fails with "No space left on device" (Docker shm)¶
Symptoms:
WARNING - VACUUM ANALYZE failed (non-critical, autovacuum will handle it):
(psycopg2.errors.DiskFull) could not resize shared memory segment
"/PostgreSQL.4141375306" to 67128832 bytes: No space left on device
[SQL: VACUUM ANALYZE "graph"."fine_graph_open11_20_edges"]
This appears after enrich_edges_with_features_postgis() completes successfully.
Root Cause:
Despite the "No space left on device" message, this is not a disk space error. It is a Docker container shared memory exhaustion issue.
PostgreSQL uses POSIX shared memory (/dev/shm) during VACUUM operations. Docker containers have a default /dev/shm limit of 64MB, regardless of how much RAM your host machine has. The VACUUM resize request of ~64MB hits this container limit.
Diagnosis:
Check the actual shm_size of your running container:
docker inspect postgis_nautical --format '{{.HostConfig.ShmSize}}'
# 67108864 ← 64MB (Docker default)
# should be 4294967296 (4GB) if docker-compose.linux.yml was applied correctly
If the container shows 67108864 (64MB), it was started without the shm_size: 4gb setting from the compose file (e.g., started manually or before the setting was added).
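The numbers in the error message and in the docker inspect output can be compared directly. This short worked example simplifies by treating the requested segment as the only user of /dev/shm; the variable and function names are illustrative.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

docker_default_shm = 64 * MIB    # 67,108,864 bytes, the Docker default
configured_shm = 4 * GIB         # 4,294,967,296 bytes, shm_size: 4gb
requested_segment = 67_128_832   # size PostgreSQL asked for in the error


def fits_in_shm(requested: int, shm_size: int) -> bool:
    """Return True if a single segment of `requested` bytes fits in /dev/shm."""
    return requested <= shm_size


# The request exceeds the default limit by only 19,968 bytes.
overshoot = requested_segment - docker_default_shm
print(fits_in_shm(requested_segment, docker_default_shm), overshoot)
print(fits_in_shm(requested_segment, configured_shm))
```

The request is barely larger than the default limit, which is why the failure appears even on hosts with plenty of RAM and free disk.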
Solution:
Recreate the container so the shm_size: 4gb in docker-compose.linux.yml takes effect:
The postgis_data volume is persistent — no data will be lost.
After restart, verify:
Impact:
- The enrichment itself (spatial joins, feature updates) completes successfully — VACUUM is post-processing only
- PostgreSQL's autovacuum will eventually reclaim dead tuples from the UPDATE-heavy enrichment loop
- Re-running enrichment on a restarted container will complete the VACUUM ANALYZE step
Configuration reference (docker-compose.linux.yml):
services:
db:
image: postgis/postgis:16-3.4
shm_size: 4gb # ← This sets /dev/shm inside the container
command: >
postgres
-c shared_buffers=4GB
-c maintenance_work_mem=1GB # Used by VACUUM
-c work_mem=128MB
Buffer Zone TopologyException¶
Issue: TopologyException: unable to assign free hole to a shell during buffer zone classification¶
Symptoms:
psycopg2.errors.InternalError_: lwgeom_unaryunion_prec: GEOS Error: TopologyException:
unable to assign free hole to a shell at -117.905029 33.613684999999997
Occurs during [BUFFER ZONES PostGIS] step, typically at the first ring materialization (ring_3_0).
Root Cause:
ST_SimplifyPreserveTopology collapses narrow coastal waterways (rivers, channels, harbor inlets) when the tolerance exceeds the feature width. This creates isolated interior rings ("free holes") in the land polygon — polygons with holes that have no valid parent shell in GEOS's topology graph.
The error fires inside ST_Difference(buffer, land) when GEOS tries to node the intersection of two very complex geometries. Even ST_MakeValid and ST_Buffer(geom, 0) cannot fix it because the topology break occurs internally during the boolean operation, not in the input geometries.
Affected coordinate: -117.905, 33.614 (Newport Beach / Long Beach harbor area, Southern California). Other complex coastlines with narrow channels may also trigger it.
Workaround:
Reduce simplify_tolerance in your config. The default 0.0005 (~55m) can collapse channels narrower than 55m. Reducing to 0.0001 (~11m) preserves most narrow features:
# config/workflow_config.yml
weighting:
buffer_zones:
simplify_tolerance: 0.0001 # ~11m (default was 0.0005 ~55m)
Trade-off: lower tolerance = larger land geometry = slightly longer buffer zone processing time.
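The metre equivalents quoted for the tolerance follow from the length of one degree of latitude, roughly 111,320 m. The sketch below reproduces that arithmetic; the helper name and the constant are approximations chosen for illustration, and the east-west figure additionally shrinks with the cosine of the latitude.

```python
import math

METERS_PER_DEGREE = 111_320.0  # approximate length of one degree of latitude


def tolerance_to_meters(tolerance_deg: float, latitude_deg: float = 0.0):
    """Return (north_south_m, east_west_m) for a tolerance given in degrees."""
    north_south = tolerance_deg * METERS_PER_DEGREE
    # A degree of longitude shrinks with the cosine of the latitude.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


# Latitude 33.6 matches the coordinate in the error message above.
for tol in (0.0005, 0.0001):
    ns, ew = tolerance_to_meters(tol, latitude_deg=33.6)
    print(f"{tol} deg: {ns:.1f} m north-south, {ew:.1f} m east-west")
```

At this latitude the east-west tolerance is somewhat smaller than the north-south figure, so the quoted ~55 m and ~11 m are upper bounds.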
What the code already does:
The build_ring_zones_postgis() CTE includes a land_filled step that strips interior rings from the land polygon before buffering. This prevents holes created by simplification from reaching ST_Difference, but cannot prevent topology breaks caused by sheer geometry complexity at complex coastlines.
Future fix:
An ST_Subdivide fallback is planned: if direct ST_Difference fails, the buffer will be broken into small tiles, each differenced locally against a clipped land fragment, and the partial rings then unioned back together. This keeps each individual GEOS operation tractable regardless of coastline complexity.
See also:
- src/nautical_graph_toolkit/utils/geometry_utils.py: Buffer.build_ring_zones_postgis()
- docs/project/devlog.md: v0.1.5 Buffer Zone TopologyException entry
Testing Issues¶
Issue: pytest fails without PostGIS configured¶
Symptoms:
ValueError: invalid literal for int() with base 10: 'None'
# or
postgresql+psycopg2://None:None@None:None/None
Cause: Integration tests in tests/core__real_data/ require PostGIS environment variables to be set.
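Both symptoms are what you get when unset environment variables are formatted into a connection string: `os.getenv()` returns `None`, which becomes the text `None`. The sketch below shows a guard that fails early with a clearer message. The variable names (`DB_USER`, `DB_PORT`, and so on) and the helper `build_db_url` are assumptions for illustration and may not match the names the toolkit reads.

```python
import os

REQUIRED = ("DB_USER", "DB_PASSWORD", "DB_HOST", "DB_PORT", "DB_NAME")


def build_db_url(env=os.environ) -> str:
    """Build a SQLAlchemy-style URL, failing early if any variable is unset."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    user, password, host, port, name = (env[key] for key in REQUIRED)
    return f"postgresql+psycopg2://{user}:{password}@{host}:{int(port)}/{name}"


# Without the guard, a missing port would reach int() as the text 'None'.
try:
    int(str(None))
except ValueError as exc:
    print(exc)
```

In a test suite, the same check can drive a skip condition so that PostGIS tests are skipped rather than failing with these errors.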
Solution 1: Run only unit tests (no PostGIS required)
Solution 2: Skip PostGIS integration tests
The integration tests will automatically skip PostGIS tests if database environment variables are not set. You can run:
# Run all tests - PostGIS tests will be automatically skipped
pytest -v
# Run only file-based integration tests (GeoPackage, SpatiaLite)
pytest tests/core__real_data/ -v
Solution 3: Set up PostGIS for full integration test coverage
If you want to run the complete integration test suite including PostGIS:
- Set up PostGIS database (see INSTALL.md Section 4)
- Create a .env file with database credentials:
- Run the full test suite:
What tests run without PostGIS:
- ✅ All unit tests in tests/core/
- ✅ Integration tests using GeoPackage backend
- ✅ Integration tests using SpatiaLite backend
- ❌ PostGIS-specific integration tests (automatically skipped)
Data Source Issues¶
Issue: File not found (GeoPackage/SpatiaLite)¶
Symptoms:
Solutions:
- Verify the file exists:
- Check file path is correct:
- Ensure you've run the S-57 conversion first (see docs/getting-started/setup.md)
Issue: Corrupted or incomplete data file¶
Symptoms:
Solutions:
- Check file integrity:
- Reconvert the S-57 data if corruption is confirmed
S57Updater: File-Based Backend Safety¶
Issue: Database corruption during S57Updater operations¶
Symptoms:
sqlite3.DatabaseError: database disk image is malformed
# or
DatabaseError: database disk image is malformed
# During S57Updater operations on GeoPackage or SpatiaLite
Backend Support Status:
| Backend | Update Status | Recommendation |
|---|---|---|
| PostGIS | ✅ Fully Supported | Production-ready with full ACID transactional guarantees |
| SpatiaLite | ⚠️ Use with Care | Works but can cause corruption with concurrent access |
| GeoPackage | ⚠️ Use with Care | Similar issues to SpatiaLite with concurrent access |
Root Cause:
The S57Updater uses two separate database connection mechanisms that can conflict:
- OGR/GDAL (via ogr2ogr): Reads and writes spatial data
- SQLAlchemy (via sqlite3 driver): Manages metadata and transactions
When both connections access the same file simultaneously without coordination, the file can become corrupted due to:
- Uncommitted write transactions from one connection being visible to the other
- Locking conflicts between OGR and SQLAlchemy
- Transaction isolation violations
Why PostGIS Doesn't Have This Issue:
- PostGIS uses a single client-server connection model
- PostgreSQL handles concurrent access properly with MVCC (Multi-Version Concurrency Control)
- All operations go through the same transactional interface
Solutions:
- Use PostGIS for production updates (Recommended):
- For file-based backends (SpatiaLite/GeoPackage), use an isolated workflow:
# Create a FRESH database from initial ENC data
updater = S57Updater(
    output_format='spatialite',
    dest_conn='path/to/fresh_database.sqlite',
    schema='main'
)
# Run update on clean database (no concurrent access)
updater.force_update_from_location(
    'path/to/initial_enc_data',
    enc_filter=['US3CA52M', 'US1GC09M']
)
- Avoid concurrent access to file-based databases:
  - Close all QGIS, GIS software, or notebook connections before running S57Updater
  - Do not access the database file while update is running
  - Use separate output files for updates, then verify before replacing
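The "separate output file, then verify before replacing" pattern can be sketched with the standard library. This is an illustration, not the toolkit's implementation: `update_on_copy` and its `run_update` callback are hypothetical names, and the callback stands in for whatever performs the actual update on the working copy.

```python
import os
import shutil
import sqlite3


def update_on_copy(live_path: str, run_update) -> None:
    """Update a working copy, verify it, then swap it in for the live file."""
    work_path = live_path + ".work"
    shutil.copy2(live_path, work_path)   # nothing else may have the file open
    run_update(work_path)                # hypothetical callable doing the update
    conn = sqlite3.connect(work_path)
    try:
        result = conn.execute("PRAGMA integrity_check").fetchone()[0]
    finally:
        conn.close()
    if result != "ok":
        os.remove(work_path)
        raise RuntimeError(f"Updated copy failed integrity check: {result}")
    os.replace(work_path, live_path)     # atomic swap on the same filesystem
```

The live file is only replaced after the updated copy passes the integrity check, so a failed update leaves the original untouched.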
Recovery from Corruption:
If you encounter this error:
- Check file integrity:
- If corruption is confirmed, reconvert from source:
- Prevent future corruption:
  - Always use PostGIS for production update workflows
  - For file-based testing, ensure single-access pattern
  - Consider using separate output files for each update cycle
See also:
- docs/notebooks/import_s57.ipynb - S57Updater section for usage examples
- docs/user-guides/workflow-postgis-guide.md - Setting up PostGIS for production use
Graph Creation Issues¶
Issue: Graph is disconnected warning¶
Symptoms:
WARNING - Graph is not connected. Selecting the largest component.
INFO - Selected largest component with 359,814 nodes and 1,430,984 edges.
Is this normal?
✅ Yes, this is expected behavior!
Explanation:
- Indicates some isolated water areas exist in the data (islands, separate water bodies)
- The code automatically selects the largest connected component
- This ensures pathfinding will work correctly
- Small isolated regions are removed to prevent routing errors
No action needed unless you specifically need those isolated regions.
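To illustrate what "selecting the largest component" means, here is a dependency-free sketch using breadth-first search over an adjacency mapping. It is a stand-in for explanation only; the toolkit's own code works on its graph objects and may be implemented differently.

```python
from collections import deque


def largest_component(adjacency: dict) -> set:
    """Return the node set of the largest connected component (undirected graph)."""
    seen = set()
    best = set()
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search collects every node reachable from `start`.
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency.get(node, ()):
                if neighbour not in component:
                    component.add(neighbour)
                    queue.append(neighbour)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


# Nodes 4-5 and 6 form small isolated "water bodies" that get dropped.
graph = {1: [2], 2: [1, 3], 3: [2], 4: [5], 5: [4], 6: []}
print(sorted(largest_component(graph)))
```

Routing between a node in the kept component and one in a dropped component is impossible by definition, which is why removing the small components avoids pathfinding errors.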
Issue: Very few nodes created (graph too small)¶
Symptoms:
Solutions:
- Check boundary covers water areas:
- Increase expansion parameter:
- Verify ENC data covers the area:
Issue: Database-side graph creation failed¶
Symptoms:
Is this normal?
✅ Yes, for GeoPackage and SpatiaLite backends!
Explanation:
- Database-side graph creation is currently only fully implemented for PostGIS
- GeoPackage and SpatiaLite automatically fall back to in-memory creation
- This may be slower but produces identical results
No action needed unless you need maximum performance (in which case, use PostGIS).
Issue: Out of memory during graph creation¶
Symptoms:
PostgreSQL/PostGIS-Specific Memory Error:
Symptoms:
psycopg2.errors.ProgramLimitExceeded: out of memory
DETAIL: Cannot enlarge string buffer containing 1073741681 bytes by 188 more bytes.
Cause:
- PostGIS uses json_agg() to return graph results from the database
- For large regions, the JSON result (nodes + edges) exceeds PostgreSQL's ~1GB limit on a single string buffer, which work_mem does not control
- Even with 32GB system RAM, PostgreSQL's internal buffer limit can be hit
- The error occurs in a single spatial subdivision region during 3×3 grid processing
Example log:
INFO - Subdividing into 3x3 grid (9 regions)
INFO - Processing region 1/9: (-122.8170, 33.3500) to (-121.1613, 34.9632)
ERROR - Error executing PostGIS graph creation: out of memory
DETAIL: Cannot enlarge string buffer containing 1073741681 bytes by 188 more bytes.
INFO - Processing region 2/9: ... (continues successfully)
Solutions (in order of preference):
- Increase node spacing (fewer nodes per region = less memory):
- Use finer subdivision with max_subdivision_factor:
Expected output:
WARNING - max_subdivision_factor=5 > 4 may cause significant memory usage...
INFO - Subdividing into 5x5 grid (25 regions)
- Reduce max_points threshold (triggers finer subdivision):
- Reduce the area of interest:
- Use reduce_distance_nm to simplify geometry:
Note on max_subdivision_factor:
- Default: 4 (4×4 = 16 regions max)
- Range: 2-4 recommended, 5-6 for very large areas with adequate RAM
- Warning: Values > 4 trigger a warning about memory usage
- PostGIS-only: GeoPackage and SpatiaLite don't use this parameter (accepted for API consistency)
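To see why a larger factor lowers the per-region result size, consider how a bounding box splits into N×N equal regions. The sketch below is illustrative only: `subdivide_bbox` is a hypothetical helper and the bounding box is made up; the toolkit's actual subdivision logic may differ.

```python
def subdivide_bbox(minx: float, miny: float, maxx: float, maxy: float, n: int):
    """Split a bounding box into n x n equal regions, row by row."""
    dx = (maxx - minx) / n
    dy = (maxy - miny) / n
    return [
        (minx + i * dx, miny + j * dy, minx + (i + 1) * dx, miny + (j + 1) * dy)
        for j in range(n)
        for i in range(n)
    ]


# Moving from a 3x3 to a 5x5 grid shrinks each region to 9/25 of its former area.
bbox = (-123.0, 33.0, -118.0, 38.0)  # example extent only
for factor in (3, 5):
    regions = subdivide_bbox(*bbox, factor)
    print(factor, len(regions), regions[0])
```

Since each region is processed and returned separately, a smaller region means a smaller JSON result and less data held in memory at once.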
Why increasing work_mem doesn't help:
The error is not about PostgreSQL's work_mem setting alone. The pipeline holds three copies of the data at the same time:
- PostgreSQL's JSON result buffer (~1GB limit)
- Python's parsed JSON/dict structure
- NetworkX graph object
With 1-2 large regions, your 32GB RAM gets exhausted from these duplicate copies. Finer subdivision (smaller regions) is the real solution.
Issue: Fine grid (<0.1 NM) has disconnected components with visible gaps¶
Symptoms:
INFO - Found 462 disconnected components. Starting bridging process...
# or
WARNING - Graph is not connected. Many components found.
# Visual inspection shows regular vertical or horizontal gaps between node clusters
Recent Updates (v0.1.1):
The component bridging algorithm has been significantly improved to better handle subdivision seams:
Fixed Issues:
- Grid size detection: Now correctly detects 4x4 grids for graphs with 250K+ nodes (was 2x2)
- Boundary tolerance: Increased from 2x to 6x spacing to catch nodes near actual subdivision seams
- Connection tracking: Global tracking prevents nodes from exceeding 8 bridge connections
Results for 0.05NM spacing:
| Metric | Before | After | Improvement |
|---|---|---|---|
| Nodes retained | 721,907 (89.7%) | 803,784 (99.92%) | +81,877 nodes |
| Boundary nodes | 3,626 | 6,937 | +91% |
| Bridge edges | 8,091 | 14,664 | +81% |
Explanation:
When creating very fine grids (spacing <0.1 NM), you may encounter artificial gaps between components due to:
- Spatial subdivision boundaries: For performance, PostGIS creates graphs using spatial subdivision (2x2, 4x4, or larger grids depending on node density). At ultra-fine resolutions (0.02 NM), this can create 16+ regions with visible seam lines between them.
- Numerical precision limits: Floating-point arithmetic can create tiny gaps at subdivision boundaries
- Grid generation artifacts: Regular rectangular grids may have alignment issues at region boundaries
These gaps typically appear as distinctive vertical or horizontal lines separating otherwise well-connected regions, aligned with subdivision boundaries.
Solutions:
-
Enable component bridging (Recommended for spacing <0.1 NM):
# In notebook settings fine_grid_spacing_nm = 0.02 # Ultra-fine spacing fine_graph_max_edge_factor = 3.0 # Allow longer edges for bridging fine_graph_bridge_components = True # Enable automatic bridging # The algorithm will: # 1. Detect subdivision grid size (2x2, 4x4, etc.) based on node count # 2. Calculate all subdivision seam lines (not just the center midpoint) # 3. Find boundary nodes near any seam line # 4. Apply full 8-way connectivity at seam boundaries for proper navigation # 5. Use limited bridging elsewhere to maintain graph quality -
How the bridging strategy works:
- Seam detection: Automatically detects NxN subdivision grids based on actual node count:
- 4x4 grid (>250K nodes): 3 vertical + 3 horizontal seam lines
- 3x3 grid (>60K nodes): 2 vertical + 2 horizontal seam lines
- 2x2 grid (>25K nodes): 1 vertical + 1 horizontal seam line
- Note: Thresholds account for ~40-60% land exclusion (expected_points vs actual_nodes)
- Two-tier bridging:
- Full seam bridging: Nodes near subdivision boundaries get up to 8 connections (standard grid connectivity)
- Sparse bridging: Other boundary nodes get limited connections (1-3 edges)
- Distance limit: Bridge edges are limited to a distance of max_edge_factor * spacing
-
Increase max_edge_factor to allow slightly longer edges:
-
Slightly increase spacing if bridging doesn't fully resolve (rarely needed):
-
If bridging still misses seams (advanced):
Expected behavior with bridging enabled (0.02 NM, 4x4 grid):
INFO - Found 462 disconnected components. Starting bridging process...
INFO - Detected 4x4 subdivision grid (16 regions)
INFO - Vertical seam lines: ['-122.6967', '-122.4714', '-122.2461']
INFO - Horizontal seam lines: ['37.3040', '37.6081', '37.9121']
INFO - Identified boundary nodes for 462 components
INFO - Found 125,430 boundary nodes near subdivision lines
INFO - Bridged components 0 and 1 with 127 edges (seam strategy)
INFO - Bridged components 2 and 3 with 3 edges (sparse strategy)
INFO - Component bridging completed in 3.21s
INFO - Added 2,847 bridge edges
INFO - Components reduced from 462 to 1
Performance impact:
- Adds <5% to total graph creation time
- More efficient than increasing spacing significantly
- Preserves fine-resolution detail while maintaining full connectivity
- Scales automatically with subdivision grid size (2x2, 4x4, 8x8, etc.)
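The seam-detection thresholds and the distance limit described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the toolkit's actual implementation: the function names detect_grid_size, seam_lines, and max_bridge_length_m are hypothetical, evenly spaced seams are an assumption, and treating graphs at or below 25K nodes as unsubdivided (1x1) is also an assumption.

```python
NM_TO_M = 1852.0  # one nautical mile in metres (exact by definition)


def detect_grid_size(node_count: int) -> int:
    """Return N for an NxN subdivision grid, using the documented thresholds."""
    if node_count > 250_000:
        return 4
    if node_count > 60_000:
        return 3
    if node_count > 25_000:
        return 2
    return 1  # assumption: small graphs are not subdivided


def seam_lines(lower: float, upper: float, grid_size: int) -> list[float]:
    """Return the N-1 interior seam positions along one axis (assumes even spacing)."""
    step = (upper - lower) / grid_size
    return [lower + i * step for i in range(1, grid_size)]


def max_bridge_length_m(spacing_nm: float, max_edge_factor: float) -> float:
    """Longest allowed bridge edge in metres: max_edge_factor * spacing."""
    return spacing_nm * max_edge_factor * NM_TO_M


if __name__ == "__main__":
    n = detect_grid_size(803_784)
    print(f"{n}x{n} grid, {len(seam_lines(0.0, 1.0, n))} seam lines per axis")
    print(f"Max bridge length: {max_bridge_length_m(0.02, 3.0):.2f} m")
```

With 0.02 NM spacing and a max_edge_factor of 3.0, bridge edges can span up to 0.06 NM, which is roughly 111 m.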
Issue: Graph files accumulate edges when notebooks are re-run¶
Symptoms:
# First notebook run
fine_graph_20.gpkg: 180,521 edges ✅ CORRECT
# Second run with same output filename
fine_graph_20.gpkg: 361,042 edges ❌ DOUBLED
# After directed conversion
fine_graph_directed_20.gpkg: 722,084 edges ❌ 4x CORRUPTED
Explanation:
❌ This was a bug in the save_graph_to_gpkg() method (fixed in v0.1.1)
Why this happened (before fix):
- Nodes layer used default 'w' mode (overwrite) ✅
- Edges layer used mode='a' (append) ❌
- Re-running notebook with same filename:
- Nodes: REPLACED (correct count)
- Edges: ACCUMULATED (doubled, tripled, etc.)
- Result: Mismatched node/edge counts and corrupted graph data
Fix Applied (v0.1.1):
- save_graph_to_gpkg() now deletes the existing GeoPackage file before saving
- Ensures clean overwrite on repeated notebook runs
- Prevents edge accumulation bug
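The delete-before-save idea can be sketched as follows. This is a minimal illustration, not the code in save_graph_to_gpkg(): the helper name remove_existing_gpkg is hypothetical, and cleaning up SQLite sidecar files (-wal, -shm, -journal) is an extra precaution assumed here rather than something the fix is documented to do.

```python
from pathlib import Path


def remove_existing_gpkg(path) -> bool:
    """Delete a GeoPackage and any SQLite sidecar files next to it.

    Returns True if the main .gpkg file existed before the call.
    """
    gpkg = Path(path)
    existed = gpkg.exists()
    # A GeoPackage is an SQLite database, so sidecar files may sit beside it.
    candidates = [gpkg] + [Path(f"{gpkg}{suffix}") for suffix in ("-wal", "-shm", "-journal")]
    for candidate in candidates:
        candidate.unlink(missing_ok=True)  # requires Python 3.8+
    return existed


if __name__ == "__main__":
    # Hypothetical filename; call this before writing the nodes and edges layers.
    removed = remove_existing_gpkg("fine_graph_20.gpkg")
    print("Removed stale file" if removed else "No previous file found")
```

Calling such a helper before every save means both layers start from an empty file, so a layer written in append mode cannot pick up rows from an earlier run.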
If you see this issue:
- Upgrade to v0.1.1+ - Bug is fixed
- Delete corrupted graph files:
- Re-run notebooks - Will create clean files
Verification (v0.1.1+):
import geopandas as gpd
# After running notebook twice with same filename
edges = gpd.read_file('fine_graph_20.gpkg', layer='edges', read_geometry=False)
print(f"Edge count: {len(edges):,}")
# Should show consistent count (e.g., 180,521) on both runs
See also:
- src/nautical_graph_toolkit/core/graph.py:1358 - Fix implementation
- CHANGELOG.md - v0.1.1 release notes
Performance expectations:
- Small graphs (10K-50K nodes): No noticeable impact
- Medium graphs (50K-200K nodes): 10-30% slower loading
- Large graphs (200K-1M nodes): Consider optimization
Performance Issues¶
Issue: Graph creation is very slow¶
Symptoms:
- Takes more than 5-10 minutes for moderate areas
- CPU usage is high for extended periods
Solutions:
-
Use PostGIS backend for large areas:
- Database-side creation is significantly faster
- Better memory management
- Can handle larger graphs
-
Reduce graph density:
-
Reduce boundary expansion:
-
Use reduce_distance_nm to simplify coastal geometry:
-
Monitor resource usage:
Performance Tuning Reference¶
| Parameter | Default | Impact | Recommendation |
|---|---|---|---|
| expansion (nm) | 24 | ↑ = More area, slower | 12-36 for most cases |
| spacing_nm | 0.3 | ↑ = Fewer nodes, faster | 0.3-0.5 for coastal, 0.5-1.0 for open ocean |
| reduce_distance_nm | 0 | ↑ = Simpler geometry, faster | 3-5 for complex coastlines |
Example performance configurations:
# Fast (lower detail)
port_bbox = bbox.create_geo_boundary(..., expansion=12)
grid = pg_bg.create_base_grid(..., reduce_distance_nm=5)
G = pg_bg.create_base_graph(grid["combined_grid"], 0.5)
# Balanced (recommended)
port_bbox = bbox.create_geo_boundary(..., expansion=24)
grid = pg_bg.create_base_grid(..., reduce_distance_nm=3)
G = pg_bg.create_base_graph(grid["combined_grid"], 0.3)
# Detailed (slower, high precision)
port_bbox = bbox.create_geo_boundary(..., expansion=36)
grid = pg_bg.create_base_grid(..., reduce_distance_nm=0)
G = pg_bg.create_base_graph(grid["combined_grid"], 0.2)
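To gauge how spacing_nm drives graph size before committing to a long run, a rough node estimate can help. The sketch below assumes a regular rectangular grid over an area measured in nautical miles and ignores projection effects. The function name estimate_grid_nodes and the land_fraction parameter are hypothetical and not part of the toolkit.

```python
def estimate_grid_nodes(width_nm: float, height_nm: float,
                        spacing_nm: float, land_fraction: float = 0.0) -> int:
    """Rough node count for a regular grid, optionally discounting land area."""
    if spacing_nm <= 0:
        raise ValueError("spacing_nm must be positive")
    if not 0.0 <= land_fraction < 1.0:
        raise ValueError("land_fraction must be in [0, 1)")
    cols = round(width_nm / spacing_nm) + 1
    rows = round(height_nm / spacing_nm) + 1
    return int(cols * rows * (1.0 - land_fraction))


if __name__ == "__main__":
    # Hypothetical 60 x 60 NM area at three different spacings.
    for spacing in (0.5, 0.3, 0.2):
        print(f"spacing {spacing} NM: ~{estimate_grid_nodes(60, 60, spacing):,} nodes")
```

Because the node count grows with the inverse square of the spacing, halving spacing_nm roughly quadruples the number of nodes, and the edge count grows in step with it.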
Visualization Issues¶
Issue: Mapbox maps not displaying¶
Symptoms:
- Blank map
- Gray box where map should appear
- Error: "Mapbox access token required"
Solutions:
-
Verify MAPBOX_TOKEN is set:
-
Get a free Mapbox token:
- Visit: https://account.mapbox.com/access-tokens/
- Create a new token
- Add to .env file
-
Check token is valid:
- Test at: https://api.mapbox.com/styles/v1/mapbox/streets-v11?access_token=YOUR_TOKEN
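A quick sanity check on the token can be run from a notebook cell before rendering any map. This is a minimal sketch: the helper name check_mapbox_token is hypothetical, and it only inspects the value already present in the environment (it does not load the .env file and does not contact Mapbox). It relies on the fact that public Mapbox tokens start with "pk.".

```python
import os


def check_mapbox_token(env=None) -> str:
    """Classify MAPBOX_TOKEN as 'missing', 'unexpected-prefix', or 'ok'."""
    env = os.environ if env is None else env
    token = (env.get("MAPBOX_TOKEN") or "").strip()
    if not token:
        return "missing"
    if not token.startswith("pk."):
        # Public tokens begin with "pk."; anything else is probably the wrong value.
        return "unexpected-prefix"
    return "ok"


if __name__ == "__main__":
    print(f"MAPBOX_TOKEN status: {check_mapbox_token()}")
```

A result of "missing" inside Jupyter usually means the .env file was not loaded in the kernel's process, even if the variable is set in another shell.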
Issue: Plotly maps not rendering in Jupyter¶
Symptoms:
- <Figure size 640x480 with 0 Axes>
- No interactive map appears
Solutions:
-
Set renderer:
-
For JupyterLab, install the extension:
-
Try alternative renderers:
Pathfinding Issues¶
Issue: No path found between ports¶
Symptoms:
Solutions:
-
Verify both ports are within the graph area:
-
Ensure graph is connected:
-
Increase boundary expansion to ensure ports are within navigable area:
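The first check above can be sketched without any toolkit calls. The example assumes graph nodes are (longitude, latitude) tuples, which may not match your graph's node format. The helper names graph_bounds and port_in_bounds, and the coordinates used, are hypothetical.

```python
def graph_bounds(node_coords):
    """Return (min_lon, min_lat, max_lon, max_lat) for an iterable of (lon, lat) nodes."""
    coords = list(node_coords)
    if not coords:
        raise ValueError("graph has no nodes")
    lons, lats = zip(*coords)
    return min(lons), min(lats), max(lons), max(lats)


def port_in_bounds(lon: float, lat: float, bounds) -> bool:
    """True if the port lies inside the graph's bounding box."""
    min_lon, min_lat, max_lon, max_lat = bounds
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat


if __name__ == "__main__":
    # Hypothetical node coordinates; with NetworkX, pass G.nodes if nodes are (lon, lat).
    nodes = [(-122.9, 37.0), (-122.0, 38.2), (-122.4, 37.8)]
    bounds = graph_bounds(nodes)
    for name, (lon, lat) in {"start": (-122.4, 37.8), "end": (-121.5, 37.9)}.items():
        print(f"{name}: {'inside' if port_in_bounds(lon, lat, bounds) else 'OUTSIDE'} graph area")
```

A port inside the bounding box can still sit on land or in a disconnected component, so follow up with a reachability check such as networkx.has_path(G, source, target) on the nearest graph nodes.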
Issue: Route looks unrealistic¶
Symptoms:
- Route goes far from expected path
- Unnecessary detours
- Doesn't follow shipping lanes
Explanation:
- Base routing only considers distance
- Does not account for shipping lanes, traffic, or maritime features
- This is expected behavior for base graphs
Solutions:
- Use directed graph with weights (see advanced notebooks)
- Apply traffic patterns and shipping lane preferences
- See: graph_weighted_directed_postgis_v2.ipynb
Getting Help¶
If you encounter an issue not covered here:
-
Check the documentation:
- docs/getting-started/setup.md - Initial setup and data conversion
- docs/notebooks/ - Example notebooks
- CLAUDE.md - Project overview
-
Review example notebooks:
- Compare your code to working examples
- Check cell outputs for expected results
-
Enable debug logging:
-
Report an issue:
- Include full error traceback
- Specify which notebook/backend you're using
- Provide system information (OS, Python version, package versions)
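For the debug logging step, a standard-library setup along these lines is usually enough. The logger name "nautical_graph_toolkit" is an assumption: if the package is imported as src.nautical_graph_toolkit (as in the diagnostic cell below), its loggers may be named with that prefix instead. The helper name enable_debug_logging is hypothetical.

```python
import logging


def enable_debug_logging(logger_name: str = "nautical_graph_toolkit") -> logging.Logger:
    """Attach a console handler (if none exists) and set one logger to DEBUG."""
    # basicConfig is a no-op when the root logger already has handlers.
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
    )
    logger = logging.getLogger(logger_name)  # assumed logger name
    # DEBUG records from this logger still reach the root handler, because
    # propagation checks handler levels rather than the root logger's level.
    logger.setLevel(logging.DEBUG)
    return logger


if __name__ == "__main__":
    log = enable_debug_logging()
    log.debug("Debug logging enabled")
```

Raising only the toolkit's logger to DEBUG keeps third-party libraries at INFO, which makes the output easier to paste into an issue report.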
Appendix: Quick Reference¶
Checking Your Setup¶
Run this diagnostic cell to verify your environment:
import sys
import os
from pathlib import Path
print("=== Environment Check ===")
print(f"Python: {sys.version}")
print(f"Working Directory: {Path.cwd()}")
print(f"\n=== Environment Variables ===")
for var in ['DB_NAME', 'DB_USER', 'DB_HOST', 'DB_PORT', 'MAPBOX_TOKEN']:
    value = os.getenv(var)
    print(f"{var}: {'✓ Set' if value else '✗ Not set'}")
print(f"\n=== Module Imports ===")
try:
    from src.nautical_graph_toolkit.core.s57_data import ENCDataFactory
    print("nautical_graph_toolkit: ✓")
except ImportError as e:
    print(f"nautical_graph_toolkit: ✗ ({e})")
try:
    import geopandas
    print(f"geopandas: ✓ (v{geopandas.__version__})")
except ImportError:
    print("geopandas: ✗")
try:
    import networkx
    print(f"networkx: ✓ (v{networkx.__version__})")
except ImportError:
    print("networkx: ✗")
print(f"\n=== Data Files ===")
data_file = Path.cwd() / "output" / "enc_west.gpkg"
print(f"GeoPackage: {'✓ Exists' if data_file.exists() else '✗ Not found'}")
Common Parameter Values¶
| Use Case | expansion | spacing_nm | reduce_distance_nm | max_edge_factor | bridge_components | max_subdivision_factor |
|---|---|---|---|---|---|---|
| Quick test | 12 | 0.5 | 5 | 2.0 | False | 4 |
| Coastal route | 24 | 0.3 | 3 | 2.0 | False | 4 |
| Open ocean | 36 | 0.5 | 0 | 2.0 | False | 4 |
| High precision | 24 | 0.2 | 0 | 2.0 | False | 4 |
| Very fine grid | 24 | 0.06 | 0 | 3.0 | True | 4 |
| Ultra-fine grid | 24 | 0.02 | 0 | 3.0 | True | 4 |
| Large area (24 deg²) | 50 | 0.12 | 5 | 2.0 | False | 4 |
| Large area (finer subdiv) | 50 | 0.12 | 5 | 2.0 | False | 5 |
Note on max_subdivision_factor:
- Default: 4 (4×4 = 16 regions) - works for most cases
- Use 5 (5×5 = 25 regions) for very large areas when you see memory errors
- Use 2 (2×2 = 4 regions) for small areas to reduce overhead
- PostGIS-only: Only affects PostGIS backend; GeoPackage/SpatiaLite ignore this parameter
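The effect of max_subdivision_factor on per-region load can be estimated with simple arithmetic. The sketch below assumes nodes are spread evenly across regions, which real coastlines will not satisfy exactly. The function name nodes_per_region and the node total are hypothetical.

```python
import math


def nodes_per_region(total_nodes: int, subdivision_factor: int) -> int:
    """Approximate nodes handled per region for an NxN subdivision (even spread assumed)."""
    if subdivision_factor < 1:
        raise ValueError("subdivision_factor must be at least 1")
    regions = subdivision_factor ** 2
    return math.ceil(total_nodes / regions)


if __name__ == "__main__":
    total = 800_000  # hypothetical graph size
    for factor in (2, 4, 5):
        print(f"factor {factor}: {factor ** 2} regions, ~{nodes_per_region(total, factor):,} nodes each")
```

Moving from a factor of 4 to 5 raises the region count from 16 to 25, cutting the approximate per-region node count by about a third, which is why it can help when memory errors appear.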