Devlog
[0.1.5] - 2026-05-08
Weights System Restructuring & ML Pipeline Foundation
This release restructures the entire weighting architecture from a monolithic WeightsLegacy class into a modular, three-tier system with dual production/ML weight managers, a stateless calculation engine, and cross-backend support (GeoDataFrame, GeoPackage/SpatiaLite, PostGIS).
Release Focus: Extracting weight calculation logic into reusable components (2026-02 to 2026-04), adding ML-optimized weight tracking (WeightsOpen), vectorized spatial processing, and comprehensive test coverage. In total, the release adds 50K+ lines across 51 files.
Added
Core Architecture: Modular Weight System
- weight_calculator.py: Stateless weight calculation algorithms extracted from the legacy monolith
  - WeightCalculator class: single source of truth for all weight logic
    - Three-tier methods: calculate_blocking_factor() (Tier 1), calculate_penalty_factor() (Tier 2), calculate_bonus_factor() (Tier 3)
    - encode_depth_bands(): 5-band UKC penalty system (Grounding → Restricted → Shallow → Safe → Deep)
    - encode_ver_clearance_meters(): Vertical clearance encoding for bridges/cables/pipelines
    - apply_static_weights_vectorized(): Fully vectorized spatial join pipeline (shapely 2.0 + pandas groupby)
    - calculate_directional_factor_from_bands(): Configurable angular difference bands
    - calculate_dynamic_safety_margin(): Environmental condition adjustments (weather, visibility, night)
  - Smooth mode (smooth_mode=True): Continuous exp/log weight functions for GNN/PyTorch pipelines
    - _calculate_penalty_factor_smooth(): 1 + ln(1 + hazard_score * scale), self-limiting logarithmic growth
    - _calculate_bonus_factor_smooth(): 1 + exp(-k * preference_score), exponential decay from open water to preferred
    - SQL expression builders for PostGIS (_build_*_sql_expr) and GeoPackage (_build_*_gpkg_expr)
    - GeoDataFrame vectorized smooth mode (_calculate_smooth_weights_gdf())
- weights.py: Dual weight management system with ABC base class
  - BaseWeights (abstract): Shared infrastructure for S57 classification, config loading, column categorization, buffer zone configuration, vessel parameter management
  - Weights (production): Aggregated three-tier weights
    - apply_static_weights_gdf(): Vectorized static weight computation with GDF backend
    - apply_static_weights_sql(): SpatiaLite SQL-based processing
    - apply_static_weights_postgis(): PostGIS server-side processing
    - calculate_dynamic_weights_gdf(): GeoDataFrame dynamic weight computation
    - calculate_dynamic_weights_sql(): SpatiaLite dynamic weights
    - calculate_dynamic_weights_postgis(): PostGIS dynamic weights
    - Three-tier aggregation: blocking (MAX), penalty (PRODUCT/MAX), bonus (MAX)
  - WeightsOpen (ML-optimized): Per-layer weight tracking
    - Same backend methods as Weights but preserves individual layer contributions
    - Flat columns: wt_{layer_name} (weight value) and wt_{layer_name}_n (feature count) per S-57 layer
    - Designed for GNN/PyTorch feature extraction pipelines
    - Cross-validation against Weights to guarantee routing parity
- weight_optimization.py: ML pipeline utilities
  - GraphWeightOptimizer (stateless): Validate, export, and import ML weight data
    - validate_against_weights(): Verifies WeightsOpen produces identical routing to Weights
    - export_for_pytorch(): Export layer weights as DataFrame, tensors, or dict
    - encode_vessel_params(): Feature vector encoding for vessel parameters
    - load_historical_routes(): Historical route data loading for training
    - import_learned_weights(): Apply learned weights back to graph
  - FineTuning (stateful): Database-side weight refinement operations
    - reapply_directional_weights(): Recalculate directional weights with updated angle bands
    - Bulk update operations via PostgisTableManager
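The smooth-mode formulas and the three-tier aggregation rule listed above can be sketched in a few lines. This is an illustrative sketch only, not the toolkit's code: the function names, the default values of `scale` and `k`, the assumption that `hazard_score` is non-negative, and the neutral value `1.0` for empty tiers are all assumptions made for the example.

```python
import math
from typing import Dict, Sequence


def penalty_factor_smooth(hazard_score: float, scale: float = 1.0) -> float:
    """1 + ln(1 + hazard_score * scale): keeps growing, but ever more slowly."""
    return 1.0 + math.log1p(hazard_score * scale)


def bonus_factor_smooth(preference_score: float, k: float = 1.0) -> float:
    """1 + exp(-k * preference_score): decays from 2.0 toward 1.0."""
    return 1.0 + math.exp(-k * preference_score)


def aggregate_tiers(
    blocking: Sequence[float],
    penalty: Sequence[float],
    bonus: Sequence[float],
    penalty_mode: str = "product",
) -> Dict[str, float]:
    """Collapse per-feature factors into one value per tier.

    blocking -> MAX, penalty -> PRODUCT (or MAX), bonus -> MAX.
    Empty tiers fall back to 1.0 (an assumption for this sketch).
    """
    if penalty_mode == "product":
        penalty_value = math.prod(penalty, start=1.0)
    else:
        penalty_value = max(penalty, default=1.0)
    return {
        "blocking": max(blocking, default=1.0),
        "penalty": penalty_value,
        "bonus": max(bonus, default=1.0),
    }
```

Because the logarithm flattens out, a hundredfold increase in hazard score raises the penalty by only a few units, which is what makes the function self-limiting.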
Graph Conversion Enhancements
- graph.py: Multi-backend directed graph conversion
  - convert_to_directed_gdf(): In-memory GeoDataFrame conversion
  - convert_to_directed_sql(): SpatiaLite SQL-based conversion
  - convert_to_directed_gpkg(): GeoPackage dispatcher
  - convert_to_directed_postgis(): Database-side PostGIS conversion
  - Deterministic ID assignment: forward edges 1→N, reverse edges N+1→2N
- GraphConfigManager: Programmatic graph_config.yml reading/writing with comment preservation
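The deterministic ID scheme (forward edges 1→N, reverse edges N+1→2N) can be illustrated with a minimal sketch. The function name and the tuple layout are hypothetical, and the sketch assumes the undirected edges arrive in a stable order; it does not reproduce any of the convert_to_directed_* backends.

```python
from typing import List, Tuple

Edge = Tuple[str, str]
DirectedEdge = Tuple[int, str, str]  # (edge_id, source, target)


def to_directed(edges: List[Edge]) -> List[DirectedEdge]:
    """Expand N undirected edges into 2N directed edges with stable IDs."""
    n = len(edges)
    # Forward edges keep their 1-based position as ID.
    forward = [(i, u, v) for i, (u, v) in enumerate(edges, start=1)]
    # Reverse edge of forward edge i always gets ID n + i.
    reverse = [(n + i, v, u) for i, (u, v) in enumerate(edges, start=1)]
    return forward + reverse


directed = to_directed([("a", "b"), ("b", "c")])
```

With this layout the reverse of forward edge `i` is always edge `N + i`, so the two directions of an edge can be paired without a lookup table.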
Geometry Utilities
- geometry_utils.py: Extracted Buffer and Bearing utility classes
  - Buffer class:
    - Nautical mile to degree conversion with latitude correction
    - apply_buffer_fine_gdf(): UTM-reprojected, geodesically accurate buffer (no post-filter needed)
    - apply_buffer_fast_gdf(): Per-feature lat-corrected degree buffer with post-filter
    - resolve_method(): Auto-selects 'fine' (Point/Area) vs 'fast' (Line-only) based on geometry types
  - Bearing class:
    - bearing_scalar(): Single bearing calculation (forward azimuth)
    - bearing_gdf(): Vectorized NumPy bearing for GeoDataFrames
    - angular_difference_scalar(): Scalar angular difference with 360° wrap-around
    - angular_difference_gdf(): Vectorized angular difference
    - SQL fragments for SpatiaLite and PostGIS bearing calculations
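For readers unfamiliar with the underlying math, here is a rough sketch of the three scalar operations named above: forward azimuth, angular difference with wrap-around, and latitude-corrected nautical-mile conversion. It assumes a spherical Earth and the convention that 1 NM is about one arc-minute of latitude; the function names are illustrative and are not the Buffer or Bearing methods themselves.

```python
import math
from typing import Tuple


def forward_azimuth(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Initial bearing from point 1 to point 2, in degrees within [0, 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlon)
    return math.degrees(math.atan2(y, x)) % 360.0


def angular_difference(a: float, b: float) -> float:
    """Smallest angle between two bearings, handling the 360° wrap-around."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def nm_to_degrees(nm: float, lat: float) -> Tuple[float, float]:
    """Convert nautical miles to (latitude degrees, longitude degrees) at a latitude."""
    dlat = nm / 60.0  # 1 NM ~ 1 arc-minute of latitude
    dlon = dlat / math.cos(math.radians(lat))  # meridians converge toward the poles
    return dlat, dlon
```

The latitude correction matters for buffers: at 60° N, a distance of 60 NM spans about 1° of latitude but about 2° of longitude.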
Route Export
- route_utils.py: RTZ (Route Exchange Format) export
  - RTZ class: Maritime route export in RTZ 1.2 XML format
  - from_linestring(): Load waypoints from Shapely LineString
  - from_geojson(): Create RTZ from GeoJSON file
  - to_xml() / save(): Generate and write RTZ XML
  - Cross-track distance (XTD), safety contour, depth configuration
  - Geometry type selection (Loxodrome/Orthodrome)
PostGIS Bulk Operations
- postgis_table_manager.py: TEMP table lifecycle manager
  - PostgisTableManager: Optimized bulk weight updates for large graphs
  - create(): TEMP table creation with session tuning
  - upsert_from_select(): Bulk insert with conflict resolution
  - bulk_update_from(): Single UPDATE from temp table
  - ctas_swap(): Create Table As Select for large updates
  - should_use_ctas(): Heuristic decision between UPDATE vs CTAS strategy
  - Reduces dead tuples by ~95%, prevents autovacuum lock contention
PostGIS Subdivision Pipeline
- s57_data.py: Server-side ST_Subdivide pipeline replacing Python-side polygon clipping
  - _ensure_navigable_area_table(): Creates indexed navigable polygon table with GiST index
  - _write_polygon_to_indexed_table(): WKB-based polygon serialization for performance
  - _create_subdivided_table(): Shatters navigable polygon into small indexed pieces via ST_Subdivide with ANALYZE
  - _cleanup_subdivided_table(): Drops subdivided pieces table (called in finally block)
  - _build_grid_query(): Parameterized grid SQL with pluggable ST_Contains expression
  - _create_grid_graph_single(): Three modes: sub_table (ST_Subdivide GiST), polygon (inline WKT fallback), scalar subquery (small grids)
  - _create_grid_graph_subdivided(): ST_Subdivide per-region queries replace polygon.intersection() clipping; node deduplication at boundary overlaps
S-57 Classification Updates
- s57_classification.py: Enhanced classification system
  - Extended feature classification with additional S-57 layer support
  - Updated weight factors and buffer distances
  - Buffer zone classification for coastal proximity ring penalties
- s57object_definitions.csv: New S-57 object definition reference data (232 entries)
S-57 Data Manager Updates
- s57_data.py: Enhanced database manager methods
  - PostGISManager: connector property, lazily instantiates PostGISConnector for advanced diagnostic operations
  - PostGISManager: verify_feature_update_status() verifies that Edition/Update values in feature layers correspond to DSID layer values
  - SpatiaLiteManager / GPKGManager: verify_feature_update_status(), same verification for SQLite-based backends
Database Utilities
- db_utils.py: Enhanced database operations
  - pool_pre_ping=True on SQLAlchemy engine for connection liveness checks
  - PostGISConnector.get_features(): filtered feature query with parameterized SQL, table/column validation
  - FileDBConnector.get_features(): same API for GeoPackage/SpatiaLite with OGR WHERE clause fallback to SQLite
  - Database health monitoring suite: check_active_queries(), check_table_locks(), check_table_bloat(), terminate_backend(), terminate_all_backends() (with dry_run), and check_database_health() (combined diagnostic with optional auto-remediation)
Configuration
- workflow_config.yml: New workflow orchestration configuration
  - Database configuration (PostGIS and GeoPackage backends)
  - Four-step pipeline control (base_graph → fine_graph → weighting → pathfinding)
  - Vessel parameters (draft, height, safety margins, environmental conditions)
  - Output management with auto-generated timestamped directories
  - Performance benchmarking with CSV export
  - A* algorithm selection (multiple maritime-specific variants)
  - Three-tier coastal buffer zone system
  - reduce_distance_nm: 0→3 (standard maritime safety buffer)
  - Fine graph slice buffer enabled with expanded latitude range
- graph_config.yml: Enhanced weight settings
  - Three-tier weight system configuration (blocking, penalty, bonus thresholds)
  - WeightCalculator parameters (17 configurable constants)
  - Directional weight angle bands
  - Buffer zone thresholds (3.0, 4.0, 12.0 NM)
  - Static layer classification with risk multipliers and buffer distances
  - Coastal buffer zone penalties tightened: contiguous zone 1.8→2.0, territorial waters 1.3→1.8
Scripts
- maritime_weights_workflow.py: Standalone weight computation script
  - Weight-only pipeline: enrich → static → directional → dynamic
  - Supports GeoPackage and PostGIS backends
  - Benchmark export and configuration validation
- weight_benchmark.py: Weight computation benchmarking tool
  - Performance comparison across backends and modes
  - Timing metrics and throughput analysis
- ngt.py: Interactive CLI Launcher for Nautical Graph Toolkit
  - Three workflows: S-57 Import, Graph Pipeline, Weights Pipeline
  - Questionary + Rich interactive prompts with dark theme styling
  - Port autocomplete with PortData validation and canonical name lookup
  - Config file discovery (config/*.yml) with auto-select when only one exists
  - Dry-run preview for all workflows
  - Cascading skip/edit phase: each pipeline step can be skipped or customized independently
  - Backend selection (PostGIS, GeoPackage, SpatiaLite) with backend-aware prompts
  - Temp config file management with atexit cleanup
  - H3 navigable layer preview from graph_config.yml
  - Bounding box expansion UI for slice buffer boundaries
  - Vessel parameter form with type selection and numeric fields
  - Command preview panel before execution with confirmation
  - Port-to-port bbox derivation: automatic bounding box from port names via config or port database, replacing manual expansion UI
  - Slice boundary rounding: outward-only rounding with proportional precision for bbox boundaries
- compare_graphs.py: Cross-backend graph comparison utility
- compare_weights.py: Weight parity validation between Weights and WeightsOpen
- graph_alignment_test.py: Graph alignment verification script
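The outward-only slice boundary rounding used by ngt.py can be pictured as follows. This is a hedged sketch: the function name, the bbox tuple order, and the fixed `decimals` argument are assumptions, and the "proportional precision" selection mentioned above is not modeled here.

```python
import math
from typing import Tuple

BBox = Tuple[float, float, float, float]  # (min_lon, min_lat, max_lon, max_lat)


def round_bbox_outward(bbox: BBox, decimals: int = 1) -> BBox:
    """Round a bbox so it can only grow: minima go down, maxima go up."""
    factor = 10 ** decimals
    min_lon, min_lat, max_lon, max_lat = bbox
    return (
        math.floor(min_lon * factor) / factor,
        math.floor(min_lat * factor) / factor,
        math.ceil(max_lon * factor) / factor,
        math.ceil(max_lat * factor) / factor,
    )


rounded = round_bbox_outward((-117.905, 33.614, -117.12, 34.02))
```

Rounding outward guarantees that the rounded box always contains the original one, so no part of the requested slice is dropped by the rounding step.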
Notebooks
- graph_weighted_directed_Postgis_v2.ipynb: Updated PostGIS workflow
  - Per-layer weight tracking demonstration
  - ML feature extraction for GNN pipelines
- inspect_edge.ipynb: Cross-backend edge inspection tool
  - Side-by-side attribute comparison
  - Tolerance checking for numerical differences
- graph_weighted_directed_GeoPackage_v2.ipynb: Updated GeoPackage workflow
  - Mode selection (mem vs sql) for SpatiaLite processing
  - Comprehensive benchmarking
- pathfinding_compare.ipynb: Pathfinding algorithm comparison
- geometry_utils.ipynb: Buffer and Bearing utility demonstrations
Documentation
- docs/user-guides/workflow-weights-guide.md: Dedicated weights workflow guide
- docs/reference/weights_system.md: Weights system technical reference
- docs/user-guides/weights-workflow-example.md: Updated with new architecture
- config/test_config.yml: Test configuration template
- RTZ Schema: src/nautical_graph_toolkit/data/RTZ_Schema_version_1_2.xsd (RTZ 1.2 XSD schema definition)
Buffer Zone OOM Fix & Configuration
- geometry_utils.py: simplify_tolerance parameter (default 0.0005 degrees ≈ 55 m) on Buffer.build_ring_zones_postgis() and Buffer.build_ring_zones_gpkg(). Douglas-Peucker simplification with a double-simplify pattern for PostGIS (ST_SimplifyPreserveTopology pre- and post-union) and Shapely simplify(preserve_topology=True) for GeoPackage; simplify_tolerance <= 0 disables simplification for backward compatibility
- weights.py: simplify_tolerance passthrough on build_buffer_zones_gdf(), build_buffer_zones_sql(), and all 9 caller sites across GPKG/mem, GPKG/sql, SpatiaLite, and PostGIS code paths
- graph_config.yml: simplify_tolerance: 0.0005 in weight_settings.buffer_zones
- workflow_config.yml: simplify_tolerance: 0.0005 in weighting.buffer_zones with override in all three workflow scripts
- 6 new tests: test_simplify_appears_in_sql, test_simplify_disabled_with_zero, test_custom_simplify_tolerance, test_simplify_reduces_vertices, test_simplify_zero_skips, test_simplify_tolerance_passthrough
Testing Infrastructure (11 new test files, ~6,600+ lines)
Unit Tests
- tests/core/test_weights.py (1,717 lines): WeightCalculator and weight manager tests covering depth bands, clearance, bearing, angular difference, directional factors, dynamic margins, tier degradation, smooth mode, vessel parameter validation, environmental conditions
- tests/core/test_buffer_zone_classify.py: Buffer zone classification tests
- tests/core/test_convert_to_directed.py (489 lines): Directed graph conversion tests
- tests/core/test_fillet_smoothing.py: Fillet smoothing tests
- tests/core/test_string_pulling.py: String pulling algorithm tests
- tests/utils/test_bearing.py (228 lines): Bearing calculation tests
- tests/utils/test_buffer_zones.py (140 lines): Buffer zone utility tests
Integration Tests (Real S-57 Data)
- tests/core__real_data/conftest.py (265 lines): Shared fixtures for real-data tests
- tests/core__real_data/test_static_weights_cross_backend.py (789 lines): Cross-backend static weight parity
- tests/core__real_data/test_bearing_cross_backend.py (769 lines): Bearing calculation parity across backends
- tests/core__real_data/test_buffer_geometry_utils.py (474 lines): Buffer geometry operations
- tests/core__real_data/test_buffer_land_geometry_utils.py (890 lines): Land buffer geometry
- tests/core__real_data/test_buffer_methods.py (832 lines): Buffer method comparison (fine vs fast)
- tests/core__real_data/test_convert_to_directed_real.py (398 lines): Directed conversion with real data
- tests/core__real_data/test_enrich_features_cross_backend.py (503 lines): Feature enrichment parity
Changed
- weights.py: build_buffer_zones_postgis() refactored to consolidate 4 separate transactions into a single transaction with temp_buffers/work_mem tuning; ring geometries are materialized into GiST-indexed tables instead of inline CTEs; edge classification uses multi-UPDATE with indexed spatial joins (largest zone first, nearest land wins); ring tables are renamed on save or dropped on cleanup
- weights.py: build_buffer_zones_gdf(), build_buffer_zones_sql(): added simplify_tolerance parameter passthrough to Buffer.build_ring_zones_gpkg()
- geometry_utils.py: Buffer.build_ring_zones_postgis() and build_ring_zones_gpkg(): added simplify_tolerance parameter for land geometry simplification
- weights_legacy.py → Removed; legacy code fully refactored into modular weights.py architecture
- graph.py: Multiple convert_to_directed backends replacing single NetworkX conversion
  - _bridge_disconnected_components(): Uses actual subdivision factor from database instead of node-count heuristic; removed 3-edge cap on non-seam bridges that caused geographic gaps at subdivision boundaries
  - create_base_graph() / create_grid_subgraph() / _create_grid_subgraph_database_side(): Pass through table_prefix/grid_schema for indexed table management
  - GraphUtils._tuple_str(): Deterministic node serialization with plain Python floats
  - load_node_mapping(): Regex fallback parser for np.float64(...) values in node strings
  - overwrite flag on export_postgis_to_gpkg; default auto-increments filename instead of raising FileExistsError
- pathfinding_lite.py: Major expansion with new A* variants, Rustworkx acceleration, and route smoothing (+2,236 lines)
  - Astar (base): min_cost_factor for scaled heuristic admissibility when bonus weights < 1.0; lazy STRtree edge cache via _get_edge_tree()
  - AstarImproved: New subclass with a "pilot quantity" heuristic favoring straighter paths
  - AstarMaritime: Two-Pass Corridor Routing. A* scout (Pass 1) identifies rough course, Dijkstra (Pass 2) finds optimal route within a spatial corridor
    - Internal workflow: latitude-corrected NM-to-degree buffer corridor construction, STRtree-accelerated edge filtering, TSS lane enrichment (RECTRC, FAIRWY, TSSLPT)
    - compute_route_maritime(): Two-pass orchestration with get_maritime_metrics() diagnostics
    - export_debug_gpkg(): Multi-layer GeoPackage export for QGIS inspection (corridor, TSS nodes, obstacle edges, pass paths)
    - Optional Rustworkx backend for Pass 1 with graceful NetworkX fallback
  - AstarMaritimeSmooth: Three-Pass Maritime Routing. Inherits Passes 1–2, adds String-Pulling post-processing (Pass 3)
    - Internal workflow: greedy line-of-sight shortcutting, STRtree-indexed obstacle classification, land subtraction + channel expansion for shortcut containment, multi-backend land geometry loading
    - compute_route_maritime_smooth(): Three-pass orchestration with pass1_backend selector
  - Route (enhanced):
    - apply_fillet_smoothing(): Bearing-merge simplification + zone-based circular arc fillets with safety validation
    - forced_route(): Multi-waypoint routing with scout-path construction, maritime/standard dispatch, and fillet smoothing pipeline
    - base_route(): Dispatcher resolving AstarMaritimeSmooth → AstarMaritime → AstarImproved → Astar method chain
    - save_route_to_file() / save_detailed_route_to_file(): Export to GeoPackage, GeoJSON, or CSV
    - Weight column handling updated for three-tier system (blocking_factor, penalty_factor, bonus_factor, adjusted_weight)
- s57_classification.py: Extended classifications, updated weight factors and buffer distances
- geometry_utils.py: Major expansion with Buffer and Bearing class extraction (+1,305 lines)
  - _normalize_ring_geometry(): Extracts polygonal components from GeometryCollection results
- README.md: Added Interactive Launcher section with ngt.py usage and feature overview
- db_utils.py: Updated for new weight column schema
- import_s57.py: Enhanced with benchmarking and validation
- maritime_graph_postgis_workflow.py: Updated for new weights API; passes table_prefix/grid_schema to graph creation; full slice_bbox parameters; bridge_components enabled for base graph
- maritime_graph_geopackage_workflow.py: Updated for new weights API; bridge_components enabled for base graph
- Documentation updates (8 files): scripts-guide.md, weights-workflow-example.md, workflow-geopackage-guide.md, workflow-postgis-guide.md, workflow-s57-import-guide.md, setup.md, troubleshooting.md, updated for new weights API and script references
- Config/tooling: .gitignore, .env.example, pytest.ini, updated for new output patterns and test configuration
Fixed
- Buffer Zone OOM crash on large PostGIS schemas: build_buffer_zones_postgis() refactored from a single mega-CTE that re-materialized ring geometries per-row (crashing PostgreSQL via OOM killer on 6,933 ENC / 1.67M edge graphs) to a 4-phase approach: pre-materialize rings into GiST-indexed tables, then classify edges via indexed spatial joins
- SET temp_buffers failure in buffer zone classification: temp_buffers now set at transaction start before any temp table operations, wrapped in begin_nested() savepoint for graceful skip
- Connection safety in build_buffer_zones_sql: replaced raw engine.connect() with engine.begin() context managers
- UKC and vertical clearance calculation alignment between PostGIS and GeoPackage backends
- _sjoin_and_aggregate return type unified: always returns (edge_values, edge_sources) tuple
- Edge accumulation on repeated notebook runs (GeoPackage file deletion before save)
- SpatiaLite artifact cleanup after processing
- CLI Back navigation: graph and weights flows properly return to main menu on Back/Cancel
- PostGIS ring zones use ST_CollectionExtract(..., 3) for polygon-only output
- SpatiaLite bearing_sql(): use MOD() instead of Python-style modulo for compatibility
- Subdivision seam gaps: removed 3-edge cap on non-seam bridging in _bridge_disconnected_components() that left geographic discontinuities at subdivision boundaries
- Subdivision factor mismatch: _bridge_disconnected_components() now receives actual subdivision_factor from database instead of estimating from node count
- Edge NaN attributes: NaN values from pandas skipped during graph edge attribute loading (pd.notna check)
- Node string deserialization: load_node_mapping() handles np.float64(...) values in GeoPackage node strings via regex fallback parser
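The node string fix can be illustrated with a small parser that tries a plain literal parse first and falls back to a regex when the string contains np.float64(...) wrappers. The exact node string layout shown here (a two-element tuple of coordinates) is an assumption, and `parse_node` is a hypothetical stand-in for the logic inside load_node_mapping().

```python
import ast
import re
from typing import Tuple

# Matches np.float64(<number>) and captures the number inside.
_NP_FLOAT = re.compile(r"np\.float64\(([^()]+)\)")


def parse_node(text: str) -> Tuple[float, ...]:
    """Parse '(x, y)' node strings, tolerating np.float64(...) wrappers."""
    try:
        values = ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # Fallback: strip the NumPy wrappers, then parse the plain tuple.
        values = ast.literal_eval(_NP_FLOAT.sub(r"\1", text))
    return tuple(float(v) for v in values)


plain = parse_node("(-117.905, 33.614)")
wrapped = parse_node("(np.float64(-117.905), np.float64(33.614))")
```

Both inputs yield the same tuple of plain Python floats, which is also what deterministic serialization on the writing side is meant to guarantee.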
Performance Improvements
- PostGIS buffer zone classification: GiST-indexed spatial joins replace per-row CTE re-materialization, reducing 1.67M × 3 correlated subqueries to 3 indexed spatial joins; memory drops from all ring geometries in RAM simultaneously to one ring at a time indexed on disk
- Land geometry simplification (simplify_tolerance=0.0005) reduces vertex count before buffer operations, lowering memory for ST_Union and ST_Buffer in PostGIS and Shapely buffer in GeoPackage
- PostGIS grid creation: ST_Subdivide with GiST-indexed point-in-polygon replaces Python-side polygon.intersection() clipping, eliminating O(n) polygon serialization per region
- Vectorized static weight computation using shapely 2.0 + pandas groupby (replaces row-by-row processing)
- PostGIS server-side weight computation via TEMP tables (reduces dead tuples ~95%)
- Chunked processing support for memory management on large graphs
- Spatial index acceleration via GeoPandas sjoin (auto-builds STRtree)
- Rustworkx backend for A* Pass 1 search (Rust-native A* replacing NetworkX Python A*)
- STRtree spatial index for corridor construction, obstacle detection, and line-of-sight checks
- Lazy edge tree caching shared between pathfinder and Route classes
Statistics
- 51 files changed: 50,417 insertions, 1,061 deletions
- New files: 28 (core: 3, utils: 2, scripts: 4, tests: 11, notebooks: 4, docs/config: 4)
- Modified files: 23
- Test coverage: 11 new test files (~6,600+ lines of tests)
Performance Benchmarking & Documentation (2026-05-18)
Updated benchmarking infrastructure and technical documentation to reflect the full 4-stage pipeline introduced in v0.1.5.
Added
- docs/notebooks/performance_metrics.ipynb: Interactive Plotly notebook with 5 figures (Base Graph Creation, Fine/H3 Graph Creation, Weighting & Enrichment, Pathfinding & Export, Total Pipeline Time), all by backend and graph mode from CSV data.
- scripts/benchmarks/MaritimeWorkflowPerformanceMetrics.csv: 45 workflow runs (33 PostGIS, 12 GeoPackage, 6 partial) with columns: Source, Workflow ID, Backend, Graph Mode, Precision, NGT Ver, Base Graph Creation (s), Fine/H3 Graph Creation (s), Weighting & Enrichment (s), Pathfinding & Export (s), Total Time (s), Final Nodes Count, Final Undirected Edges Count, Platform. Edge range: 0.2M–13.5M.
- scripts/benchmarks/graph_size.csv: 135 PostGIS graph table sizes with total_size, table_size, indexes_size columns. Covers weighted (_wt_ prefix) and unweighted fine/H3 graphs across all routes.
Changed
- docs/reference/technical-specs.md: Restructured from spacing-based base graph tables to full 4-stage pipeline:
  - New pipeline overview table (12 representative runs across backends and graph modes)
  - Weighting & Enrichment backend comparison (GeoPackage 3.6× slower than PostGIS at ~1.2M edges)
  - AstarMaritimeSmooth 3-pass pathfinding documentation (A* scout → Dijkstra corridor → string-pulling)
  - Weighted graph storage section with PostGIS data (weighted graphs 10–14× unweighted, up to 40 GB for large H3 graphs)
  - Updated storage planning guidelines for weighted pipeline reality
  - Retained fine graph creation performance (Stage 2 isolated); code unchanged from earlier versions
  - Dropped graph name references; uses node/edge counts + graph mode (FINE/H3) throughout
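The string-pulling pass in the 3-pass pipeline is, at its core, greedy line-of-sight shortcutting. The following sketch shows that idea in isolation. The `has_line_of_sight` callback is a hypothetical stand-in for the STRtree-backed obstacle check, and the sketch assumes consecutive path nodes are always mutually reachable; it is not the AstarMaritimeSmooth implementation.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]


def string_pull(
    path: Sequence[Point],
    has_line_of_sight: Callable[[Point, Point], bool],
) -> List[Point]:
    """Greedily replace runs of waypoints with the longest visible shortcut."""
    if len(path) < 3:
        return list(path)
    result = [path[0]]
    i = 0
    while i < len(path) - 1:
        # Start from the far end and walk back until a visible node is found.
        j = len(path) - 1
        while j > i + 1 and not has_line_of_sight(path[i], path[j]):
            j -= 1
        result.append(path[j])
        i = j
    return result


def axis_aligned(a: Point, b: Point) -> bool:
    """Toy visibility rule: only horizontal or vertical shortcuts are clear."""
    return a[0] == b[0] or a[1] == b[1]


staircase = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
pulled = string_pull(staircase, axis_aligned)
```

In the toy example the five-node path collapses to its three corner points, because each intermediate node lies on a segment that is already visible end to end.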
Key Findings
- Weighting & Enrichment dominates: 33–86% of total pipeline time (median ~59%), scaling linearly with directed edge count
- GeoPackage weighting penalty: 2,806s vs 782s at ~1.2M edges (3.6×) — row-by-row SpatiaLite vs bulk PostGIS with GiST-indexed spatial joins
- Weighted storage explosion: 10–14× unweighted size in PostGIS due to 15–25 additional columns per directed edge; edge tables 95%+ of total
- Pathfinding scales sub-linearly: A* corridor search with string-pulling remains efficient even at 6.5M+ directed edges
Buffer Zone TopologyException Fix (2026-05-20)
Problem
ST_Difference in build_ring_zones_postgis() threw TopologyException: unable to assign free hole to a shell at coordinate -117.905, 33.614 (Newport Beach, CA). The error occurred during buffer zone ring materialization (ring_3_0) when computing the difference between a 3 NM geography buffer and simplified land geometry.
Root Cause
ST_SimplifyPreserveTopology with tolerance 0.0005° (~55m) collapses narrow coastal waterways (rivers, harbor channels) narrower than the tolerance. This creates isolated interior rings in the land polygon. When GEOS computes ST_Difference(buffer, land), the noding of two very complex geometries produces the "free hole" topology break internally. Neither ST_MakeValid nor ST_Buffer(geom, 0) on the inputs fixes this because the error occurs inside the boolean operation, not in the input geometries.
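To relate the tolerance values in degrees to channel widths in meters, a back-of-the-envelope conversion helps. The sketch below uses roughly 111.32 km per degree, which is an approximation that holds along a meridian; longitude degrees are shorter away from the equator. The width comparison is only a rule of thumb for when a feature is at risk, not a model of how GEOS simplifies geometry, and both helper names are hypothetical.

```python
METERS_PER_DEGREE = 111_320.0  # approximate length of one degree of latitude


def tolerance_to_meters(tolerance_deg: float) -> float:
    """Approximate ground distance covered by a tolerance given in degrees."""
    return tolerance_deg * METERS_PER_DEGREE


def at_risk_of_collapse(channel_width_m: float, tolerance_deg: float) -> bool:
    """Rule of thumb: features narrower than the tolerance may be simplified away."""
    return channel_width_m < tolerance_to_meters(tolerance_deg)


old_default_m = tolerance_to_meters(0.0005)  # about 55.7 m
new_default_m = tolerance_to_meters(0.0001)  # about 11.1 m
```

Under this rule of thumb, a 30 m harbor channel is at risk with a tolerance of 0.0005 but not with 0.0001.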
Investigation
Attempted fixes that did not resolve the issue:
1. ST_MakeValid wrapping on both ST_Difference operands: error persists inside GEOS
2. ST_Buffer(ST_MakeValid(...), 0): same result, topology rebuild doesn't help
3. land_filled CTE using ST_BuildArea(ST_Collect(ST_ExteriorRing(...))) to strip all interior rings: correctly removes holes from land, but the ST_Difference of complex buffer vs complex land still fails at Newport
Changes
- geometry_utils.py: Added land_filled CTE to build_ring_zones_postgis() that strips all interior rings from land before buffering and differencing. Uses LATERAL ST_Dump + ST_ExteriorRing + ST_BuildArea pattern. Prevents simplification-created holes from reaching the ring difference step.
- geometry_utils.py: Changed simplify_tolerance default from 0.0005 to 0.0001 (~11m). Lower tolerance preserves narrow channels, preventing the free-hole condition at complex coastlines.
Workaround
For datasets with complex coastlines, set simplify_tolerance: 0.0001 in workflow_config.yml or graph_config.yml. Trade-off: larger land geometry → slightly longer buffer zone processing.
Open TODO
- ST_Subdivide fallback: if direct ST_Difference fails, break the buffer into small tiles, difference each tile locally against a clipped land fragment, union the partial rings. This would make the operation robust for any coastline complexity regardless of simplification tolerance.