A production-ready toolkit for calculating boulder dimensions (length, width, height) from Multibeam Echosounder (MBES) data. Designed for geologists, marine researchers, and environmental scientists.
- Key Features
- Workflow Diagram
- Installation
- Usage
- Data Specifications
- Advanced Features
- Testing & Validation
- Contributing
- License
Feature | Description | Technology Used |
---|---|---|
Automated Dimension Extraction | Calculates length/width via polygon orientation and height via bathymetric raster analysis. | GDAL, Shapely, PCA |
Parallel Processing | 4x faster processing of large datasets using thread pools. | concurrent.futures, Dask |
QGIS Integration | Runs as a standalone script or as a QGIS plugin with a GUI. | PyQGIS, Qt Designer |
Error Resilience | Automatically skips invalid geometries and logs detailed errors. | logging, Sentry (optional) |
3D Visualization | Optional output for visualizing boulders in 3D space. | Matplotlib, PyVista |
MBES Data → Polygon Input → Centroid Calculation → PCA Orientation → Raster Sampling → Dimension Export
│ │
└── Error Handling ←───────┘
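The centroid and PCA orientation steps in the workflow can be illustrated with a minimal sketch. It is not the toolkit's actual implementation: the function name `pca_dimensions` is hypothetical, the centroid is approximated by the mean of the vertices, and coordinates are assumed to be in a projected CRS with meter units (EPSG:4326 coordinates would need reprojecting first). It uses only the standard library and the closed-form principal axis of a 2x2 covariance matrix.

```python
import math

def pca_dimensions(vertices):
    """Return (length, width, angle_deg) of a polygon from its vertices.

    `vertices` is a list of (x, y) tuples in meters, without a repeated
    closing vertex. Hypothetical helper, for illustration only.
    """
    n = len(vertices)
    # Vertex mean, used here as a simple stand-in for the centroid
    cx = sum(x for x, _ in vertices) / n
    cy = sum(y for _, y in vertices) / n

    # Entries of the 2x2 covariance matrix of the centered vertices
    sxx = sum((x - cx) ** 2 for x, _ in vertices) / n
    syy = sum((y - cy) ** 2 for _, y in vertices) / n
    sxy = sum((x - cx) * (y - cy) for x, y in vertices) / n

    # Direction of the first principal axis (closed form for 2x2)
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    ux, uy = math.cos(theta), math.sin(theta)

    # Project vertices onto the principal axis and its perpendicular
    along = [(x - cx) * ux + (y - cy) * uy for x, y in vertices]
    across = [-(x - cx) * uy + (y - cy) * ux for x, y in vertices]
    extent_a = max(along) - min(along)
    extent_b = max(across) - min(across)
    return max(extent_a, extent_b), min(extent_a, extent_b), math.degrees(theta)
```

For a 4 m by 2 m rectangle, the function returns a length of 4 and a width of 2 regardless of how the rectangle is rotated, which is the point of orienting along the principal axes rather than measuring an axis-aligned bounding box.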
# 1. Build the QGIS-enabled container
docker build -t boulder-calculator .
# 2. Run processing (mount data to /data)
docker run -v /path/to/your/data:/data boulder-calculator \
--input /data/boulders.shp \
--raster /data/bathymetry.tif \
--output /data/results.shp
# Create virtual environment
python -m venv .venv
source .venv/bin/activate # Linux/macOS
.venv\Scripts\activate # Windows
# Install with PyQGIS support
pip install "qgis>=3.28" geopandas rasterio sentry-sdk
python boulder_calculator.py \
--input "path/to/boulders.shp" \
--raster "path/to/bathymetry.tif" \
--output "results.shp" \
--workers 8 # Use 8 worker threads
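As a rough illustration of how a `--workers` setting and the skip-and-log behaviour might fit together, the sketch below runs a per-boulder function in a thread pool and skips any boulder whose measurement raises. The names `process_all` and `measure`, and the dictionary layout of each boulder, are assumptions made for this example and are not taken from the toolkit's source.

```python
import logging
from concurrent.futures import ThreadPoolExecutor, as_completed

logger = logging.getLogger("boulder_calculator")

def process_all(boulders, measure, workers=8):
    """Run `measure(boulder)` for each boulder in a thread pool.

    Boulders whose measurement raises an exception are logged and
    skipped. Returns (results_by_id, skipped_ids).
    """
    results, skipped = {}, []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Map each submitted future back to its boulder_id
        futures = {pool.submit(measure, b): b["boulder_id"] for b in boulders}
        for future in as_completed(futures):
            boulder_id = futures[future]
            try:
                results[boulder_id] = future.result()
            except Exception as exc:
                logger.warning("Skipping boulder %s: %s", boulder_id, exc)
                skipped.append(boulder_id)
    return results, skipped
```

Threads tend to pay off when most of the time goes to raster I/O or to native library calls that release the GIL. For CPU-bound pure-Python work, a process pool would likely be the better fit.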
- Copy the boulder_plugin folder to ~/.local/share/QGIS/QGIS3/profiles/default/python/plugins/
- Enable via Plugins → Manage and Install Plugins
- Access the tool from the QGIS toolbar
File Type | Fields | CRS | Example |
---|---|---|---|
Polygons (SHP) | boulder_id, geometry | EPSG:4326 | Sample Data |
Raster (GeoTIFF) | Bathymetric depth values | Must match vector CRS | Sample Raster |
Field | Type | Description |
---|---|---|
centroid | Point | Boulder center (WGS84) |
length_m | Float | Longest axis (meters) |
width_m | Float | Shortest axis (meters) |
height_m | Float | Elevation difference (meters) |
confidence | Float | Data quality score (0-1) |
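The `height_m` field is described only as an elevation difference, so the sketch below shows one plausible reading: the highest raster cell inside the polygon minus the median of the surrounding seabed cells. The function name `boulder_height`, the mask-based input, and the choice of the median as the seabed reference are all assumptions for illustration. The toolkit may define the difference another way.

```python
from statistics import median

def boulder_height(elevation, inside):
    """Estimate height as peak elevation minus surrounding seabed level.

    `elevation` is a 2D list of raster values in meters (negative below
    sea level, None for nodata). `inside` is a same-shaped 2D list of
    booleans marking cells that fall within the boulder polygon.
    """
    top = []     # cells covered by the boulder polygon
    seabed = []  # cells in the window but outside the polygon
    for elev_row, mask_row in zip(elevation, inside):
        for value, is_inside in zip(elev_row, mask_row):
            if value is None:  # skip nodata cells
                continue
            (top if is_inside else seabed).append(value)
    if not top or not seabed:
        raise ValueError("need valid cells both inside and outside the polygon")
    return max(top) - median(seabed)
```

With a flat seabed at -20 m and a single boulder cell at -18.5 m, this gives a height of 1.5 m. Using the median for the reference level keeps a few neighbouring boulders or noisy cells from shifting the seabed estimate.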
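import geopandas as gpd
from sklearn.ensemble import IsolationForest

# Load the exported results (path taken from the usage example)
boulders = gpd.read_file("results.shp")
# Flag outlier boulders during post-processing; fit_predict returns -1 for outliers and 1 for inliers
model = IsolationForest(contamination=0.05, random_state=0)
boulders["is_outlier"] = model.fit_predict(boulders[["length_m", "width_m"]]) == -1
clean_boulders = boulders[~boulders["is_outlier"]]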
# Process directly from AWS S3
python boulder_calculator.py \
--input "s3://marine-data/boulders.shp" \
--raster "s3://marine-data/bathymetry.tif" \
--output "s3://results-bucket/output.shp"
# Run unit/integration tests and generate an HTML coverage report
pytest tests/ --cov=src --cov-report=html
# Open the coverage report (macOS; use xdg-open on Linux)
open htmlcov/index.html
Validation Checks:
- Polygon geometry validity (non-intersecting, closed rings)
- Raster resolution ≥ 1m/pixel
- CRS consistency between vector/raster
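Two of the checks above, ring validity and CRS consistency, can be sketched in plain Python. This is an illustrative stand-in and not the toolkit's validator. The function names are hypothetical, the self-intersection test only detects proper crossings between non-adjacent edges, and CRS values are compared as plain strings such as "EPSG:4326". In practice a library such as Shapely (`is_valid`) would normally handle the geometry side.

```python
def _orient(a, b, c):
    """Signed area test: > 0 if a, b, c turn counter-clockwise."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def _segments_cross(p1, p2, p3, p4):
    """True if segment p1-p2 properly crosses segment p3-p4."""
    d1, d2 = _orient(p3, p4, p1), _orient(p3, p4, p2)
    d3, d4 = _orient(p1, p2, p3), _orient(p1, p2, p4)
    return d1 * d2 < 0 and d3 * d4 < 0

def ring_is_closed(ring):
    """A closed ring repeats its first vertex and has at least 4 points."""
    return len(ring) >= 4 and ring[0] == ring[-1]

def ring_self_intersects(ring):
    """Check every pair of non-adjacent edges for a proper crossing."""
    edges = list(zip(ring[:-1], ring[1:]))
    n = len(edges)
    for i in range(n):
        for j in range(i + 2, n):
            if i == 0 and j == n - 1:
                continue  # first and last edges share the closing vertex
            if _segments_cross(*edges[i], *edges[j]):
                return True
    return False

def validate_inputs(ring, vector_crs, raster_crs):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    if not ring_is_closed(ring):
        problems.append("ring is not closed")
    elif ring_self_intersects(ring):
        problems.append("ring self-intersects")
    if vector_crs != raster_crs:
        problems.append("vector and raster CRS differ")
    return problems
```

Returning a list of problems instead of raising on the first failure lets the caller log every issue for a boulder before skipping it.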
- Fork the repository
- Create feature branch: git checkout -b feat/new-algorithm
- Submit PR with:
  - Tests in tests/
  - Updated documentation
  - Type hints for new functions
MIT License - see LICENSE for details.
---