Skill

geomaster

Comprehensive geospatial science skill covering remote sensing, GIS, spatial analysis, machine learning for earth observation, and 30+ scientific domains. Supports satellite imagery processing (Sentinel, Landsat, MODIS, SAR, hyperspectral), vector and raster data operations, spatial statistics, point cloud processing, network analysis, cloud-native workflows (STAC, COG, Planetary Computer), and 8 programming languages (Python, R, Julia, JavaScript, C++, Java, Go, Rust) with 500+ code examples. Use for remote sensing workflows, GIS analysis, spatial ML, Earth observation data processing, terrain analysis, hydrological modeling, marine spatial analysis, atmospheric science, and any geospatial computation task.

Freerisk: low
geomasterrustpythonjavagojavascript

Tools: rsgislib,scikit-learn,osmnx,cartopy,xarray,pystac-client,laspy,rasterio,numpy,geopandas

The full skill

— name: geomaster description: Comprehensive geospatial science skill covering remote sensing, GIS, spatial analysis, machine learning for earth observation, and 30+ scientific domains. Supports satellite imagery processing (Sentinel, Landsat, MODIS, SAR, hyperspectral), vector and raster data operations, spatial statistics, point cloud processing, network analysis, cloud-native workflows (STAC, COG, Planetary Computer), and 8 programming languages (Python, R, Julia, JavaScript, C++, Java, Go, Rust) with 500+ code examples. Use for remote sensing workflows, GIS analysis, spatial ML, Earth observation data processing, terrain analysis, hydrological modeling, marine spatial analysis, atmospheric science, and any geospatial computation task. license: MIT License metadata: skill-author: K-Dense Inc. — # GeoMaster Comprehensive geospatial science skill covering GIS, remote sensing, spatial analysis, and ML for Earth observation across 70+ topics with 500+ code examples in 8 programming languages. ## Installation “`bash # Core Python stack (conda recommended) conda install -c conda-forge gdal rasterio fiona shapely pyproj geopandas # Remote sensing & ML uv pip install rsgislib torchgeo earthengine-api uv pip install scikit-learn xgboost torch-geometric # Network & visualization uv pip install osmnx networkx folium keplergl uv pip install cartopy contextily mapclassify # Big data & cloud uv pip install xarray rioxarray dask-geopandas uv pip install pystac-client planetary-computer # Point clouds uv pip install laspy pylas open3d pdal # Databases conda install -c conda-forge postgis spatialite “` ## Quick Start ### NDVI from Sentinel-2 “`python import rasterio import numpy as np with rasterio.open('sentinel2.tif') as src: red = src.read(4).astype(float) # B04 nir = src.read(8).astype(float) # B08 ndvi = (nir – red) / (nir + red + 1e-8) ndvi = np.nan_to_num(ndvi, nan=0) profile = src.profile profile.update(count=1, dtype=rasterio.float32) with rasterio.open('ndvi.tif', 'w', **profile) as dst: dst.write(ndvi.astype(rasterio.float32), 1) “` ### Spatial Analysis with GeoPandas “`python import geopandas as gpd # Load and ensure same CRS zones = gpd.read_file('zones.geojson') points = gpd.read_file('points.geojson') if zones.crs != points.crs: points = points.to_crs(zones.crs) # Spatial join and statistics joined = gpd.sjoin(points, zones, how='inner', predicate='within') stats = joined.groupby('zone_id').agg({ 'value': ['count', 'mean', 'std', 'min', 'max'] }).round(2) “` ### Google Earth Engine Time Series “`python import ee import pandas as pd ee.Initialize(project='your-project') roi = ee.Geometry.Point([-122.4, 37.7]).buffer(10000) s2 = (ee.ImageCollection('COPERNICUS/S2_SR_HARMONIZED') .filterBounds(roi) .filterDate('2020-01-01', '2023-12-31') .filter(ee.Filter.lt('CLOUDY_PIXEL_PERCENTAGE', 20))) def add_ndvi(img): return img.addBands(img.normalizedDifference(['B8', 'B4']).rename('NDVI')) s2_ndvi = s2.map(add_ndvi) def extract_series(image): stats = image.reduceRegion(ee.Reducer.mean(), roi.centroid(), scale=10, maxPixels=1e9) return ee.Feature(None, {'date': image.date().format('YYYY-MM-dd'), 'ndvi': stats.get('NDVI')}) series = s2_ndvi.map(extract_series).getInfo() df = pd.DataFrame([f['properties'] for f in series['features']]) df['date'] = pd.to_datetime(df['date']) “` ## Core Concepts ### Data Types | Type | Examples | Libraries | |——|———-|———–| | Vector | Shapefile, GeoJSON, GeoPackage | GeoPandas, Fiona, GDAL | | Raster | GeoTIFF, NetCDF, COG | Rasterio, Xarray, GDAL | | Point Cloud | LAS, LAZ | Laspy, PDAL, Open3D | ### Coordinate Systems – **EPSG:4326** (WGS 84) – Geographic, lat/lon, use for storage – **EPSG:3857** (Web Mercator) – Web maps only (don't use for area/distance!) – **EPSG:326xx/327xx** (UTM) – Metric calculations, <1% distortion per zone – Use `gdf.estimate_utm_crs()` for automatic UTM detection “`python # Always check CRS before operations assert gdf1.crs == gdf2.crs, "CRS mismatch!" # For area/distance calculations, use projected CRS gdf_metric = gdf.to_crs(gdf.estimate_utm_crs()) area_sqm = gdf_metric.geometry.area “` ### OGC Standards – **WMS**: Web Map Service – raster maps – **WFS**: Web Feature Service – vector data – **WCS**: Web Coverage Service – raster coverage – **STAC**: Spatiotemporal Asset Catalog – modern metadata ## Common Operations ### Spectral Indices “`python def calculate_indices(image_path): """NDVI, EVI, SAVI, NDWI from Sentinel-2.""" with rasterio.open(image_path) as src: B02, B03, B04, B08, B11 = [src.read(i).astype(float) for i in [1,2,3,4,5]] ndvi = (B08 – B04) / (B08 + B04 + 1e-8) evi = 2.5 * (B08 – B04) / (B08 + 6*B04 – 7.5*B02 + 1) savi = ((B08 – B04) / (B08 + B04 + 0.5)) * 1.5 ndwi = (B03 – B08) / (B03 + B08 + 1e-8) return {'NDVI': ndvi, 'EVI': evi, 'SAVI': savi, 'NDWI': ndwi} “` ### Vector Operations “`python # Buffer (use projected CRS!) gdf_proj = gdf.to_crs(gdf.estimate_utm_crs()) gdf['buffer_1km'] = gdf_proj.geometry.buffer(1000) # Spatial relationships intersects = gdf[gdf.geometry.intersects(other_geometry)] contains = gdf[gdf.geometry.contains(point_geometry)] # Geometric operations gdf['centroid'] = gdf.geometry.centroid gdf['simplified'] = gdf.geometry.simplify(tolerance=0.001) # Overlay operations intersection = gpd.overlay(gdf1, gdf2, how='intersection') union = gpd.overlay(gdf1, gdf2, how='union') “` ### Terrain Analysis “`python def terrain_metrics(dem_path): """Calculate slope, aspect, hillshade from DEM.""" with rasterio.open(dem_path) as src: dem = src.read(1) dy, dx = np.gradient(dem) slope = np.arctan(np.sqrt(dx**2 + dy**2)) * 180 / np.pi aspect = (90 – np.arctan2(-dy, dx) * 180 / np.pi) % 360 # Hillshade az_rad, alt_rad = np.radians(315), np.radians(45) hillshade = (np.sin(alt_rad) * np.sin(np.radians(slope)) + np.cos(alt_rad) * np.cos(np.radians(slope)) * np.cos(np.radians(aspect) – az_rad)) return slope, aspect, hillshade “` ### Network Analysis “`python import osmnx as ox import networkx as nx # Download and analyze street network G = ox.graph_from_place('San Francisco, CA', network_type='drive') G = ox.add_edge_speeds(G).add_edge_travel_times(G) # Shortest path orig = ox.distance.nearest_nodes(G, -122.4, 37.7) dest = ox.distance.nearest_nodes(G, -122.3, 37.8) route = nx.shortest_path(G, orig, dest, weight='travel_time') “` ## Image Classification “`python from sklearn.ensemble import RandomForestClassifier import rasterio from rasterio.features import rasterize def classify_imagery(raster_path, training_gdf, output_path): """Train RF and classify imagery.""" with rasterio.open(raster_path) as src: image = src.read() profile = src.profile transform = src.transform # Extract training data X_train, y_train = [], [] for _, row in training_gdf.iterrows(): mask = rasterize([(row.geometry, 1)], out_shape=(profile['height'], profile['width']), transform=transform, fill=0, dtype=np.uint8) pixels = image[:, mask > 0].T X_train.extend(pixels) y_train.extend([row['class_id']] * len(pixels)) # Train and predict rf = RandomForestClassifier(n_estimators=100, max_depth=20, n_jobs=-1) rf.fit(X_train, y_train) prediction = rf.predict(image.reshape(image.shape[0], -1).T) prediction = prediction.reshape(profile['height'], profile['width']) profile.update(dtype=rasterio.uint8, count=1) with rasterio.open(output_path, 'w', **profile) as dst: dst.write(prediction.astype(rasterio.uint8), 1) return rf “` ## Modern Cloud-Native Workflows ### STAC + Planetary Computer “`python import pystac_client import planetary_computer import odc.stac # Search Sentinel-2 via STAC catalog = pystac_client.Client.open( "https://planetarycomputer.microsoft.com/api/stac/v1", modifier=planetary_computer.sign_inplace, ) search = catalog.search( collections=["sentinel-2-l2a"], bbox=[-122.5, 37.7, -122.3, 37.9], datetime="2023-01-01/2023-12-31", query={"eo:cloud_cover": {"lt": 20}}, ) # Load as xarray (cloud-native!) data = odc.stac.load( list(search.get_items())[:5], bands=["B02", "B03", "B04", "B08"], crs="EPSG:32610", resolution=10, ) # Calculate NDVI on xarray ndvi = (data.B08 – data.B04) / (data.B08 + data.B04) “` ### Cloud-Optimized GeoTIFF (COG) “`python import rasterio from rasterio.session import AWSSession # Read COG directly from cloud (partial reads) session = AWSSession(aws_access_key_id=…, aws_secret_access_key=…) with rasterio.open('s3://bucket/path.tif', session=session) as src: # Read only window of interest window = ((1000, 2000), (1000, 2000)) subset = src.read(1, window=window) # Write COG with rasterio.open('output.tif', 'w', **profile, tiled=True, blockxsize=256, blockysize=256, compress='DEFLATE', predictor=2) as dst: dst.write(data) # Validate COG from rio_cogeo.cogeo import cog_validate cog_validate('output.tif') “` ## Performance Tips “`python # 1. Spatial indexing (10-100x faster queries) gdf.sindex # Auto-created by GeoPandas # 2. Chunk large rasters with rasterio.open('large.tif') as src: for i, window in src.block_windows(1): block = src.read(1, window=window) # 3. Dask for big data import dask.array as da dask_array = da.from_rasterio('large.tif', chunks=(1, 1024, 1024)) # 4. Use Arrow for I/O gdf.to_file('output.gpkg', use_arrow=True) # 5. GDAL caching from osgeo import gdal gdal.SetCacheMax(2**30) # 1GB cache # 6. Parallel processing rf = RandomForestClassifier(n_jobs=-1) # All cores “` ## Best Practices 1. **Always check CRS** before spatial operations 2. **Use projected CRS** for area/distance calculations 3. **Validate geometries**: `gdf = gdf[gdf.is_valid]` 4. **Handle missing data**: `gdf['geometry'] = gdf['geometry'].fillna(None)` 5. **Use efficient formats**: GeoPackage > Shapefile, Parquet for large data 6. **Apply cloud masking** to optical imagery 7. **Preserve lineage** for reproducible research 8. **Use appropriate resolution** for your analysis scale ## Detailed Documentation – **[Coordinate Systems](references/coordinate-systems.md)** – CRS fundamentals, UTM, transformations – **[Core Libraries](references/core-libraries.md)** – GDAL, Rasterio, GeoPandas, Shapely – **[Remote Sensing](references/remote-sensing.md)** – Satellite missions, spectral indices, SAR – **[Machine Learning](references/machine-learning.md)** – Deep learning, CNNs, GNNs for RS – **[GIS Software](references/gis-software.md)** – QGIS, ArcGIS, GRASS integration – **[Scientific Domains](references/scientific-domains.md)** – Marine, hydrology, agriculture, forestry – **[Advanced GIS](references/advanced-gis.md)** – 3D GIS, spatiotemporal, topology – **[Big Data](references/big-data.md)** – Distributed processing, GPU acceleration – **[Industry Applications](references/industry-applications.md)** – Urban planning, disaster management – **[Programming Languages](references/programming-languages.md)** – Python, R, Julia, JS, C++, Java, Go, Rust – **[Data Sources](references/data-sources.md)** – Satellite catalogs, APIs – **[Troubleshooting](references/troubleshooting.md)** – Common issues, debugging, error reference – **[Code Examples](references/code-examples.md)** – 500+ examples — **GeoMaster covers everything from basic GIS operations to advanced remote sensing and machine learning.**