Case Study

Climate Risk Data Infrastructure at Raincoat LLC

How I contributed to Raincoat's climate data infrastructure — building an automated satellite data pipeline, a modular risk processing chain, and an interactive dashboard used by the CEO to close deals.

Feb 2024 to Jan 2026 | Senior Data Scientist | Remote, US-based client

The Client

Raincoat LLC is a US-based insurtech company specializing in parametric insurance — policies that pay out automatically when predefined weather or climate thresholds are triggered. Their products cover natural catastrophe risks including floods, droughts, hurricanes, and extreme heat events across emerging markets.

I joined Raincoat's data science team as a senior data scientist, contributing to the data infrastructure and ML models powering their risk assessment engine — from raw satellite data all the way to executive-facing dashboards.

The Challenge

Massive, Heterogeneous Data

More than 2 TB of satellite imagery and climate reanalysis data (including ERA5) pulled from Copernicus CDS and WEkEO, stored in NetCDF format, spanning decades, with varying spatial and temporal resolutions.

No Standardized Risk Framework

Each peril (flood, drought, heat, cyclone) required different variables, thresholds, and modeling approaches. The team had to build a framework that could extend each peril to new countries without rebuilding from scratch.

Decision-Makers Needed Clarity

Executives and underwriters needed to visualize risk at the country and regional level — not raw model outputs. The gap between "model results in a notebook" and "actionable intelligence in a dashboard" was the bottleneck.

What I Built

Automated Satellite Data Pipeline

Designed and implemented a fully automated ingestion pipeline pulling data from Copernicus CDS and WEkEO APIs. The pipeline handles download scheduling, format conversion (NetCDF to tabular), spatial aggregation by administrative boundaries, and quality checks. Reduced manual data preparation from weeks to hours.
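To make the conversion and aggregation steps concrete, here is a minimal, dependency-free sketch. It is not the production pipeline: the grid values, the region names, and the helper functions (to_rows, region_of, aggregate) are all hypothetical, and regions are simplified to bounding boxes, whereas real administrative boundaries are polygons. In practice a library such as Xarray would handle the NetCDF reading and flattening.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical gridded field: one lat x lon grid per date, standing in
# for a NetCDF variable. All values are made up for illustration.
lats = [10.0, 10.5]
lons = [20.0, 20.5, 21.0]
grid_by_date = {
    "2024-01-01": [[1.0, 2.0, 3.0], [4.0, None, 6.0]],
    "2024-01-02": [[0.0, 0.0, 9.0], [2.0, 2.0, 3.0]],
}

# Hypothetical regions as bounding boxes:
# (lat_min, lat_max, lon_min, lon_max). Real boundaries are polygons.
regions = {
    "west": (9.0, 11.0, 19.5, 20.75),
    "east": (9.0, 11.0, 20.75, 22.0),
}


def to_rows(grid_by_date, lats, lons):
    """Flatten the grids into tabular rows, dropping missing cells."""
    rows = []
    for date, grid in grid_by_date.items():
        for i, lat in enumerate(lats):
            for j, lon in enumerate(lons):
                value = grid[i][j]
                if value is not None:  # basic quality check
                    rows.append(
                        {"date": date, "lat": lat, "lon": lon, "value": value}
                    )
    return rows


def region_of(lat, lon, regions):
    """Return the first region whose box contains the point, else None."""
    for name, (lat_min, lat_max, lon_min, lon_max) in regions.items():
        if lat_min <= lat < lat_max and lon_min <= lon < lon_max:
            return name
    return None


def aggregate(rows, regions):
    """Average cell values per (date, region)."""
    buckets = defaultdict(list)
    for row in rows:
        name = region_of(row["lat"], row["lon"], regions)
        if name is not None:
            buckets[(row["date"], name)].append(row["value"])
    return {key: mean(values) for key, values in buckets.items()}


rows = to_rows(grid_by_date, lats, lons)
regional_means = aggregate(rows, regions)
```

The result is one value per date and region, which is the shape a relational table or a dashboard query can consume directly.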

ML Processing Chain

Built a modular processing chain supporting multiple peril types. Each module extracts relevant climate variables, computes risk indicators (return periods, anomaly scores, threshold exceedances), and feeds standardized outputs into the risk engine. Designed for extensibility — adding a new peril or country requires configuration, not code changes.
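The configuration-driven idea can be sketched as follows. This is an illustration under stated assumptions, not the actual risk engine: PERIL_CONFIG, its variable names and thresholds, and the function names are hypothetical, and the return period is a simple empirical estimate (years of record divided by exceedance years) rather than a fitted extreme-value model.

```python
from statistics import mean, pstdev

# Hypothetical peril configuration. Variable names and thresholds are
# illustrative only, not actual product settings.
PERIL_CONFIG = {
    "heat": {"variable": "tmax_c", "threshold": 40.0, "direction": "above"},
    "drought": {"variable": "precip_mm", "threshold": 20.0, "direction": "below"},
}


def exceedances(values, threshold, direction):
    """Flag each value that crosses the threshold in the given direction."""
    if direction == "above":
        return [v > threshold for v in values]
    return [v < threshold for v in values]


def anomaly_scores(values):
    """Standardized anomalies (z-scores) against the series' own mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def empirical_return_period(n_years, n_events):
    """Average number of years between events, or None if none occurred."""
    if n_events == 0:
        return None
    return n_years / n_events


def compute_indicators(peril, annual_values, config=PERIL_CONFIG):
    """Compute standardized indicators for one peril from annual values."""
    settings = config[peril]
    flags = exceedances(
        annual_values, settings["threshold"], settings["direction"]
    )
    count = sum(flags)
    return {
        "peril": peril,
        "variable": settings["variable"],
        "exceedance_count": count,
        "return_period_years": empirical_return_period(len(annual_values), count),
        "anomaly_scores": anomaly_scores(annual_values),
    }


# Made-up annual maximum temperatures in degrees Celsius.
heat_series = [38.0, 41.0, 39.0, 42.0, 37.0, 39.5, 40.5, 38.5, 36.0, 39.0]
result = compute_indicators("heat", heat_series)
```

In this sketch, supporting another peril means adding one entry to the configuration dictionary; compute_indicators itself stays unchanged.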

3 Interactive Dashboards

Delivered three Plotly Dash dashboards used daily by the risk and underwriting teams:

  • Country Risk Explorer — Interactive maps with country-level risk scores, historical event timelines, and drill-down by peril type

The Impact

2TB+ Satellite data managed
10+ Countries covered
3 Production dashboards
90% Reduction in data prep time

How We Worked

I operated as a fully embedded senior data scientist within Raincoat's remote team for nearly two years. The engagement model was time-and-materials with weekly check-ins, shared GitHub repos, and direct Slack communication with the CTO and risk team.

Work was organized in 2-week sprints with clear deliverables. I focused on the data pipeline, processing chain, and dashboards — collaborating closely with the rest of the data science and engineering teams on deployment and integration.

Timezone overlap (EU afternoons / US East Coast mornings) enabled real-time collaboration without blocking either side.

Tech Stack

Python, Copernicus CDS, WEkEO, NetCDF, Xarray, Pandas, Plotly Dash, PostgreSQL, Docker, scikit-learn, NumPy, Git

Have a similar challenge?

I help companies turn raw environmental and geospatial data into production-ready ML systems and dashboards. Let's talk about your project.