Creating flood extent shapefiles from satellite imagery

GMU

This was a team project completed for my final capstone course at GMU. The project spanned an entire semester, during which I worked with a team of five other students. The team followed Agile and Scrum processes, meeting daily to track progress and meeting with the client once a week. The team consisted of one scrum master, one product owner, and four developers, with roles randomly assigned. I was given the role of product owner and led both the development of the project and the client-facing meetings.

The sections below present the abstract, the project objectives, and a summary of findings. Read the full report here and access the team's GitHub repository here.

Abstract

Puerto Rico faces heightened flood risk due to its coastal geography and the intensifying effects of climate change, which have increased the frequency and severity of tropical storms and hurricanes. This project addresses the urgent need for accurate, near-real-time flood extent mapping to support disaster response efforts in flood-prone regions. This research focuses on generating post-storm flood extent shapefiles from Sentinel-1 Synthetic Aperture Radar (SAR) data, whose cloud-penetrating capability is crucial for timely flood assessment. The project uses a combination of thresholding methods and conditional logic, including change detection, Otsu’s adaptive thresholding, and VV/VH polarization ratio analysis, to classify flood-affected regions. The processed data is then converted into GIS-compatible shapefiles for visualization in tools like QGIS and ArcGIS. Validation is conducted using historical flood records and data from the Global Flood Monitoring Portal. Findings demonstrate that automated SAR-based flood detection significantly enhances speed and accuracy in identifying flood extents compared to manual methods, making the approach well suited to supporting emergency response and resource allocation. By integrating these outputs into platforms like Arkly, the project delivers an accessible, user-friendly tool for disaster response teams and policymakers to assess real-time flood impacts and develop long-term resilience strategies. This research underscores the critical role of remote sensing and AI in addressing climate-related disasters and improving community preparedness in flood-prone regions.

Project Objectives

1. Develop a pipeline to process satellite imagery for flood detection: Using available satellite images, the team will develop a pipeline in Python that will ingest pre- and post-flood satellite data and run it through a machine learning model that will classify areas in the image that have flooded.

2. Generate GIS-compatible flood extent shapefiles: Once the classification algorithm is complete, the latitude and longitude coordinates of flooded areas are used to compose a GIS-compatible shapefile. The team will also provide GIS mapping visuals of the flood extent for easy readability by users.

3. Evaluate accuracy against ground-truth data where available: Evaluate the accuracy of the proposed classification algorithm using historical flood information. Because the model is fed unlabeled satellite imagery, its accuracy must be verified by testing it on previous floods not used in model training.

4. Create a user-friendly workflow for disaster response teams: Provide a workflow that disaster response teams can follow and understand when they need to map flood extent. The workflow and solution should be usable by anyone, regardless of their knowledge of satellite imagery or GIS tools.
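To make the second objective concrete, here is a minimal sketch of how a flooded pixel might be turned into longitude/latitude coordinates before being written to a shapefile. It assumes the raster is in a geographic coordinate system and described by a six-value, GDAL-style affine geotransform. The function names and the sample geotransform are hypothetical and not taken from the team's repository, and writing the actual shapefile (for example with a GIS library) is not shown.

```python
def pixel_to_lonlat(geotransform, row, col):
    """Map a pixel corner (row, col) to (lon, lat).

    geotransform follows the GDAL ordering:
    (x_origin, pixel_width, row_rotation, y_origin, col_rotation, pixel_height)
    where pixel_height is negative for north-up images.
    """
    x0, dx, rx, y0, ry, dy = geotransform
    lon = x0 + col * dx + row * rx
    lat = y0 + col * ry + row * dy
    return lon, lat


def pixel_ring(geotransform, row, col):
    """Return the closed ring of corner coordinates for one pixel cell."""
    corners = [
        (row, col),
        (row, col + 1),
        (row + 1, col + 1),
        (row + 1, col),
        (row, col),  # repeat the first corner to close the ring
    ]
    return [pixel_to_lonlat(geotransform, r, c) for r, c in corners]


# Hypothetical north-up grid near Puerto Rico with 0.001 degree pixels.
example_gt = (-67.0, 0.001, 0.0, 18.5, 0.0, -0.001)
ring = pixel_ring(example_gt, row=10, col=20)
```

Each ring is one polygon candidate; in practice adjacent flooded pixels would be merged into larger polygons before export.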

Summary of Findings

This capstone project focused on developing an automated system to detect and map post-storm flooding in Puerto Rico using Sentinel-1 Synthetic Aperture Radar (SAR) imagery. Given the region’s vulnerability to hurricanes and the limitations of optical satellite imagery in post-storm conditions, SAR was selected for its cloud-penetrating capability and reliability under adverse weather. The project’s goal was to generate accurate, GIS-compatible flood extent shapefiles to support rapid disaster response and long-term resilience planning.

The flood detection pipeline relied on change detection, comparing SAR images captured before and after Hurricane Maria in 2017. After preprocessing the imagery to ensure calibration consistency and reduce noise, the team applied decibel-based thresholds between -16 dB and -20 dB to isolate flooded regions. Otsu’s adaptive thresholding method was also tested to automate classification based on the distribution of backscatter values, helping to identify areas likely affected by surface water accumulation.
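The thresholding ideas above can be illustrated with a small, dependency-free sketch. It is not the team's implementation: the histogram bin count, the -18 dB water threshold (picked from within the -16 dB to -20 dB range mentioned above), the 3 dB minimum drop, and the way the two tests are combined are all assumptions made for illustration.

```python
import math


def to_db(linear_power):
    """Convert linear backscatter to decibels, guarding against log(0)."""
    return 10.0 * math.log10(max(linear_power, 1e-10))


def otsu_threshold(values, bins=64):
    """Return the histogram edge that maximizes between-class variance."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    total = len(values)
    sum_all = sum(c * h for c, h in zip(centers, hist))
    best_var, best_edge = -1.0, lo
    w0, sum0 = 0, 0.0
    for i in range(bins - 1):
        w0 += hist[i]
        sum0 += centers[i] * hist[i]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_var, best_edge = between, lo + (i + 1) * width
    return best_edge


def flood_mask(pre_db, post_db, water_db=-18.0, min_drop_db=3.0):
    """Flag pixels that are dark after the storm and darkened since before it."""
    return [
        [post <= water_db and (pre - post) >= min_drop_db
         for pre, post in zip(pre_row, post_row)]
        for pre_row, post_row in zip(pre_db, post_db)
    ]


# Toy backscatter values in dB: a dark (water-like) and a bright (land-like) cluster.
sample = [-22.0, -21.0, -20.0, -21.0, -8.0, -9.0, -7.0, -8.0]
auto_threshold = otsu_threshold(sample)
mask = flood_mask([[-8.0, -9.0]], [[-19.0, -9.5]])
```

A fixed threshold is easy to reason about, while the Otsu value adapts to each scene's histogram; comparing the two on the same image is one way to see how far the automated value drifts from a hand-picked one.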

To improve classification accuracy and reduce misclassifications, logical refinements were incorporated. These included filters based on terrain slope (using Digital Elevation Model data) and minimum polygon area to exclude unlikely or noisy detections. One important insight during testing was that Otsu’s method tended to underestimate flood extent, with thresholds differing by an average of -0.70 dB compared to manually validated benchmarks. This suggests that future versions of the system could benefit from adaptive calibration tuned to specific terrain or storm conditions.
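The two refinement filters can be sketched as follows. This is an illustrative version only: the 5 degree slope limit and the 4-pixel minimum size are assumed values, pixel count stands in for polygon area, and 4-connectivity is an arbitrary choice. The source does not state the actual cutoffs used.

```python
from collections import deque


def refine_mask(mask, slope_deg, max_slope_deg=5.0, min_pixels=4):
    """Drop detections on steep terrain, then drop small connected regions."""
    rows, cols = len(mask), len(mask[0])

    # Step 1: water rarely pools on steep slopes, so discard those pixels.
    kept = [
        [mask[r][c] and slope_deg[r][c] <= max_slope_deg for c in range(cols)]
        for r in range(rows)
    ]

    # Step 2: flood-fill each 4-connected region and keep only large ones.
    seen = [[False] * cols for _ in range(rows)]
    out = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not kept[r][c] or seen[r][c]:
                continue
            region = []
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and kept[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) >= min_pixels:
                for y, x in region:
                    out[y][x] = True
    return out


raw = [
    [True, True, False, True],
    [True, True, False, False],
    [False, False, False, True],
]
slope = [
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0, 30.0],
]
cleaned = refine_mask(raw, slope)
```

In the toy grid, the 2x2 block survives, the isolated pixel in the top-right corner is removed as noise, and the bottom-right pixel is removed because it sits on a 30 degree slope.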

Final outputs were validated against aerial imagery from NOAA and compared with flood maps from the Global Flood Monitoring Portal. The results showed a classification accuracy of approximately 60% to 70%, an improvement over previous algorithm-generated models. While the model occasionally overestimated flooding in vegetated or urban areas, the pipeline proved effective and scalable. This project establishes a strong foundation for future enhancements, such as integrating Sentinel-2 optical imagery, real-time flood monitoring, and expansion to other disaster-prone regions.
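The write-up does not say exactly how the accuracy figure was computed, so the following is only one plausible way to score a predicted flood mask against a reference mask, pixel by pixel. The function name and the choice of metrics (overall accuracy, intersection over union, and false positive rate) are assumptions.

```python
def score_mask(predicted, reference):
    """Compare two boolean grids of equal shape and return simple metrics."""
    tp = fp = fn = tn = 0
    for pred_row, ref_row in zip(predicted, reference):
        for pred, ref in zip(pred_row, ref_row):
            if pred and ref:
                tp += 1      # flooded in both
            elif pred:
                fp += 1      # predicted flood, reference dry (overestimate)
            elif ref:
                fn += 1      # missed flood
            else:
                tn += 1      # dry in both
    total = tp + fp + fn + tn
    union = tp + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "iou": tp / union if union else 0.0,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
    }


predicted = [[True, True, False], [False, False, False]]
reference = [[True, False, False], [True, False, False]]
scores = score_mask(predicted, reference)
```

Reporting the false positive rate alongside accuracy helps surface the kind of overestimation in vegetated or urban areas noted above, which a single accuracy number can hide when most pixels are dry.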

Example output of the flood detection algorithm, comparing a pre-flood image with a post-flood image in which detected flooding is shown in blue.

alforsyth@comcast.net