Introducing the Sentinel-2 Mosaic Service
How can I get multi-band satellite data for my area of interest?
Terramonitor's Sentinel-2 Mosaic Service automates the preprocessing and delivery of satellite imagery, so GIS professionals can focus on what counts: interpreting and analyzing the data.
The Pain
Space data is complex. Building anything on top of satellite data requires:
- Finding a source for satellite data
- Downloading and preprocessing the data (georeferencing, cloud detection)
- Normalizing the data (atmospheric and radiometric correction)
Moreover, handling raw satellite imagery efficiently requires a considerable amount of both RAM and solid-state storage. An average laptop is simply not powerful enough to download, select and mosaic tens or hundreds of satellite images. Manually selecting cloud-free images for full coverage of large areas takes valuable work hours, and automating the process to support a constantly updating service takes even more.
The Implementation
The Sentinel-2 Mosaic Service grew out of an internal need for analysis-ready satellite data that is as uniform as possible and available as quickly as possible. We have tried and tested the service in our daily work and seen the pain points first-hand. With each iteration, the service has become faster, more robust, and able to handle larger amounts of data. The mosaicing and cloud detection algorithms have been developed and fine-tuned to make the most of industrial-grade servers in the cloud.
The Algorithm
Our approach is based on an automatic processing chain developed for Sentinel-2 data, which involves automatic acquisition, cloud detection, atmospheric correction, radiometric normalization and mosaicing.
For the selection of cloud-free areas from the Sentinel-2 time series and merging the time series of images together, we have implemented a novel supervised machine learning method using a fleet of virtual machines. The machine learning model is trained using a comprehensive global reference dataset of surface reflectance values for different spectral bands.
First, the Sentinel-2 products are read into RAM as NumPy arrays using the rasterio library in Python 3.7. Then, the images are merged using windowing and parallel computing. The windowing step splits the spectral data in the NumPy arrays into windows, each of which represents a square geographical area. The parallel computing is executed using asynchronous coroutines, available in Python 3.7+. In contrast to subroutines, which are always entered at one point and exited at another, coroutines can be entered, exited, and resumed at many different points; in Python they are defined with the async def statement. Here, coroutines are used for efficient parallel merging of the NumPy arrays in windowed mode.
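To make the windowing and coroutine steps more concrete, here is a minimal sketch using only the Python standard library. It is not the service's actual code: nested lists stand in for the NumPy arrays, None marks a cloudy pixel, and the names make_windows, merge_window and merge_all, as well as the "first cloud-free value wins" merge rule, are assumptions made purely for illustration.

```python
import asyncio

def make_windows(height, width, size):
    """Yield (row_off, col_off, n_rows, n_cols) tiles covering the raster."""
    for row in range(0, height, size):
        for col in range(0, width, size):
            # Tiles on the bottom and right borders may be smaller than `size`.
            yield (row, col, min(size, height - row), min(size, width - col))

async def merge_window(stack, window):
    """Merge one window of a time series of images.

    Assumed rule: for each pixel, take the first value in the time
    series that is not None (None stands for a cloudy pixel).
    """
    row, col, n_rows, n_cols = window
    tile = []
    for r in range(row, row + n_rows):
        tile.append([
            next((img[r][c] for img in stack if img[r][c] is not None), None)
            for c in range(col, col + n_cols)
        ])
    await asyncio.sleep(0)  # hand control back to the event loop
    return window, tile

async def merge_all(stack, size):
    """Merge every window with its own coroutine and assemble the mosaic."""
    height, width = len(stack[0]), len(stack[0][0])
    results = await asyncio.gather(
        *(merge_window(stack, w) for w in make_windows(height, width, size))
    )
    mosaic = [[None] * width for _ in range(height)]
    for (row, col, n_rows, n_cols), tile in results:
        for i in range(n_rows):
            mosaic[row + i][col:col + n_cols] = tile[i]
    return mosaic

# Two tiny 3x3 "images"; the second one fills the cloudy pixels of the first.
image_a = [[1, None, 3], [None, 5, 6], [7, 8, None]]
image_b = [[9, 2, 9], [4, 9, 9], [9, 9, 9]]
mosaic = asyncio.run(merge_all([image_a, image_b], size=2))
print(mosaic)  # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

Note that coroutines on a single event loop interleave rather than run simultaneously, so in a sketch like this the benefit would come mainly from overlapping I/O, such as reading windows from disk or over the network.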
For the calibration phase, a process called relative radiometric normalization is applied. Its purpose is to eliminate the overall radiometric difference between the windows used in the previous step. First, the spectral data contained in each window are read into RAM as NumPy arrays in worker processes managed by Python's ProcessPoolExecutor class. In these processes, the median values of the edges of the NumPy arrays are calculated for each spectral band and stored in a Python dict called the edge dict. The final edge dict contains the median values at the cardinal points of each window for each spectral band. Second, the windows are unified using the edge medians in the edge dict. The unification and weighting steps also use the ProcessPoolExecutor class, which makes it possible to use several CPU cores simultaneously to process different spectral bands in parallel.
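The edge dict idea can be sketched as follows, again with the standard library only. Nested lists stand in for NumPy arrays, and the function names, the north/south/west/east keys and the simple additive seam offset are assumptions for illustration; the text above does not specify how the windows are actually unified or weighted.

```python
from concurrent.futures import ProcessPoolExecutor
from statistics import median

def edge_medians(band):
    """Median of each of the four edges of one 2-D band window."""
    return {
        "north": median(band[0]),
        "south": median(band[-1]),
        "west": median(row[0] for row in band),
        "east": median(row[-1] for row in band),
    }

def window_edge_medians(item):
    """Worker: edge medians for every spectral band of one window."""
    window_id, bands = item
    return window_id, {name: edge_medians(band) for name, band in bands.items()}

def build_edge_dict(windows, max_workers=2):
    """Build {window_id: {band: {edge: median}}} across several processes."""
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(window_edge_medians, windows.items()))

def seam_offset(edge_dict, left_id, right_id, band):
    """Assumed unification rule: additive offset that aligns the right
    window's west edge with the left window's east edge."""
    return edge_dict[left_id][band]["east"] - edge_dict[right_id][band]["west"]

if __name__ == "__main__":
    # Two hypothetical, horizontally adjacent 2x2 windows with one band each.
    windows = {
        "w0": {"red": [[10, 12], [11, 13]]},
        "w1": {"red": [[15, 16], [17, 18]]},
    }
    edge_dict = build_edge_dict(windows)
    print(edge_dict["w0"]["red"])
    print(seam_offset(edge_dict, "w0", "w1", "red"))
```

Because each window is handled by a separate process, the worker function has to live at module level so it can be pickled, and the pool is created under the `if __name__ == "__main__":` guard.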
The entire process is visualized in the video, which shows the different steps performed by the algorithm.
Contact us to get your Sentinel-2 Mosaic today!