Methodology

We identify brick kilns using scalable deep learning techniques and then conduct a spatial analysis to understand the scale and scope of brick manufacturing. The workflow is organized into two main components: a machine learning pipeline for kiln detection and classification, and a spatial analysis that assesses the environmental and regulatory implications of brick kiln distribution. We first train a model to identify brick kilns, which we treat as primary sources of pollutants. We then use transfer learning to locate and geo-tag other pollutant sources and display them on a visual map.

Machine Learning Pipeline

Kiln Detection Model Training

A classification model is trained on satellite imagery to distinguish images that contain kilns from those that do not. The data are provided in HDF5 format, with separate files for training and validation. The images are first stripped of their geospatial metadata so that training uses only the visual content (RGB data). For the Indo-Gangetic Plain (IGP), high-resolution satellite imagery is retrieved through APIs to augment and label the training dataset.

Connected Component Analysis

After identifying true positives and classifying the images, the pipeline searches for connected components of adjacent kiln images, which form signal masses, and calculates the centroid of each mass. This step extracts the midpoint coordinates of each identified brick kiln. These coordinates can then be geolocated and mapped so that the model’s predictions can be compared against existing kilns.
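The connected component step can be illustrated with a minimal sketch. It assumes the classifier output has already been arranged as a 2D grid of 0/1 tile predictions and that adjacency means 4-connectivity; both are assumptions, as is the helper tile_to_lonlat, which presumes a hypothetical north-up grid of square tiles with a known top-left origin. None of these names come from the project code.

```python
from collections import deque


def label_components(mask):
    """Group 4-connected positive tiles; returns a list of components,
    each a list of (row, col) tile indices."""
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first search over neighbouring positive tiles.
            queue = deque([(r, c)])
            seen[r][c] = True
            cells = []
            while queue:
                y, x = queue.popleft()
                cells.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(cells)
    return components


def centroid(cells):
    """Mean (row, col) position of the tiles in one component."""
    n = len(cells)
    return (sum(y for y, _ in cells) / n, sum(x for _, x in cells) / n)


def tile_to_lonlat(row, col, origin_lon, origin_lat, tile_deg):
    """Hypothetical mapping: north-up grid, square tiles of tile_deg degrees,
    (origin_lon, origin_lat) is the top-left corner of tile (0, 0)."""
    lon = origin_lon + (col + 0.5) * tile_deg
    lat = origin_lat - (row + 0.5) * tile_deg
    return lon, lat


# Illustrative grid: two separate groups of positive tiles.
mask = [
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]
components = label_components(mask)
centroids = [centroid(cells) for cells in components]
```

Each centroid is expressed in fractional tile coordinates; passing it through a mapping such as tile_to_lonlat gives an approximate longitude and latitude that can be compared with known kiln locations.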

Shape Classification Model Training

We further classify kilns into two types based on their shape: rectangular or oval. This step involves training a second model on data labeled with these two categories.

Spatial Analysis

This component utilizes the kiln locations identified by the machine learning pipeline along with various publicly available geospatial datasets (e.g., population distributions, health and education facilities, meteorological data) to analyze the impact of brick kilns. The analysis aims to understand the distribution of kilns in relation to regulated entities, environmental conditions, and population exposure to pollution.
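One simple form such a proximity analysis could take is counting the kilns that fall within a fixed radius of each facility. The sketch below is only an illustration under stated assumptions: it uses the haversine great-circle distance, the function names are hypothetical, and the coordinates, facility names, and 2 km radius are made-up values rather than project data or regulatory thresholds.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def kilns_near_sites(kilns, sites, radius_km):
    """For each named site, count the kilns within radius_km of it."""
    return {
        name: sum(
            1 for k_lat, k_lon in kilns
            if haversine_km(lat, lon, k_lat, k_lon) <= radius_km
        )
        for name, (lat, lon) in sites.items()
    }


# Made-up (lat, lon) values for illustration only.
kilns = [(26.85, 80.95), (26.86, 80.95), (27.50, 81.50)]
sites = {"school_a": (26.85, 80.95), "clinic_b": (28.00, 82.00)}
counts = kilns_near_sites(kilns, sites, radius_km=2.0)
```

The brute-force loop is quadratic in the number of points; for region-wide datasets a spatial index or a GIS library would be the more practical choice.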

Expected Results

The final data will include geo-identified imagery, the final output of the kiln detection model, and the public datasets used for the spatial analysis.

Functional Framework

This framework offers a novel approach to leveraging deep learning and spatial analysis for environmental monitoring and regulatory assessment. By making the methodology and data accessible, the project encourages further research and collaboration in addressing the challenges posed by brick manufacturing and its environmental impact in the IGP. As a first step, we create a model to detect brick kilns in the Indo-Gangetic Plain. The code assumes that satellite images of a specific region have already been processed by a convolutional neural network (CNN) model, which outputs predictions indicating the presence of kilns. It consists of several key components designed to enhance the model's output, create visualizations, and refine detections.

Mask Images Module to segregate and classify kilns

Create Connected Component Image and geo-tag kilns

Post Process Connected Component Mask to improve visibility and robustness
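The text does not specify which post-processing operations are applied. One common choice for making a component mask more robust is to discard components smaller than a minimum size, since isolated positive tiles are often false detections. The sketch below illustrates that idea only; the function name, the min_size parameter, and the use of 4-connectivity are assumptions, not the project's actual implementation.

```python
def remove_small_components(mask, min_size):
    """Return a copy of a 0/1 grid that keeps only the 4-connected
    components containing at least min_size positive cells."""
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    cleaned = [[0] * cols for _ in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Iterative flood fill collecting every cell of this component.
            stack, cells = [(r, c)], []
            seen[r][c] = True
            while stack:
                y, x = stack.pop()
                cells.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            # Keep the component only if it is large enough.
            if len(cells) >= min_size:
                for y, x in cells:
                    cleaned[y][x] = 1
    return cleaned


# Illustrative mask: one isolated tile and one three-tile component.
noisy = [
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
]
cleaned = remove_small_components(noisy, min_size=2)
```

The right threshold depends on tile size relative to a typical kiln footprint, so min_size would need to be tuned against labeled examples.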

Overall Workflow and Application Summary