Machine Learning for Object Detection in ArcGIS
Extraction of Rooftop Solar Panels from Aerial Imagery
Object detection is the task of training a machine learning model to locate and classify objects within an image. This becomes highly applicable in GIS because we're already used to working with satellite, aerial, and even drone imagery. Combining that imagery with object detection techniques means we can start to extract geo-located features in an automated way.
Requirements
To get started with object detection in ArcGIS, you'll need 3 things:
- ArcGIS Pro with the Image Analyst extension
- High quality aerial imagery
- High-spec physical or virtual compute hardware
Training Data
After installing and licensing ArcGIS Pro on your machine, you should add your georeferenced aerial imagery to a project and launch the "Label Objects for Deep Learning" tool to get started with developing your training data. Using the tool, create a new schema and a new class for the object type of interest, e.g. Solar Panels.
Begin identifying clusters of solar panels on rooftops. Try your best to capture a variety of solar panel configurations against a range of different roof materials.
As a general rule of thumb, the more training data you can develop for your project, the more accurate your finished object detection model can be.
Once you've gathered a sufficient number of training samples, run the tool to convert your imagery into labeled "image chips". These chips and their associated metadata are stored in a folder which will be accessed at a later stage by our Python code for training the model.
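If you prefer to script this export step, the same chip conversion can also be run with the "Export Training Data For Deep Learning" geoprocessing tool through arcpy. The sketch below is a minimal example under assumed inputs; the imagery path, label feature class, output folder, tile size, and stride are placeholders to replace with your own.

import arcpy
arcpy.CheckOutExtension("ImageAnalyst")

# Hypothetical inputs: your aerial imagery, your labeled solar panel features,
# and the output folder that the notebook will later read the chips from.
arcpy.ia.ExportTrainingDataForDeepLearning(
    in_raster="C:/data/aerial_imagery.tif",
    out_folder="C:/your-path-to-image-chips-folder",
    in_class_data="C:/data/solar_panel_labels.shp",
    image_chip_format="TIFF",
    tile_size_x=448,
    tile_size_y=448,
    stride_x=224,
    stride_y=224,
    metadata_format="PASCAL_VOC_rectangles")  # a metadata format supported by prepare_data for object detection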
Python Environment
Our machine learning code requires a Python environment to run in. Luckily, ArcGIS Pro comes with an installation of Python 3. All we need to do is clone the default Python environment and install or update the required Python libraries for our machine learning tasks.
Open the Project menu in ArcGIS Pro and use the Manage Environments functionality to clone the default Python environment. This process will likely take a couple of minutes to complete, after which we can launch the Python Command Prompt on our machine and update the installed libraries by following Esri's instructions for installing the deep learning libraries for ArcGIS.
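Once the cloned environment is active and the libraries are installed, a quick sanity check is worthwhile before moving on. This is a minimal sketch, assuming the ArcGIS API for Python and PyTorch were installed successfully in the clone:

# Quick sanity check for the cloned deep learning environment
import arcgis
import torch

print(arcgis.__version__)         # the ArcGIS API for Python, which provides arcgis.learn
print(torch.cuda.is_available())  # True if a CUDA-capable GPU is available for training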
Jupyter Notebook
Jupyter Notebooks are a very visual way of running Python code and will help us a lot in our learning process. ArcGIS Pro's Python environment also happens to include an installation of Jupyter Notebook, which can be launched by running the "jupyter notebook" command in our Python Command Prompt window. Create a new notebook and run the following code.
First, we need to import the prepare_data function from the arcgis.learn library. We'll then import our labeled image chips, which we developed by running the "Label Objects for Deep Learning" tool in ArcGIS Pro.
from arcgis.learn import prepare_data
data_path = 'C:/your-path-to-image-chips-folder'
data = prepare_data(data_path, {1: 'Panels'}, chip_size=448)
We can confirm the data was loaded correctly by visualizing a random batch from the training set.
data.show_batch()
Now we can prepare to train our model. We'll start by importing the Single Shot Detector (SSD) module from the arcgis.learn library. SSD uses a convolutional neural network backbone that is capable of recognizing basic shapes like blobs and corners. After importing SSD, we define three of its most important parameters.
- grids: only about two or three clusters of panels fit within a typical chip, so we'll divide each chip into a 3x3 grid by setting this value to 3
- zooms: a single cluster of panels fits within a cell of the 3x3 grid, and cluster sizes don't vary much across the imagery, so we'll leave this value at 1.0
- ratios: the majority of the solar panel clusters in our sample data are arranged in rectangular configurations, so 1:2 and 2:1 are logical values for our ratios
from arcgis.learn import SingleShotDetector
ssd = SingleShotDetector(data, grids=[3], zooms=[1.0], ratios=[[1.0, 2.0], [2.0, 1.0]])
Let's take a look at our SSD output prior to training. The left-hand column shows our training data and the right-hand column shows our untrained model's inference results. The results will appear random since we have not yet trained our model.
ssd.show_results(thresh=0.2)
Next we need to find the learning rate using the learning rate finder in arcgis.learn.
But what is the learning rate, and why do we need it? In the simplest terms, deep learning neural networks are trained using an algorithm that estimates the error in the current state of the model by comparing its predictions against examples from the training data set. To improve the model for the next iteration, the weights of the different "neural pathways" of the model are updated. The amount by which those weights are updated during training is referred to as the step size or learning rate.
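To make the idea concrete, here's a toy sketch (not part of arcgis.learn) of a single gradient descent step on one weight; the values are made up purely for illustration.

# Toy illustration of one weight update in gradient descent
weight = 0.5
gradient = 0.8         # hypothetical gradient of the loss with respect to this weight
learning_rate = 0.001  # the step size
weight = weight - learning_rate * gradient
print(weight)          # the weight moves a small step against the gradient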
It's paramount to find the right value for the learning rate. If the value is too small, the learning process can take a very long time or simply get stuck. If the value is too large then the ideal solution can be overshot.
myLR = ssd.lr_find()
Our learning rate in this scenario is approximately 0.0014. We can proceed to train the model using the fit method. We should typically start with a smaller number of training iterations, or "epochs", to observe the model's behavior. We can then increase the number of epochs until our validation loss starts to plateau.
ssd.fit(300, lr=myLR)
We should examine the loss function of the model for signs of overfitting. Overfitting can occur when the model is trained for more iterations than necessary, and it can lead to a model that performs well on the training data but generalizes poorly to new data.
ssd.learn.recorder.plot_losses()
We can see that the model is overtrained in this scenario. The validation loss improves at a good rate during the first 100 to 125 epochs (400 to 500 batches processed), beyond which it plateaus. Let's check the average precision score for the model.
ssd.average_precision_score()
The average precision sits at 55.7%, which is quite good for a model of this kind. Let's visualize the model's inferred results. Again, the left-hand column represents our training data, while the right-hand column shows our trained model's inferred results.
ssd.show_results(rows=25, thresh=0.2)
We can see that our model's accuracy has drastically improved thanks to the training. It is now ready to be saved and used in ArcGIS Pro for inferring results.
ssd.save('myPanelsModel_e300')
Detecting Objects with ArcGIS Pro
Now that we've trained our solar panel detection model, we can use the "Detect Objects Using Deep Learning" tool in ArcGIS Pro to detect all the rooftop solar panels in our aerial imagery.
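If you'd rather script the inference step, the same tool is available through arcpy. The sketch below is illustrative only; the imagery path, output feature class, model definition path (pointing at the model written by ssd.save), and tool arguments are assumptions to replace with your own.

import arcpy
arcpy.CheckOutExtension("ImageAnalyst")

# Hypothetical paths: your imagery, an output feature class, and the .emd/.dlpk
# written by ssd.save() into the 'models' subfolder of the image chips folder.
arcpy.ia.DetectObjectsUsingDeepLearning(
    in_raster="C:/data/aerial_imagery.tif",
    out_detected_objects="C:/data/results.gdb/solar_panels",
    in_model_definition="C:/your-path-to-image-chips-folder/models/myPanelsModel_e300/myPanelsModel_e300.emd",
    arguments="padding 56;threshold 0.2;batch_size 4")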
The detected solar panel clusters are saved as a feature layer, which can be uploaded to ArcGIS Portal and used in a variety of data products to derive business value.
If you'd like to implement object detection capabilities like this in your own organization, feel free to reach out to me via LinkedIn or by email: [email protected]