Counting Trees with drones and Artificial Intelligence

by Francesco Sambalino, Eefje Visser, Kim de Groot
September 8, 2021

Using a drone, we mapped roughly 4,000 hectares of farmland in central Tanzania. Farmers there are keeping more and more trees on their land, and we want to estimate the change. How do we do that?

There are multiple ways to approach this question, and over the years we tried three of them: detecting trees from their height profile, pixel-based supervised classification, and image recognition.

Trees = micro-reliefs

To create high-resolution maps, we used specialized software such as DroneDeploy, SimActive, and Agisoft Metashape. Each point on the farmland is captured in multiple overlapping pictures as the drone flies along a predefined track. The software recognizes each unique point across multiple shots, aligns the images, and stitches them together to create one big map out of hundreds of pictures. But there is more to it.

In a process called stereophotogrammetry, the software also calculates the differences in height between the mapped points. The result is a high-resolution Digital Surface Model (DSM[1]) of the area. A DSM of flat farmland will show a small bump for every tree in the scene. How do we measure tree cover and tree height starting from this?

The tools can be borrowed from GIS hydrological toolboxes and run in QGIS. However, this method turned out to be impractical: we did not have enough computing power, and the cost of running the same process over 3900 hectares (split into 39 maps) would have been excessive.
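To make the idea of "trees as bumps" concrete, here is a minimal sketch in plain Python. It is not the hydrological toolbox workflow we ran in QGIS; it is a simplified stand-in that assumes flat terrain (so the median elevation approximates the ground), treats everything at least `min_height` metres above that as tree, and groups touching pixels into one bump. The function name, the threshold, and the toy grid are all hypothetical.

```python
from statistics import median

def find_trees(dsm, min_height=2.0):
    """Return one (pixel_count, max_height) tuple per bump in a flat-terrain DSM."""
    rows, cols = len(dsm), len(dsm[0])
    # Assumption: the land is flat, so the median elevation approximates the ground.
    ground = median(v for row in dsm for v in row)
    seen = [[False] * cols for _ in range(rows)]
    trees = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or dsm[r][c] - ground < min_height:
                continue
            # Flood fill over 4-connected neighbours to collect one bump.
            stack, pixels, top = [(r, c)], 0, 0.0
            seen[r][c] = True
            while stack:
                y, x = stack.pop()
                pixels += 1
                top = max(top, dsm[y][x] - ground)
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols and not seen[ny][nx]
                            and dsm[ny][nx] - ground >= min_height):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            trees.append((pixels, top))
    return trees

# Toy DSM in metres: ground at about 1000 m with two bumps.
dsm = [
    [1000.0, 1000.1, 1000.0, 1000.0, 1000.0],
    [1000.0, 1004.0, 1005.0, 1000.0, 1000.0],
    [1000.0, 1003.5, 1000.0, 1000.0, 1000.0],
    [1000.0, 1000.0, 1000.0, 1000.0, 1006.0],
]
trees = find_trees(dsm)
for pixels, height in trees:
    print(f"bump of {pixels} pixel(s), about {height:.1f} m tall")
```

On a real map, the pixel count of each bump multiplied by the pixel size gives a rough crown area. The flat-terrain assumption is the weak point: on sloping land the ground level has to be estimated locally instead.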

To the cloud

In the second year, we decided to move to the cloud by using Google Earth Engine.

In Google Earth Engine, supervised classification algorithms may be adapted for identifying different land cover types. In our case, we wanted to discern trees from all other land cover types.

It is a form of machine learning in which you show the “machine” a number of pixels where you are sure there are trees. After this training phase, the machine tries to decide whether each pixel in the map is a tree or not.
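The train-then-classify loop can be illustrated with a few lines of plain Python. This sketch does not use the Google Earth Engine API, and the nearest-centroid rule below is a deliberately simple stand-in chosen for illustration, not the classifier we used. The RGB values, labels, and function names are made up.

```python
from math import dist

def train_centroids(samples):
    """Average the labelled pixel values of each class into one centroid."""
    sums, counts = {}, {}
    for pixel, label in samples:
        acc = sums.setdefault(label, [0.0] * len(pixel))
        for i, value in enumerate(pixel):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(v / counts[label] for v in acc)
            for label, acc in sums.items()}

def classify(pixel, centroids):
    """Assign the pixel to the class whose centroid is closest in colour."""
    return min(centroids, key=lambda label: dist(pixel, centroids[label]))

# Training phase: hand-picked pixels (R, G, B) where we are sure of the class.
training = [
    ((40, 90, 35), "tree"), ((50, 100, 45), "tree"),
    ((170, 140, 100), "other"), ((190, 160, 120), "other"),
]
centroids = train_centroids(training)

# Classification phase: label every pixel of a tiny 2 x 2 "map".
image = [[(45, 95, 40), (180, 150, 110)],
         [(60, 110, 50), (200, 170, 130)]]
labels = [[classify(p, centroids) for p in row] for row in image]
n_pixels = sum(len(row) for row in labels)
tree_fraction = sum(l == "tree" for row in labels for l in row) / n_pixels
print(labels, tree_fraction)
```

Because the decision is made one pixel at a time from colour alone, the result depends heavily on how well the training pixels represent the lighting and vegetation of that particular map.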

The end result was pretty good, but once again, the time needed to refine each of the 39 maps was too much. For each map, we needed to train the machine from scratch. On top of that, after running the classification, we had to export the resulting file to QGIS to run the statistics needed.

Detectron2 to the rescue

Ever wondered how social media platforms recognize your face across pictures and tag you accordingly? How does a self-driving car recognize objects while driving and change its course? This is all powered by Artificial Intelligence (AI). It is a rapidly expanding field, and many open toolboxes are freely available. One of them is the Detectron2 framework developed by Facebook. It doesn’t only look at every pixel in a map; it also looks at how an ensemble of pixels ends up creating complex shapes.

We decided to give it a try as we saw some practical advantages over our previous approach. We could still run the analysis in the cloud. The most crucial advantage, however, is that we can use the same training set for all 39 available maps.

Detectron2 is a state-of-the-art framework for object detection and segmentation. Exactly what we needed. Again, we want to identify trees (object detection), but we also want their exact perimeter (segmentation) to calculate their crown diameter and area.
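Once a crown perimeter is available as a list of vertices, area and diameter follow from basic geometry. The sketch below is independent of Detectron2: it assumes the vertices are already in metres (a projected coordinate system), computes the area with the shoelace formula, and reports the diameter of a circle with the same area. That equivalent-circle definition is an assumption made for illustration; other definitions of crown diameter exist.

```python
from math import pi, sqrt

def polygon_area(vertices):
    """Shoelace formula for a simple polygon given as (x, y) pairs in metres."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def equivalent_diameter(area):
    """Diameter of the circle that has the same area as the crown."""
    return 2.0 * sqrt(area / pi)

# Hypothetical crown outline: a 4 m x 3 m rectangle.
crown = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
area = polygon_area(crown)
diameter = equivalent_diameter(area)
print(f"area {area:.1f} m2, equivalent diameter {diameter:.2f} m")
```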

Like the previous approach, Detectron2 needs a training set. Because training requires a large number of labeled images, we hired some students to help us. In two weeks, they were able to label thousands of trees.

After running the algorithm, we are finally able to identify trees and delineate them with satisfactory accuracy. Each identified tree has its perimeter traced. We found our trees!

In the final step, we use GIS for some descriptive statistics that help us count trees and estimate the area they cover and their average diameter. To make this step more elegant, we are automating it with GeoPandas (a Python library for geospatial data handling).
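The statistics themselves are simple once each tree has a crown area. The following sketch uses only the Python standard library rather than GeoPandas, so it can be read without any geospatial setup; with GeoPandas, the crown areas would typically come from the `area` attribute of the geometry column in a projected coordinate system. The function name, the sample areas, and the 100-hectare map size are illustrative assumptions, and the diameter is again the equivalent-circle diameter.

```python
from math import pi, sqrt
from statistics import mean

def summarize(crown_areas_m2, map_area_ha):
    """Count trees and estimate density, canopy cover, and mean crown diameter."""
    # Equivalent-circle diameter per tree (an assumed definition).
    diameters = [2.0 * sqrt(a / pi) for a in crown_areas_m2]
    covered_m2 = sum(crown_areas_m2)
    return {
        "tree_count": len(crown_areas_m2),
        "trees_per_ha": len(crown_areas_m2) / map_area_ha,
        # 1 hectare = 10,000 square metres.
        "cover_percent": 100.0 * covered_m2 / (map_area_ha * 10_000),
        "mean_diameter_m": mean(diameters) if diameters else 0.0,
    }

# Hypothetical crown areas (m2) detected on one map of 100 hectares.
stats = summarize([12.0, 20.0, 28.0, 40.0], map_area_ha=100.0)
for name, value in stats.items():
    print(name, round(value, 3))
```

Running the same function on every map, year after year, is what turns the detections into an estimate of change.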

[1] A DSM doesn’t portray the terrain altitude but the height of the highest object detected. If there is a tree, it will provide its height. If the land is bare instead, the height provided will be the height of the soil surface.

“This Blog tells the story of the joint efforts of MetaMeta Research and Lynxx to help Justdiggit and LEAD foundation to monitor the impact of its regreening program in central Tanzania (Regreening Dodoma Program)” 
