DeepWILD : Wildlife Identification, Localisation and estimation on camera trap videos using Deep learning - 3IA Côte d’Azur – Interdisciplinary Institute for Artificial Intelligence
Journal article, Ecological Informatics, Year: 2023

DeepWILD : Wildlife Identification, Localisation and estimation on camera trap videos using Deep learning

Abstract

Ecologists increasingly use videos and images from camera traps to estimate species populations in a given territory. This is laborious work, since experts have to analyse massive data sets manually. Filtering the videos is also time-consuming, because many of them contain no animals or show human presence. Fortunately, deep learning algorithms for object detection can help ecologists identify multiple relevant species in their data and estimate their populations. In this study, we propose to go further by using an object detection model to detect, classify and count species in camera trap videos. To this end, we developed a three-step process: (i) after splitting videos into images, we annotate the images by associating bounding boxes with each label using the MegaDetector algorithm; (ii) we then extend MegaDetector, which is based on the Faster R-CNN architecture with an Inception-ResNet-v2 backbone, so that it not only detects the 13 relevant classes but also classifies them; (iii) finally, we design a method to count individuals based on the maximum number of bounding boxes detected. This final counting stage is evaluated in two contexts: first using detection results only (i.e. comparing our predictions against the true number of individuals, regardless of their true class), then in an extended version using both detection and classification results (i.e. comparing our predictions against the true number of individuals in the correct class). The results obtained on the test data set are: (i) 73.92% mAP for classification, (ii) 96.88% mAP for detection at an Intersection-over-Union (IoU) threshold of 0.5 (IoU being the overlap ratio between the ground-truth bounding box and the detected one), and (iii) 89.24% mAP for detection at IoU = 0.75.
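To make the IoU criterion concrete, here is a minimal sketch of the overlap ratio between two axis-aligned boxes. The function name `iou` and the `(x_min, y_min, x_max, y_max)` box convention are assumptions made for illustration; they are not taken from the article or from MegaDetector.

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two boxes given as (x_min, y_min, x_max, y_max).

    The box convention is an assumption for this sketch.
    """
    # Corners of the intersection rectangle (may be empty).
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp to zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


ground_truth = (0, 0, 2, 2)
detected = (1, 1, 3, 3)
print(iou(ground_truth, detected))  # 1/7, below a 0.5 threshold
```

Under this convention, a detection would count as correct at IoU = 0.5 only if `iou(ground_truth, detected) >= 0.5`; the stricter 0.75 threshold requires a tighter fit.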
Highly represented classes, such as humans, have the highest mAP values, around 81%, whereas classes that are less represented in the training data set, such as dogs, have the lowest mAP values, around 66%. With the proposed counting method, the predicted count was either exact or within ±1 unit for 87% of our test data set when using detection results only, and for 48% when using both detection and classification results. Our model is also able to detect empty videos. To the best of our knowledge, this is the first study in France to use an object detection model in a French national park to locate, identify and estimate the population of species from camera trap videos.
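The counting step can be illustrated with a short sketch. It assumes that "maximum number of bounding boxes detected" means the maximum over the frames of one video, and that low-confidence detections are discarded first. The function names, the `(label, score)` input format and the `score_threshold` value are hypothetical and are not taken from the article.

```python
from collections import Counter


def count_individuals(frames, score_threshold=0.5):
    """Estimate counts for one video from per-frame detections.

    `frames` is a list of frames; each frame is a list of (label, score)
    detections. The format and the threshold are assumptions of this sketch.
    Returns (total_count, per_class_counts).
    """
    total = 0
    per_class = Counter()
    for detections in frames:
        kept = [label for label, score in detections if score >= score_threshold]
        # Detection-only count: largest number of boxes seen in any frame.
        total = max(total, len(kept))
        # Detection + classification count: largest number per class.
        for label, n in Counter(kept).items():
            per_class[label] = max(per_class[label], n)
    return total, dict(per_class)


def within_one(predicted, actual):
    """True if the predicted count is exact or off by at most one unit."""
    return abs(predicted - actual) <= 1


video = [
    [("deer", 0.9)],
    [("deer", 0.8), ("deer", 0.7), ("fox", 0.6)],
    [("fox", 0.3)],  # below the threshold, ignored
]
total, per_class = count_individuals(video)
print(total, per_class)      # 3 {'deer': 2, 'fox': 1}
print(within_one(total, 2))  # True
```

In this sketch a video with no retained detections yields a total of 0, which is one simple way to flag empty videos.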
Main file
article_deepWild-light.pdf (2.73 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-03797530 , version 1 (04-10-2022)
hal-03797530 , version 2 (04-04-2023)

Cite

Fanny Simões, Charles Bouveyron, Frédéric Precioso. DeepWILD : Wildlife Identification, Localisation and estimation on camera trap videos using Deep learning. Ecological Informatics, 2023, 75, ⟨10.1016/j.ecoinf.2023.102095⟩. ⟨hal-03797530v2⟩
357 views
181 downloads