In this paper, we present WildlifeDatasets, an open-source toolkit intended primarily for ecologists and computer-vision / machine-learning researchers. WildlifeDatasets is written in Python, provides straightforward access to publicly available wildlife datasets, and offers a wide variety of methods for dataset pre-processing, performance analysis, and model fine-tuning. We showcase the toolkit in various scenarios and baseline experiments, including, to the best of our knowledge, the most comprehensive experimental comparison of datasets and methods for wildlife re-identification, covering both local descriptors and deep learning approaches. Furthermore, we provide the first-ever foundation model for individual re-identification across a wide range of species, MegaDescriptor, which achieves state-of-the-art performance on animal re-identification datasets and outperforms other pretrained models such as CLIP and DINOv2 by a significant margin. To make the model available to the general public and to allow easy integration with existing wildlife monitoring applications, we provide multiple MegaDescriptor flavors (i.e., Small, Medium, and Large) through the HuggingFace hub.
Animal re-identification is essential for studying many aspects of wildlife, such as population monitoring, movement, behavioral studies, and wildlife management [39, 45, 50]. While the precise definition of and approaches to animal re-identification vary in the literature, the objective remains consistent: to accurately and efficiently recognize individual animals within a species based on their unique characteristics, e.g., markings, patterns, or other distinctive features.
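In practice, re-identification is often cast as a retrieval problem: a query image is encoded into a feature descriptor and matched against a gallery of descriptors of known individuals. A minimal sketch using cosine similarity follows; the function name, identities, and toy descriptors are illustrative, not part of the toolkit's API:

```python
import numpy as np

def reidentify(query: np.ndarray, gallery: np.ndarray, labels: list) -> str:
    """Return the identity of the most similar gallery descriptor (cosine similarity)."""
    q = query / np.linalg.norm(query)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    scores = g @ q  # cosine similarity between the query and every gallery entry
    return labels[int(np.argmax(scores))]

# Toy gallery: three known individuals, each represented by a 4-D descriptor.
gallery = np.array([[1.0, 0.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0, 0.0],
                    [0.0, 0.0, 1.0, 0.0]])
labels = ["zebra_A", "zebra_B", "zebra_C"]

query = np.array([0.9, 0.1, 0.0, 0.0])  # a noisy view of zebra_A
print(reidentify(query, gallery, labels))  # -> zebra_A
```

The descriptor itself may come from a local-feature pipeline or a deep embedding model; the matching step stays the same.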
Automating the identification and tracking of individual animals enables the collection of precise and extensive data on population dynamics, migration patterns, habitat usage, and behavior, allowing researchers to monitor movements, estimate population sizes, and observe demographic shifts. This information contributes to a deeper understanding of species dynamics, the identification of biodiversity threats, and the development of evidence-based conservation strategies.
At the same time, the growing volume of collected data and the labor-intensive nature of its manual processing have highlighted the need for automated methods that reduce human supervision in individual animal identification. As a result, a large number of automatic re-identification datasets and methods have been developed, covering several animal groups such as primates [23, 54], carnivores [18, 31, 48], reptiles [4, 21], whales [1, 2, 13], and mammals [3, 47, 57].
However, there is a lack of standardization in algorithmic procedures, evaluation metrics, and dataset usage across the literature. This hampers the comparability and reproducibility of results and hinders the progress of the field. It is therefore essential to categorize and re-evaluate general re-identification approaches, connect them to real-world scenarios, and provide recommendations for appropriate algorithmic setups in specific contexts. By quantitatively assessing the approaches employed in various studies, we aim to identify trends and provide insights into the most effective techniques for different scenarios. This analysis should aid researchers and practitioners in selecting suitable algorithms for their specific re-identification needs, ultimately advancing the field of animal re-identification and its applications in wildlife conservation and research.
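For concreteness, one metric that differs little across papers yet is often computed under incompatible splits is closed-set top-1 accuracy: each query descriptor is matched to its nearest gallery neighbor, and a hit is counted when the predicted identity equals the ground truth. A hedged sketch below; the function name, split, and toy descriptors are illustrative, not a prescribed protocol:

```python
import numpy as np

def top1_accuracy(queries, query_ids, gallery, gallery_ids):
    """Closed-set top-1 accuracy: the nearest gallery neighbor (cosine) must share the identity."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    nearest = np.argmax(q @ g.T, axis=1)  # index of the best gallery match per query
    hits = [gallery_ids[i] == qid for i, qid in zip(nearest, query_ids)]
    return float(np.mean(hits))

# Illustrative descriptors: two individuals, one gallery and one query image each.
gallery = np.array([[1.0, 0.0], [0.0, 1.0]])
gallery_ids = ["ind_1", "ind_2"]
queries = np.array([[0.9, 0.1], [0.2, 0.8]])
query_ids = ["ind_1", "ind_2"]
print(top1_accuracy(queries, query_ids, gallery, gallery_ids))  # -> 1.0
```

Fixing the query/gallery split and the similarity measure in code like this is precisely what makes results comparable across datasets and methods.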
To address these issues, we have developed an open-source toolkit, WildlifeDatasets, intended primarily for ecologists and computer-vision / machine-learning researchers. In this paper, besides a short description of the main features of our tool, we (i) list all publicly available wildlife re-identification datasets, (ii) perform the largest experimental comparison of datasets and wildlife re-identification methods to date, (iii) describe a foundation model, MegaDescriptor, based on different Swin architectures and trained on a newly compiled dataset, and (iv) provide a variety of pre-trained models on the HuggingFace hub.