AnimalClue

🔔News

🔥[2024-08-25] We release the demo on Hugging Face 🤗
🔥[2024-07-29] Introducing AnimalClue on arXiv! 🚀

Introduction

Wildlife observation plays an important role in biodiversity conservation, necessitating robust methodologies for monitoring wildlife populations and interspecies interactions. Recent advances in computer vision have significantly contributed to automating fundamental wildlife observation tasks, such as animal detection and species identification. However, accurately identifying species from indirect evidence like footprints and feces remains relatively underexplored, despite its importance in contributing to wildlife monitoring. To bridge this gap, we introduce AnimalClue, the first large-scale dataset for species identification from images of indirect evidence. Our dataset consists of 159,605 bounding boxes encompassing five categories of indirect clues: footprints, feces, eggs, bones, and feathers. It covers 968 species, 200 families, and 65 orders. Each image is annotated with species-level labels, bounding boxes or segmentation masks, and fine-grained trait information, including activity patterns and habitat preferences. Unlike existing datasets primarily focused on direct visual features (e.g., animal appearances), AnimalClue presents unique challenges for classification, detection, and instance segmentation tasks due to the need for recognizing more detailed and subtle visual features. In our experiments, we extensively evaluate representative vision models and identify key challenges in animal identification from their traces.

Video

Overview

Our dataset consists of 968 species, 200 families, and 65 orders. It includes a total of 159,605 bounding boxes across five trace types:

Footprints: 18,291 boxes from 7,581 images, covering 117 species, 46 families, and 20 orders
Feces: 18,932 boxes from 6,433 images, covering 101 species, 46 families, and 21 orders
Bones: 16,553 boxes from 12,908 images, covering 269 species, 112 families, and 45 orders
Eggs: 29,434 boxes from 9,394 images, covering 283 species, 67 families, and 20 orders
Feathers: 76,395 boxes from 60,491 images, covering 555 species, 89 families, and 30 orders

The bounding-box counts correspond to the number of samples in the classification dataset, while the image counts correspond to the detection and segmentation datasets.
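As a quick consistency check, the per-trace counts listed above can be summed to confirm the reported total of 159,605 bounding boxes. This is a minimal sketch: the dictionary name `TRACE_COUNTS` is illustrative and not part of any released AnimalClue tooling, and the boxes-per-image ratio is derived here, not reported by the authors.

```python
# Per-trace-type counts copied from the overview list above.
# Each entry is (bounding boxes, images). The variable name is hypothetical.
TRACE_COUNTS = {
    "footprints": (18_291, 7_581),
    "feces": (18_932, 6_433),
    "bones": (16_553, 12_908),
    "eggs": (29_434, 9_394),
    "feathers": (76_395, 60_491),
}

total_boxes = sum(boxes for boxes, _ in TRACE_COUNTS.values())
total_images = sum(images for _, images in TRACE_COUNTS.values())

print(f"total boxes:  {total_boxes:,}")
print(f"total images: {total_images:,}")

# Derived statistic (not reported on the page): average boxes per image.
for name, (boxes, images) in TRACE_COUNTS.items():
    print(f"{name:<10} {boxes / images:.2f} boxes per image")
```

The box total matches the figure quoted in the overview. The per-type species counts are not summed here, since one species can appear under several trace types.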

Comparisons with Existing Benchmarks

While direct animal identification has been extensively studied, indirect identification, such as recognizing animals from the traces they leave behind, remains largely unexplored. AnimalClue comprises five trace types and 968 species, with a total of 159,605 bounding boxes. It supports classification, detection, and instance segmentation, and includes fine-grained trait annotations.

Comparison with previous animal tracking datasets. CLS, DET, and SEG indicate classification, detection, and instance segmentation, respectively. AnimalClue covers a diverse range of species and contains a larger number of bounding boxes.

Statistics


Classification

Classification accuracy for all, frequent, and rare categories of animal species. Across species-, family-, and order-level categorization, the Swin-B model tends to achieve higher accuracy on AnimalClue.

Visualization of t-SNE

Visualization of t-SNE. Using a labeled dataset specialized for indirect animal clues improves the separability among categories: when visualized in the feature space, the categories are more distinct.

Detection

Detection results on footprints, feces, eggs, bones, and feathers datasets (mAP@50-95).
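The page does not describe its evaluation code, so the following is only a sketch of the common COCO-style reading of mAP@50-95: average precision is computed at ten IoU thresholds from 0.50 to 0.95 in steps of 0.05 and then averaged. The function names and the example boxes are illustrative assumptions, not part of the AnimalClue release.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Ten IoU thresholds: 0.50, 0.55, ..., 0.95 (assumed COCO-style convention).
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred_box, gt_box):
    """Thresholds at which a prediction would count as a match for gt_box."""
    iou = box_iou(pred_box, gt_box)
    return [t for t in IOU_THRESHOLDS if iou >= t]


def mean_over_thresholds(ap_per_threshold):
    """Average the AP values obtained at each IoU threshold."""
    return sum(ap_per_threshold) / len(ap_per_threshold)


# Hypothetical predicted and ground-truth boxes, offset by 10 pixels.
pred = (10, 10, 110, 110)
gt = (20, 20, 120, 120)
print(box_iou(pred, gt))
print(thresholds_passed(pred, gt))
```

A prediction that overlaps the ground truth only loosely passes the lower thresholds but fails the stricter ones, which is why this metric rewards tight localization of small traces such as footprints.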

Segmentation

Segmentation results on feces, eggs, bones, and feathers datasets (Mask mAP@50-95).

Traits Classification

Traits classification results on footprints, feces, eggs, bones, and feathers datasets (Acc/F1 score).
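For readers unfamiliar with the two reported metrics, here is a minimal sketch of accuracy and F1 on a toy trait-prediction task. The page does not say which F1 averaging is used, so macro averaging is an assumption here, and the label values `"nocturnal"` and `"diurnal"` are hypothetical stand-ins for an activity-pattern trait.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is assumed)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical activity-pattern labels for five trace images.
y_true = ["nocturnal", "diurnal", "diurnal", "nocturnal", "diurnal"]
y_pred = ["nocturnal", "diurnal", "nocturnal", "nocturnal", "diurnal"]

print(f"Acc: {accuracy(y_true, y_pred):.2f}")
print(f"F1:  {macro_f1(y_true, y_pred):.2f}")
```

Reporting F1 alongside accuracy is useful when trait classes are imbalanced, since accuracy alone can look high for a model that mostly predicts the majority class.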

Visualization of detection results

Visualization of YOLOv11 detection results. Green bounding boxes denote correct detections, and red bounding boxes denote incorrect detections.

Visualization of segmentation results

Visualization of YOLOv11 instance segmentation results. Green masks denote correct predictions, and red masks denote incorrect predictions.

BibTeX


@article{shinoda2025animalcluerecognizinganimalstraces,
  title={AnimalClue: Recognizing Animals by their Traces},
  author={Risa Shinoda and Nakamasa Inoue and Iro Laina and Christian Rupprecht and Hirokatsu Kataoka},
  year={2025},
  eprint={2507.20240},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2507.20240},
}

AnimalClue: Recognizing Animals by their Traces

ICCV 2025 Highlight
