CulicidaeLab 🦟
A configuration-driven Python library for advanced mosquito analysis, featuring pre-trained models for detection, segmentation, and species classification.
The culicidaelab Python library provides a robust, extensible framework that streamlines the mosquito image analysis pipeline. Built on a configuration system, it lets researchers and developers manage datasets, experiment with models, and process images for classification, detection, and segmentation tasks. The culicidaelab library is part of the CulicidaeLab Ecosystem.
CulicidaeLab Ecosystem Architecture
An open-source system for mosquito research and analysis that includes the following components:
- Data:
  - Base diversity dataset (46 species, 3139 images) under CC-BY-SA-4.0 license.
  - Specialized derivatives: classification, detection, and segmentation datasets under CC-BY-SA-4.0 licenses.
- Models:
  - Top-1 models (see reports), used as default by the culicidaelab library: classification (Apache 2.0), detection (AGPL-3.0), segmentation (Apache 2.0)
  - Top-5 classification models collection with accuracy >90% for 17 mosquito species.
- Protocols: All training parameters and metrics available at:
- Applications:
  - Python library (AGPL-3.0) providing core ML functionality
  - Web server (AGPL-3.0) hosting API services
  - Mobile app (AGPL-3.0) for field use with optimized models
These components form a cohesive ecosystem: datasets are used to train the models that power the applications, the Python library provides core functionality to the web server, and the server exposes services consumed by the mobile application. All components are openly licensed, promoting transparency and collaboration.
This integrated approach enables comprehensive mosquito research, from data collection to analysis and visualization, supporting both scientific research and public health initiatives.
Key Features of the culicidaelab library
- Configuration-Driven Workflow: Manage all settings—from file paths to model parameters—through simple YAML files. Override defaults easily for custom experiments.
- Ready-to-Use Models: Leverage pre-trained models for:
  - Species Classification: Identify mosquito species using a high-accuracy classifier.
  - Mosquito Detection: Localize mosquitoes in images with a YOLO-based detector.
  - Instance Segmentation: Generate precise pixel-level masks with a SAM-based segmenter.
- Unified API: All predictors share a consistent interface (.predict(), .visualize(), .evaluate()) for a predictable user experience.
- Automatic Resource Management: The library manages local storage, automatically downloading and caching model weights and datasets on first use.
- Extensible Provider System: Seamlessly connect to data sources. A HuggingFaceProvider is built-in, with an easy-to-implement interface for adding more providers.
- Powerful Visualization: Instantly visualize model outputs with built-in, configurable methods for drawing bounding boxes, classification labels, and segmentation masks.
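To make the shared interface concrete, here is a minimal sketch of code written against the three methods named above. Only the method names predict, visualize, and evaluate come from this README. The Predictor protocol, the DummyClassifier stand-in, the run_pipeline helper, and all signatures are assumptions for illustration and are not the library's actual API.

```python
from __future__ import annotations

from typing import Any, Protocol


class Predictor(Protocol):
    """Hypothetical interface; signatures are assumed, only the names are from the README."""

    def predict(self, image: Any) -> Any: ...

    def visualize(self, image: Any, prediction: Any) -> Any: ...

    def evaluate(self, ground_truth: Any, prediction: Any) -> dict[str, float]: ...


class DummyClassifier:
    """Stand-in predictor that always returns one fixed label; not a real model."""

    def __init__(self, label: str) -> None:
        self.label = label

    def predict(self, image: Any) -> list[tuple[str, float]]:
        # A real classifier would run inference here.
        return [(self.label, 1.0)]

    def visualize(self, image: Any, prediction: list[tuple[str, float]]) -> str:
        # A real implementation would draw on the image; this returns a caption.
        top_label, confidence = prediction[0]
        return f"{top_label} ({confidence:.0%})"

    def evaluate(self, ground_truth: str, prediction: list[tuple[str, float]]) -> dict[str, float]:
        return {"accuracy": float(prediction[0][0] == ground_truth)}


def run_pipeline(predictor: Predictor, image: Any, ground_truth: Any) -> tuple[Any, dict[str, float]]:
    """Works with any object exposing the three methods, whatever the task."""
    prediction = predictor.predict(image)
    rendered = predictor.visualize(image, prediction)
    metrics = predictor.evaluate(ground_truth, prediction)
    return rendered, metrics
```

Because run_pipeline depends only on the method names, a detector or segmenter with the same three methods could be passed in without changing the calling code.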
Practical Applications
culicidaelab is more than just a set of models; it is an engine for building real-world solutions. Here are some of the ways it can be applied:
- Automation in Scientific Laboratories:
  - Bulk Data Processing: Automatically analyze thousands of images from camera traps or microscopes to assess mosquito populations without manual intervention.
  - Reproducible Research: Standardize the data analysis process, allowing other scientists to easily reproduce and verify research results published using the library.
- Integration into Governmental and Commercial Systems:
  - Epidemiological Surveillance: Use the library as the core "engine" for national or regional monitoring systems to track vector-borne disease risks.
  - Custom Solution Development: Rapidly prototype and create specialized software products for pest control services, agro-industrial companies, or environmental organizations.
- Advanced Analytics and Data Science:
  - Geospatial Analysis: Write scripts to build disease vector distribution maps by processing geotagged images.
  - Predictive Modeling: Use the library's outputs as features for larger models that forecast disease outbreaks based on vector presence and density.
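As a rough illustration of the geospatial use case, the sketch below bins geotagged species predictions into a latitude/longitude grid and counts species per cell. The record layout (lat, lon, species, confidence), the function names, and the default cell size and confidence threshold are all assumptions; in practice the species and confidence values would come from a classifier's output.

```python
import math
from collections import Counter, defaultdict


def grid_cell(lat: float, lon: float, cell_deg: float = 0.5) -> tuple[int, int]:
    """Map a coordinate to the integer index of its grid cell."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def species_counts_by_cell(records, cell_deg: float = 0.5, min_conf: float = 0.5):
    """Count predicted species per grid cell, skipping low-confidence predictions."""
    counts = defaultdict(Counter)
    for record in records:
        if record["confidence"] < min_conf:
            continue
        cell = grid_cell(record["lat"], record["lon"], cell_deg)
        counts[cell][record["species"]] += 1
    return {cell: dict(per_species) for cell, per_species in counts.items()}


# Made-up records standing in for classifier output on geotagged images.
records = [
    {"lat": 55.75, "lon": 37.62, "species": "aedes_aegypti", "confidence": 0.91},
    {"lat": 55.80, "lon": 37.70, "species": "aedes_aegypti", "confidence": 0.84},
    {"lat": 59.93, "lon": 30.33, "species": "culex_pipiens", "confidence": 0.77},
    {"lat": 59.94, "lon": 30.31, "species": "culex_pipiens", "confidence": 0.20},
]
density = species_counts_by_cell(records)
```

The resulting per-cell counts can be plotted as a distribution map or used as input features for a downstream forecasting model.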
Requirements
Hardware Requirements
Processor (CPU): Any modern x86-64 CPU.
Memory (RAM): Minimum 2 GB. 8 GB or more is recommended for processing large datasets or using more complex models.
Graphics Card (GPU): An NVIDIA GPU with CUDA support is highly recommended for deep learning workloads. It gives a significant speedup for detection and segmentation but is not essential for classification (see performance logs and notebook). For the SAM model, a GPU is virtually essential for acceptable performance. Minimum video memory is 2 GB; 4 GB or more is recommended. The serve-gpu installation profile requires CUDA 12.X; for CUDA 11.X, use the installation instructions below.
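One way to check a machine against the GPU guidance above is a small probe like the following. It is a sketch that assumes PyTorch is the framework in use and falls back to CPU when PyTorch is missing. The function name and the returned dictionary layout are hypothetical; the 2 GB and 4 GB thresholds are the ones stated above.

```python
MIN_VRAM_GB = 2.0          # minimum video memory stated above
RECOMMENDED_VRAM_GB = 4.0  # recommended video memory stated above


def describe_accelerator() -> dict:
    """Report whether a CUDA GPU is usable and how its memory compares to the guidance."""
    try:
        import torch
    except ImportError:
        return {"device": "cpu", "note": "PyTorch is not installed"}

    if not torch.cuda.is_available():
        return {"device": "cpu", "note": "no CUDA device is visible"}

    # total_memory is reported in bytes for the first visible GPU.
    vram_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
    if vram_gb < MIN_VRAM_GB:
        note = "below the stated minimum video memory"
    elif vram_gb < RECOMMENDED_VRAM_GB:
        note = "meets the minimum, below the recommended video memory"
    else:
        note = "meets the recommended video memory"
    return {"device": "cuda", "vram_gb": round(vram_gb, 1), "note": note}


info = describe_accelerator()
print(info)
```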
Hard Drive: At least 10 GB of free space to install the library, dependencies, download pre-trained models, and store processed data.
Software Requirements
Operating Systems (tested):
- Windows 10/11
- Linux 22.04+
Software:
- On Linux, the libgl1 package must be installed
- Git
- Python 3.11
- uv 0.8.13
Python packages:
- PyTorch 2.3.1+
- FastAI 2.7.0 - 2.8.0
- Ultralytics 8.3.0+
- HuggingFace Hub 0.16.0+
- Datasets 4.0.0
- Pillow 9.4.0
- Pydantic 2.0.0+
For the full list of dependencies, see the pyproject.toml file.
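To compare an existing environment with the versions listed above, a short standard-library script can report what is installed. This is a sketch: the distribution names below are the usual PyPI names for the listed packages, and the helper names are hypothetical. It only reports versions and does not enforce the ranges.

```python
import sys
from importlib.metadata import PackageNotFoundError, version

# PyPI distribution names for the packages listed above.
PACKAGES = (
    "torch",
    "fastai",
    "ultralytics",
    "huggingface-hub",
    "datasets",
    "pillow",
    "pydantic",
)


def python_matches(required=(3, 11)) -> bool:
    """True when the running interpreter is the Python version listed above."""
    return sys.version_info[:2] == tuple(required)


def installed_versions(packages=PACKAGES) -> dict:
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    print("Python 3.11:", python_matches())
    for name, installed in installed_versions().items():
        print(f"{name}: {installed or 'not installed'}")
```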
Installation
Basic Installation
For most users, the default installation provides full inference capabilities:
This includes CPU-based inference using ONNX Runtime, which is enough for fast, lightweight mosquito classification. It also covers all core functionality that does not need heavy ML frameworks: configuration management, resource handling, and model downloading.
Installation Profiles
Choose an installation profile based on your use case:
For Production/Serving (Lightweight Inference)
GPU-accelerated inference:
Lightweight serve alias (equivalent to default):
For Research/Development
CPU-based development (includes PyTorch, FastAI, Ultralytics, and ONNX):
GPU-accelerated development (includes PyTorch GPU, FastAI, Ultralytics, and ONNX GPU):
Additional Options
Run example notebooks locally:
Build documentation locally:
Run tests:
Development Setup
To set up a development environment with all tools:
- Clone the repository:
- Install dependencies with uv (recommended):
Or with :
- Set up pre-commit hooks:
This will run linters and formatters automatically on each commit to ensure code quality and consistency.
Quick Start
Here's how to classify the species of a mosquito in just a few lines of code. The library will automatically download the necessary model on the first run.
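The download-on-first-run behavior mentioned above follows a familiar fetch-or-reuse caching pattern. The sketch below illustrates that pattern with the standard library only. It is not the library's implementation, and ensure_cached, the fetch callable, and the file name are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Callable


def ensure_cached(cache_dir: Path, filename: str, fetch: Callable[[], bytes]) -> Path:
    """Return the cached file path, calling fetch() only when the file is missing."""
    target = cache_dir / filename
    if not target.exists():
        cache_dir.mkdir(parents=True, exist_ok=True)
        # Write to a temporary name first so an interrupted download
        # never leaves a truncated file under the final name.
        partial = target.with_suffix(target.suffix + ".part")
        partial.write_bytes(fetch())
        partial.replace(target)
    return target


fetch_calls = []


def fake_fetch() -> bytes:
    """Stand-in for a real download; records each call."""
    fetch_calls.append(1)
    return b"model-weights"


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "models"
    first = ensure_cached(cache, "classifier.bin", fake_fetch)   # triggers the fetch
    second = ensure_cached(cache, "classifier.bin", fake_fetch)  # served from the cache
    same_path = first == second
    cached_bytes = second.read_bytes()
```

The second call returns without invoking fake_fetch again, which is why only the first run of a model needs network access under this pattern.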
Documentation
For complete guides, tutorials, and the full API reference, visit the documentation site.
The documentation includes:
- In-depth installation and configuration guides.
- Detailed tutorials for each predictor.
- Architectural deep-dives for contributors.
- A full, auto-generated API reference.
Contributing
Contributions are what make the open-source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
Please see our Contributing Guide for details on our code of conduct, development setup, and the pull request process.
Acknowledgments
CulicidaeLab development is supported by a grant from the Foundation for Assistance to Small Innovative Enterprises (FASIE).
License
This project is distributed under the AGPL-3.0 License. See the LICENSE file for more information.
Citation
If you use culicidaelab in your research, please cite it as follows:
Contact
- Issues: Please use the GitHub issue tracker.
- Email: iloncka.ds@gmail.com