GigaEvo Platform
A machine learning experiment management system with a microservices architecture, featuring Kafka-based messaging and three-tier service separation.
🏗️ Architecture Overview
GigaEvo Platform consists of three main components:
🔧 Master API (Port 8000)
- Role: Experiment orchestration and coordination
- Technology: FastAPI, Kafka, PostgreSQL, Redis
- Features:
  - Kafka integration for async messaging
  - Experiment lifecycle management
  - Configuration storage and retrieval
  - uv-based dependency management
🏃 Runner API (Port 8001)
- Role: Task execution with GigaEvolve integration
- Technology: FastAPI, GigaEvolve tools
- Features:
  - Experiment code execution
  - Results visualization
  - Best program extraction
  - Background task processing
🌐 WebUI (Port 7860)
- Role: Gradio-based user interface
- Technology: Gradio, Plotly, Requests
- Features:
  - Interactive experiment creation
  - Real-time progress monitoring
  - Results visualization
  - System status dashboard
🚀 Quick Start
Prerequisites
- Docker & Docker Compose
- Python 3.12+ (for local development)
- uv (recommended) or pip
LLM configuration
GigaEvo Platform reads all LLM settings from a single repo-level file. Create it from the template and fill in your credentials.
Using the Deployment System
GigaEvo Platform uses the deploy.sh script with Docker Compose for service orchestration:
1. Deploy Everything (Recommended)
This deploys the following components, with automated health checks:
- Infrastructure: PostgreSQL, Kafka, Zookeeper, Redis (2 instances), MinIO
- Applications: Master API, Runner API, WebUI
- Networking: Docker network and shared volumes
- Health Monitoring: Automatic service health verification
2. Deploy Development Environment
3. Individual Service Development
4. Service Management
Access Points
- WebUI: http://localhost:7860
- Master API: http://localhost:8000
- Runner API: http://localhost:8001
- MinIO Console: http://localhost:9001 (user: minioadmin, pass: minioadmin)
- Kafka Broker: localhost:9092
- Kafka UI: available in dev mode at http://localhost:9000 (via make dev)
Runner pool size configuration
By default, the platform starts with a single runner instance. To run multiple experiments in parallel, increase the runner pool size:
The system automatically generates a file with N runner services.
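As a rough illustration of what generating N runner services could look like, the sketch below builds a Compose-style mapping with one service per runner. Only the runner-api-N naming pattern and the container port 8001 come from this README; the build context and the host-port scheme (8001, 8002, ...) are assumptions made for the example and may not match the file the platform actually generates.

```python
import json


def build_runner_services(pool_size: int, base_port: int = 8001) -> dict:
    """Return a Compose-style mapping with one service per runner.

    Hypothetical layout: the build context and the host-port scheme
    (base_port + offset) are illustrative assumptions, not the
    project's real generated file.
    """
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    services = {}
    for index in range(1, pool_size + 1):
        services[f"runner-api-{index}"] = {
            "build": {"context": "./runner"},  # assumed path
            "ports": [f"{base_port + index - 1}:8001"],  # host:container
        }
    return {"services": services}


if __name__ == "__main__":
    # Print the mapping for a pool of three runners.
    print(json.dumps(build_runner_services(3), indent=2))
```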
Runner pool instance controls (WebUI)
The WebUI “Runner Instances” tab calls the Master API to start, stop, or restart runners and to fetch container logs.
With a Compose-managed runner pool (the default), the Master API controls the already-created containers via Docker.
Requirements:
- master-api has Docker access: mount /var/run/docker.sock (and ensure the container user can read/write it; otherwise run master-api as root or align the socket group).
- Runner containers are started by Docker Compose (Master finds them via com.docker.compose.service=runner-api-N labels; set COMPOSE_PROJECT_NAME if you have multiple stacks with the same service names).
Security note: mounting the Docker socket grants the container root-equivalent control over the host Docker engine.
Quick manual checks (requires the stack to be running):
📚 API Endpoints
Master API (as per docs/api_endpoints.md)
- POST /api/v1/experiments/ - Initialize experiment
- GET /api/v1/experiments/ - Get list of experiments
- GET /api/v1/experiments/{experiment_id}/status - Request status
- POST /api/v1/experiments/{experiment_id}/start - Start experiment
- POST /api/v1/experiments/{experiment_id}/stop - Stop experiment
- GET /api/v1/experiments/{experiment_id}/results - Get results
Runner API (as per docs/api_endpoints.md)
- POST /api/v1/experiments/{experiment_id}/upload - Load experiment code
- POST /api/v1/experiments/{experiment_id}/start - Start experiment
- POST /api/v1/experiments/{experiment_id}/stop - Stop experiment
- GET /api/v1/experiments/{experiment_id}/status - Get execution status
- GET /api/v1/experiments/{experiment_id}/visualization - Get visualization
- GET /api/v1/experiments/{experiment_id}/best-program - Get best program
- GET /api/v1/experiments/{experiment_id}/logs - Get logs (optional)
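A typical client flow against the Runner API is to poll the status endpoint until the run finishes and only then request the best program. The helper below sketches that loop. The "status" key and the set of terminal values are assumptions, because the response schema is not given here; the status call itself is injected as a function so the loop does not depend on any particular HTTP library.

```python
import time
from typing import Callable

# Assumed terminal values; adjust to the real status payload.
TERMINAL_STATES = {"completed", "failed", "stopped"}


def wait_for_experiment(
    fetch_status: Callable[[], dict],
    poll_seconds: float = 5.0,
    max_polls: int = 120,
) -> dict:
    """Poll until the status payload reports a terminal state.

    fetch_status is expected to call
    GET /api/v1/experiments/{experiment_id}/status on the Runner API
    and return the decoded JSON body.
    """
    for _ in range(max_polls):
        payload = fetch_status()
        if payload.get("status") in TERMINAL_STATES:
            return payload
        time.sleep(poll_seconds)
    raise TimeoutError("experiment did not reach a terminal state")
```

Once the helper returns, a follow-up GET to the best-program or visualization endpoint can retrieve the results.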
🔄 Kafka Topics
The system uses these Kafka topics for coordination:
- experiment-config - Experiment configuration received
- experiment-prepared - Experiment prepared for execution
- experiment-started - Experiment execution started
- experiment-stopped - Experiment execution stopped
- runner-status - Runner status updates
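When writing a producer or consumer for these topics, it can help to keep the topic names in one place and agree on a serialization format. The sketch below does both with the standard library. The topic names are taken from the list above; the JSON envelope with an experiment_id field is a hypothetical message shape, and no Kafka client library is used because the one the platform relies on is not named here.

```python
import json
from enum import Enum


class Topic(str, Enum):
    """Kafka topic names used for experiment coordination."""

    EXPERIMENT_CONFIG = "experiment-config"
    EXPERIMENT_PREPARED = "experiment-prepared"
    EXPERIMENT_STARTED = "experiment-started"
    EXPERIMENT_STOPPED = "experiment-stopped"
    RUNNER_STATUS = "runner-status"


def encode_event(experiment_id: str, **fields) -> bytes:
    """Serialize an event as UTF-8 JSON (hypothetical envelope)."""
    return json.dumps({"experiment_id": experiment_id, **fields}).encode("utf-8")


def decode_event(raw: bytes) -> dict:
    """Decode a message value produced by encode_event."""
    return json.loads(raw.decode("utf-8"))
```

With a Kafka client of your choice, Topic.EXPERIMENT_STARTED.value would be passed as the topic and the encoded bytes as the message value, using localhost:9092 as the bootstrap server.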
🛠️ Development
Local Development Setup
Container-Based Development
Code Quality
Database Management
🐛 Troubleshooting
Common Issues
- Port Conflicts: Ensure these ports are free:
  - 5432: PostgreSQL
  - 6379, 6380: Redis (2 instances)
  - 7860: WebUI
  - 8000: Master API
  - 8001: Runner API
  - 9000, 9001: MinIO
  - 9092, 29092, 29093: Kafka
  - 2181: Zookeeper
- Deployment Issues:
- Service Health Check Failures:
- Database Connection Issues:
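For the Port Conflicts item, the ports can be checked before deployment with a short script. This sketch tries a TCP connection to each listed port on 127.0.0.1 and reports those that already have a listener. It assumes conflicts come from services listening on localhost; a port bound only to another interface would not be detected.

```python
import socket

# Ports listed under Port Conflicts, mapped to the service that needs them.
REQUIRED_PORTS = {
    5432: "PostgreSQL",
    6379: "Redis", 6380: "Redis",
    7860: "WebUI",
    8000: "Master API",
    8001: "Runner API",
    9000: "MinIO", 9001: "MinIO",
    9092: "Kafka", 29092: "Kafka", 29093: "Kafka",
    2181: "Zookeeper",
}


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        return sock.connect_ex((host, port)) == 0


def busy_ports() -> dict:
    """Return the required ports that already have a listener."""
    return {p: name for p, name in REQUIRED_PORTS.items() if port_in_use(p)}
```

Calling busy_ports() before deploying should return an empty mapping; after a successful deployment the same call doubles as a rough check that the services are listening.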
Environment Variables
Key environment variables for Master API:
- DATABASE__URL - PostgreSQL connection string
- KAFKA__BOOTSTRAP_SERVERS - Kafka bootstrap servers
- REDIS_URL - Redis connection URL
- STORAGE__ENDPOINT_URL - MinIO endpoint
- STORAGE__ACCESS_KEY - MinIO access key
- STORAGE__SECRET_KEY - MinIO secret key
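A pre-flight check that all of these variables are set can catch misconfiguration before the Master API starts. The sketch below is a standalone helper written for illustration; the MasterSettings class and load_settings function are hypothetical names and are not the loader the Master API itself uses.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class MasterSettings:
    database_url: str
    kafka_bootstrap_servers: str
    redis_url: str
    storage_endpoint_url: str
    storage_access_key: str
    storage_secret_key: str


# Field name -> environment variable, as listed above.
ENV_NAMES = {
    "database_url": "DATABASE__URL",
    "kafka_bootstrap_servers": "KAFKA__BOOTSTRAP_SERVERS",
    "redis_url": "REDIS_URL",
    "storage_endpoint_url": "STORAGE__ENDPOINT_URL",
    "storage_access_key": "STORAGE__ACCESS_KEY",
    "storage_secret_key": "STORAGE__SECRET_KEY",
}


def load_settings(env: Mapping[str, str] = os.environ) -> MasterSettings:
    """Read the variables, failing early with the names of any that are unset."""
    missing = [name for name in ENV_NAMES.values() if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return MasterSettings(**{field: env[name] for field, name in ENV_NAMES.items()})
```

Passing a plain dictionary instead of os.environ makes the check easy to try without touching the real environment.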
📊 Architecture Details
Current Kafka-Based Architecture
The platform uses a microservices architecture with:
- Kafka Message Broker - Asynchronous service communication with topics for experiment coordination
- Separate Docker Compositions - Modular deployment with infrastructure and application services
- Health Monitoring - Automated service health checks and recovery
- Resource Isolation - Dedicated Redis instances and MinIO storage
- uv Dependency Management - Fast package installation and dependency caching
Service Orchestration
- deploy.sh: Main deployment script with health checks and service management
- docker-compose.kafka.yml: Core infrastructure services
- docker-compose.*.yml: Individual application service configurations
- Makefile: Development commands and shortcuts
🤝 Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Run tests and linting: make test && make lint
- Submit a pull request
📄 License
MIT License - see LICENSE file for details.