Test your prompts, models, and RAGs. Catch regressions and improve prompt quality. LLM evals for OpenAI, Azure, Anthropic, Gemini, Mistral, Llama, Bedrock, Ollama, and other local & private models with CI/CD integration.
llm, rag, llmops, prompt-engineering, testing, prompts, evaluation-framework, evaluation, llm-eval, cicd, ci-cd, ci, llm-evaluation, llm-evaluation-framework, prompt-testing
- TypeScript
Updated 6 months ago
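The first entry describes a declarative LLM eval tool with CI/CD integration. A minimal sketch of what such an eval config might look like — the keys (`prompts`, `providers`, `tests`), the provider id, and the assertion type are assumptions to verify against the project's own documentation:

```yaml
# Hypothetical eval config (e.g. promptfooconfig.yaml); check key names
# against the tool's docs before use.
prompts:
  - "Reply to the customer politely: {{message}}"
providers:
  - openai:gpt-4o-mini
tests:
  - vars:
      message: "My order arrived broken."
    assert:
      - type: contains
        value: "sorry"
```

Run in CI, a config like this re-evaluates every prompt on each commit, so regressions in prompt quality surface as failing checks.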
An Open-Source Framework for Prompt-Learning.
ai, nlp, pytorch, natural-language-processing, deep-learning, transformer, nlp-library, natural-language-understanding, prompts, nlp-machine-learning, pre-trained-language-models, pre-trained-model, prompt, prompt-based-tuning, prompt-learning, prompt-toolkit
- Python
Updated 8 months ago
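To make "prompt-learning" concrete: the technique wraps an input in a template containing a mask slot, then maps candidate "verbalizer" words at the mask position to class labels. The following is a self-contained Python illustration of that idea, not the framework's actual API — `TEMPLATE`, `VERBALIZER`, `wrap`, and `classify` are invented names:

```python
# Sketch of prompt-based classification: template + mask + verbalizer.
TEMPLATE = "{text} Overall, it was {mask}."

# Hypothetical verbalizer: label word at the mask position -> class name.
VERBALIZER = {"great": "positive", "terrible": "negative"}

def wrap(text: str) -> str:
    """Render the prompt the language model would see, mask left in place."""
    return TEMPLATE.format(text=text, mask="[MASK]")

def classify(mask_scores: dict) -> str:
    """Pick the label whose verbalizer word the (hypothetical) LM scored highest."""
    best_word = max(VERBALIZER, key=lambda w: mask_scores.get(w, 0.0))
    return VERBALIZER[best_word]

prompt = wrap("The film was a delight.")
# Scores below stand in for a masked-LM's token probabilities at [MASK].
label = classify({"great": 0.91, "terrible": 0.02})
```

A real prompt-learning framework automates the same three pieces — templating, mask filling by a pretrained model, and verbalizer mapping — across model backends.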
The production toolkit for LLMs. Observability, prompt management, and evaluations.
- TypeScript
Updated 7 months ago
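The last entry centers on LLM observability: recording each model call's prompt, response, and latency as a trace. A minimal in-memory sketch of that pattern — this is a hypothetical illustration, not the toolkit's SDK; `traces`, `traced`, and `fake_llm` are invented names:

```python
import time

# Hypothetical in-memory trace store: one record per model call.
traces = []

def traced(llm_call):
    """Decorator that logs prompt, response, and latency for each call."""
    def wrapper(prompt: str) -> str:
        start = time.perf_counter()
        response = llm_call(prompt)
        traces.append({
            "prompt": prompt,
            "response": response,
            "latency_s": time.perf_counter() - start,
        })
        return response
    return wrapper

@traced
def fake_llm(prompt: str) -> str:
    # Stand-in for a real model call.
    return prompt.upper()

out = fake_llm("hello")
```

A production toolkit ships the same wrapper idea with persistent storage, a UI over the traces, and hooks for scoring recorded outputs in evaluations.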