Introducing LLMOps Eval — Open Source LLM Evaluation Platform
We're excited to announce the open source release of LLMOps Eval — a production-grade LLM/RAG evaluation platform.
Why We Built This
Every team building LLM applications faces the same challenge: how do you know if your model is good enough?
Most end up spending weeks building custom evaluation frameworks — reinventing the same infrastructure over and over. LLMOps Eval solves this with a no-code platform that lets you evaluate LLM and RAG applications through a simple UI.
What's Included
- 20+ built-in metrics — BLEU, ROUGE, BERTScore, Faithfulness, Context Relevance, LLM-as-Judge
- Multi-provider support — OpenAI, Anthropic, Azure OpenAI, AWS Bedrock, Google Vertex AI
- RAG evaluation — Purpose-built metrics powered by RAGAS
- CI/CD integration — Trigger evaluations from GitHub Actions or GitLab CI
- Multi-tenant — Organizations, teams, and role-based access control
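To make the metric names above concrete, here is a minimal sketch of the idea behind ROUGE-1 — unigram-overlap F1 between a model output and a reference answer. This is an illustration of what such a metric measures, not LLMOps Eval's actual implementation (real ROUGE also handles stemming, n-grams, and longest-common-subsequence variants):

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between candidate and reference (ROUGE-1 style)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Overlap counts each shared word at most min(candidate, reference) times.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

score = rouge1_f1("the cat sat on the mat", "the cat lay on the mat")
```

Surface-overlap metrics like this are cheap and deterministic, which is why platforms pair them with semantic metrics (BERTScore) and LLM-as-Judge for cases where wording differs but meaning matches.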
Get Started
git clone https://github.com/ashwithpoojary98/llmops-eval.git
cd llmops-eval
docker-compose up -d
Visit http://localhost:3000 and run your first evaluation.
Open Source
LLMOps Eval is licensed under Apache 2.0 — free to use, modify, and distribute.
⭐ Star us on GitHub and join the community!