Introducing LLMOps Eval — Open Source LLM Evaluation Platform

Creator of LLMOps Eval

We're excited to announce the open source release of LLMOps Eval — a production-grade LLM/RAG evaluation platform.

Why We Built This

Every team building LLM applications faces the same challenge: how do you know if your model is good enough?

Most teams end up spending weeks building custom evaluation frameworks — reinventing the same infrastructure over and over. LLMOps Eval solves this with a no-code platform that lets you evaluate LLM and RAG applications through a simple UI.

What's Included

  • 20+ built-in metrics — BLEU, ROUGE, BERTScore, Faithfulness, Context Relevance, LLM-as-Judge
  • Multi-provider support — OpenAI, Anthropic, Azure OpenAI, AWS Bedrock, Google Vertex AI
  • RAG evaluation — Purpose-built metrics powered by RAGAS
  • CI/CD integration — Trigger evaluations from GitHub Actions or GitLab CI
  • Multi-tenant — Organizations, teams, and role-based access control
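To give a feel for what the built-in metrics compute, here is a minimal sketch of a ROUGE-1-style unigram recall score — the simplest member of the n-gram overlap family listed above. This is an illustrative implementation, not LLMOps Eval's actual API; the function name is hypothetical.

```python
from collections import Counter

def rouge1_recall(reference: str, candidate: str) -> float:
    """Fraction of reference unigrams (with multiplicity) found in the candidate.

    Illustrative sketch only -- real ROUGE implementations also handle
    stemming, multiple references, and precision/F1 variants.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: each reference token counts at most as often
    # as it appears in the candidate.
    overlap = sum(min(n, cand_counts[tok]) for tok, n in ref_counts.items())
    total = sum(ref_counts.values())
    return overlap / total if total else 0.0

print(rouge1_recall("the cat sat on the mat", "the cat lay on the mat"))
```

Platform metrics like Faithfulness and LLM-as-Judge go further by calling an LLM to grade outputs, but the surface-overlap metrics boil down to comparisons like this one.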

Get Started

git clone https://github.com/ashwithpoojary98/llmops-eval.git
cd llmops-eval
docker-compose up -d

Visit http://localhost:3000 and run your first evaluation.

Open Source

LLMOps Eval is licensed under Apache 2.0 — free to use, modify, and distribute.

Star us on GitHub and join the community!