mirror of
https://gitee.com/infiniflow/ragflow.git
synced 2025-12-06 15:29:03 +08:00
### What problem does this PR solve?

Feature: This PR implements a comprehensive RAG evaluation framework to address issue #11656.

**Problem**: Developers using RAGFlow lack systematic ways to measure RAG accuracy and quality. They cannot objectively answer:

1. Are RAG results truly accurate?
2. How should configurations be adjusted to improve quality?
3. How to maintain and improve RAG performance over time?

**Solution**: This PR adds a complete evaluation system with:

- **Dataset & test case management** - Create ground truth datasets with questions and expected answers
- **Automated evaluation** - Run RAG pipeline on test cases and compute metrics
- **Comprehensive metrics** - Precision, recall, F1 score, MRR, hit rate for retrieval quality
- **Smart recommendations** - Analyze results and suggest specific configuration improvements (e.g., "increase top_k", "enable reranking")
- **20+ REST API endpoints** - Full CRUD operations for datasets, test cases, and evaluation runs

**Impact**: Enables developers to objectively measure RAG quality, identify issues, and systematically improve their RAG systems through data-driven configuration tuning.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
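
For reviewers unfamiliar with the retrieval metrics named above (precision, recall, F1 score, MRR, hit rate), the following is a minimal sketch of their textbook definitions. It is not the code added by this PR: the function names `retrieval_metrics` and `aggregate`, the use of chunk IDs as the unit of relevance, and the dictionary keys are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def retrieval_metrics(retrieved: List[str], relevant: Set[str]) -> Dict[str, float]:
    """Score one test case (hypothetical helper, not the PR's API).

    retrieved: ranked chunk IDs returned by the retriever, best first.
    relevant:  ground-truth chunk IDs for the question.
    """
    # Drop duplicate IDs while preserving rank order.
    ranked = list(dict.fromkeys(retrieved))
    hits = [cid for cid in ranked if cid in relevant]

    precision = len(hits) / len(ranked) if ranked else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0

    # Reciprocal rank of the first relevant chunk; 0.0 if none was retrieved.
    reciprocal_rank = 0.0
    for rank, cid in enumerate(ranked, start=1):
        if cid in relevant:
            reciprocal_rank = 1.0 / rank
            break

    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "reciprocal_rank": reciprocal_rank,
        "hit": 1.0 if hits else 0.0,
    }


def aggregate(per_case: List[Dict[str, float]]) -> Dict[str, float]:
    """Average per-case scores over an evaluation run.

    MRR is the mean of the reciprocal ranks; hit rate is the fraction of
    test cases with at least one relevant chunk retrieved.
    """
    n = len(per_case)
    if n == 0:
        return {"precision": 0.0, "recall": 0.0, "f1": 0.0, "mrr": 0.0, "hit_rate": 0.0}
    return {
        "precision": sum(c["precision"] for c in per_case) / n,
        "recall": sum(c["recall"] for c in per_case) / n,
        "f1": sum(c["f1"] for c in per_case) / n,
        "mrr": sum(c["reciprocal_rank"] for c in per_case) / n,
        "hit_rate": sum(c["hit"] for c in per_case) / n,
    }


if __name__ == "__main__":
    cases = [
        retrieval_metrics(["c3", "c1", "c7", "c2"], {"c1", "c2", "c9"}),
        retrieval_metrics(["a"], {"b"}),
    ]
    print(aggregate(cases))
```

Under these definitions, low recall with high precision is the kind of pattern that would motivate a suggestion such as "increase top_k", while a low MRR with a good hit rate points toward reranking. The actual thresholds and rules used by the recommendation feature are not stated in this description.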