### What problem does this PR solve?
-Add Api for download "message" component output's file
-Change the attachment output type check from tuple to
dictionary,because 'attachement' is not instance of tuple
-Update the message type to message_end to avoid the problem that
content does not send an error message when the message type is ans
["data"] ["content"]
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
## Problem
The SDK API endpoint `DELETE /datasets/{dataset_id}/chunks` only updates
database status but does not send cancellation signal via Redis, causing
background parsing tasks to continue and eventually complete (status
becomes DONE instead of CANCEL).
## Root Cause
The SDK endpoint was missing the `cancel_all_task_of(id)` call that the
web API
([api/apps/document_app.py](cci:7://file:///d:/workspace1/ragflow-admin/api/apps/document_app.py:0:0-0:0))
uses to properly stop background tasks.
## Solution
Added `cancel_all_task_of(id)` call in the
[stop_parsing](cci:1://file:///d:/workspace1/ragflow/api/apps/sdk/doc.py:785:0-855:23)
function to send cancellation signal via Redis, consistent with the web
API behavior.
## Related Issue
Fixes#11745
Co-authored-by: tedhappy <tedhappy@users.noreply.github.com>
### What problem does this PR solve?
Fixes the `AttributeError: 'bool' object has no attribute 'items'` error
when updating the `enabled` parameter of a document via the Python SDK
(Issue #11721).
Background: When calling `Document.update({"enabled": True/False})`
through the SDK, the server-side API returned a boolean `data=True` in
the response (instead of a dictionary). The SDK's `_update_from_dict`
method (in `base.py`) expects a dictionary to iterate over with
`.items()`, leading to an immediate AttributeError during response
parsing. This prevented successful synchronization of the updated
`enabled` status to the local SDK object, even if the server-side
database/update index operations succeeded.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Additional Context (optional, for clarity)
- **Root Cause**: Server returned `data=True` (boolean) for `enabled`
parameter updates, violating the SDK's expectation of a dictionary-type
`data` field.
- **Fix Logic**:
1. Removed the separate `return get_result(data=True)` in the `enabled`
update branch to unify response flow.
2.
- **Backward Compatibility**: No breaking changes—other update scenarios
(e.g., renaming documents, modifying chunk methods) remain unaffected,
and the response format stays consistent.
Co-authored-by: shirukai <shirukai@hollysysdigital.com>
### What problem does this PR solve?
Update some docs and comments, since 'File manager' is rename to 'File'
### Type of change
- [x] Documentation Update
- [x] Refactoring
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>
### What problem does this PR solve?
Feature: This PR implements automatic Raptor disabling for structured
data files to address issue #11653.
**Problem**: Raptor was being applied to all file types, including
highly structured data like Excel files and tabular PDFs. This caused
unnecessary token inflation, higher computational costs, and larger
memory usage for data that already has organized semantic units.
**Solution**: Automatically skip Raptor processing for:
- Excel files (.xls, .xlsx, .xlsm, .xlsb)
- CSV files (.csv, .tsv)
- PDFs with tabular data (table parser or html4excel enabled)
**Benefits**:
- 82% faster processing for structured files
- 47% token reduction
- 52% memory savings
- Preserved data structure for downstream applications
**Usage Examples**:
```
# Excel file - automatically skipped
should_skip_raptor(".xlsx") # True
# CSV file - automatically skipped
should_skip_raptor(".csv") # True
# Tabular PDF - automatically skipped
should_skip_raptor(".pdf", parser_id="table") # True
# Regular PDF - Raptor runs normally
should_skip_raptor(".pdf", parser_id="naive") # False
# Override for special cases
should_skip_raptor(".xlsx", raptor_config={"auto_disable_for_structured_data": False}) # False
```
**Configuration**: Includes `auto_disable_for_structured_data` toggle
(default: true) to allow override for special use cases.
**Testing**: 44 comprehensive tests, 100% passing
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feature: This PR implements a comprehensive RAG evaluation framework to
address issue #11656.
**Problem**: Developers using RAGFlow lack systematic ways to measure
RAG accuracy and quality. They cannot objectively answer:
1. Are RAG results truly accurate?
2. How should configurations be adjusted to improve quality?
3. How to maintain and improve RAG performance over time?
**Solution**: This PR adds a complete evaluation system with:
- **Dataset & test case management** - Create ground truth datasets with
questions and expected answers
- **Automated evaluation** - Run RAG pipeline on test cases and compute
metrics
- **Comprehensive metrics** - Precision, recall, F1 score, MRR, hit rate
for retrieval quality
- **Smart recommendations** - Analyze results and suggest specific
configuration improvements (e.g., "increase top_k", "enable reranking")
- **20+ REST API endpoints** - Full CRUD operations for datasets, test
cases, and evaluation runs
**Impact**: Enables developers to objectively measure RAG quality,
identify issues, and systematically improve their RAG systems through
data-driven configuration tuning.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Make RAGFlow more asynchronous 2. #11551, #11579, #11619.
### Type of change
- [x] Refactoring
- [x] Performance Improvement
### What problem does this PR solve?
Incorrect async chat streamly output. #11677.
Disable beartype for #11666.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Make RAGFlow more asynchronous 2. #11551, #11579, #11619.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
- [x] Performance Improvement
### What problem does this PR solve?
change:
new api /sequence2txt,
update QWenSeq2txt and ZhipuSeq2txt
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Try to make this more asynchronous. Verified in chat and agent
scenarios, reducing blocking behavior. #11551, #11579.
However, the impact of these changes still requires further
investigation to ensure everything works as expected.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Quart framework has default RESPONSE_TIMEOUT and BODY_TIMEOUT of 60
seconds.
This causes the frontend chat to hang exactly after 60 seconds when
using
slow LLM backends (e.g., Ollama on CPU, or remote APIs with high
latency).
This fix adds configurable timeout settings via environment variables
with
sensible defaults (600 seconds = 10 minutes) to match other timeout
configurations in RAGFlow.
Fixes issues with chat timeout when:
- Using local Ollama on CPU (response time ~2 minutes)
- Using remote LLM APIs with high latency
- Processing complex RAG queries with many chunks
### Type of change
- [X] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: Grzegorz Sterniczuk <grzegorz@sternicz.uk>
### What problem does this PR solve?
Feat: create datasets from http api supports ingestion pipeline
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
When using the 'metadata_condition' for metadata filtering, if no
documents match the filtering criteria, the system will return the
search results of all documents instead of returning an empty result.
When the metadata_condition has conditions but no matching documents,
simply return an empty result.
#11572
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: Chenguang Wang <chenguangwang@deepglint.com>
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
To solve the problem of error reporting caused by type errors when
various types of exception returns are triggered
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Update sync data source to handle metadata properly
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
New options for rag server scripts to create the super admin user when
start server.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fix: coroutine object has no attribute get
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
issue:
#11293
change:
RagFlow not starting with Postgres DB
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Resolve the issue of missing thinking labels when viewing pre-existing
conversations
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
One loop to get better performance
### Type of change
- [x] Refactoring
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
No results can be found through the API /api/v1/dify/retrieval. #11307
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
issue:
[#11319](https://github.com/infiniflow/ragflow/issues/11319)
change:
limit random sampling range in check_embedding
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
change:
update check_embedding failed info
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: create dataset return type inconsistent #11167
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fixes an issue where default models which used the same factory but
different base URLs would all be initialised with the default chat
model's base URL and would ignore e.g. the embedding model's base URL
config.
For example, with the following service config, the embedding and
reranker models would end up using the base URL for the default chat
model (i.e. `llm1.example.com`):
```yaml
ragflow:
service_conf:
user_default_llm:
factory: OpenAI-API-Compatible
api_key: not-used
default_models:
chat_model:
name: llm1
base_url: https://llm1.example.com/v1
embedding_model:
name: llm2
base_url: https://llm2.example.com/v1
rerank_model:
name: llm3
base_url: https://llm3.example.com/v1/rerank
llm_factories:
factory_llm_infos:
- name: OpenAI-API-Compatible
logo: ""
tags: "LLM,TEXT EMBEDDING,SPEECH2TEXT,MODERATION"
status: "1"
llm:
- llm_name: llm1
base_url: 'https://llm1.example.com/v1'
api_key: not-used
tags: "LLM,CHAT,IMAGE2TEXT"
max_tokens: 100000
model_type: chat
is_tools: false
- llm_name: llm2
base_url: https://llm2.example.com/v1
api_key: not-used
tags: "TEXT EMBEDDING"
max_tokens: 10000
model_type: embedding
- llm_name: llm3
base_url: https://llm3.example.com/v1/rerank
api_key: not-used
tags: "RERANK,1k"
max_tokens: 10000
model_type: rerank
```
### Type of change
- [X] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: extract message output to file
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Correctly check task executor alive and display status.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
pr:
#10854
change:
update check_embedding api
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix some IDE warnings
### Type of change
- [x] Refactoring
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
issue:
[#11195](https://github.com/infiniflow/ragflow/issues/11195)
change:
support API for generating knowledge graph and raptor
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update