RAGU
Description
Codebase for graph-rag implementation
Languages
- Python 100%
RAGU: Retrieval-Augmented Graph Utility
Install | Quickstart
Overview
RAGU provides a pipeline for building a knowledge graph and performing retrieval over the indexed data. It offers several approaches to extracting structured data from raw text, enabling efficient question answering over structured knowledge.
Partially based on nano-graphrag
Our Hugging Face community is here
Install
The recommended way is a local build:
From PyPI:
If you want to use local models (via transformers etc.), run:
Quickstart
A simple example of building a knowledge graph
If you run the code with a storage folder that already contains a knowledge graph, RAGU will automatically load the existing graph.
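The load-if-present behaviour described above can be pictured with a small sketch. It does not use RAGU's API: `load_or_build`, the `graph.json` file name, and the JSON layout are hypothetical and only illustrate the pattern of reusing a stored graph instead of rebuilding it.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def load_or_build(storage_dir: Path, build: Callable[[], dict]) -> dict:
    """Return the graph stored in storage_dir, building and saving it if absent."""
    graph_file = storage_dir / "graph.json"  # hypothetical file name
    if graph_file.exists():
        # A graph was saved by a previous run: reuse it instead of rebuilding.
        return json.loads(graph_file.read_text(encoding="utf-8"))
    graph = build()
    storage_dir.mkdir(parents=True, exist_ok=True)
    graph_file.write_text(json.dumps(graph), encoding="utf-8")
    return graph


def demo() -> tuple:
    """Call load_or_build twice on the same folder and count the builds."""
    build_calls = []

    def build() -> dict:
        build_calls.append(1)
        return {"entities": ["Moscow"], "relations": []}

    with tempfile.TemporaryDirectory() as tmp:
        storage = Path(tmp) / "ragu_storage"
        first = load_or_build(storage, build)   # builds and saves
        second = load_or_build(storage, build)  # loads the saved graph
    return first, second, len(build_calls)
```

In this sketch the second call never invokes `build`, which mirrors the behaviour of pointing a run at a storage folder that already holds a graph.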
Example of querying
Local search
Searches over the entities retrieved for the query and their connected context (relations, summaries, and chunks).
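To make the idea of local search concrete, here is a minimal, library-independent sketch. `ToyGraph` and `local_context` are hypothetical names, and word overlap stands in for the embedding similarity a real system would likely use; this is an illustration under those assumptions, not RAGU's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ToyGraph:
    # entity name -> description
    entities: dict
    # list of (source, target, description) triples
    relations: list
    # entity name -> source text chunks mentioning it
    chunks: dict = field(default_factory=dict)


def local_context(graph: ToyGraph, query: str, top_k: int = 2) -> dict:
    """Collect the context around the entities that best match the query."""
    query_words = set(query.lower().split())

    def overlap(name: str) -> int:
        text = f"{name} {graph.entities[name]}".lower()
        return len(query_words & set(text.split()))

    # Rank entities by word overlap (a stand-in for embedding similarity)
    # and keep only those that match the query at all.
    ranked = sorted(graph.entities, key=overlap, reverse=True)
    seeds = [name for name in ranked[:top_k] if overlap(name) > 0]

    return {
        "entities": {name: graph.entities[name] for name in seeds},
        "relations": [r for r in graph.relations if r[0] in seeds or r[1] in seeds],
        "chunks": [c for name in seeds for c in graph.chunks.get(name, [])],
    }
```

The returned dictionary is what would be passed to an LLM as context: the matched entities plus the relations and chunks attached to them.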
Global search
Answers the query using community summaries.
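One common way to answer from community summaries is a map-reduce pass, as in GraphRAG-style systems; whether RAGU follows exactly this scheme is an assumption. In the sketch below, `global_search` and the prompt wording are hypothetical, and `llm` is any callable that maps a prompt string to a response string.

```python
from typing import Callable


def global_search(
    query: str,
    community_summaries: list,
    llm: Callable[[str], str],
) -> str:
    """Map-reduce over community summaries with a pluggable LLM callable."""
    # Map step: ask for a partial answer from each community summary.
    partial_answers = [
        llm(f"Summary: {summary}\nQuestion: {query}\nPartial answer:")
        for summary in community_summaries
    ]
    # Reduce step: merge the partial answers into one final response.
    merged = "\n".join(f"- {answer}" for answer in partial_answers)
    return llm(f"Partial answers:\n{merged}\nQuestion: {query}\nFinal answer:")
```

Keeping the LLM behind a plain callable makes the flow easy to test with a stub before wiring in a real model.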
Naive search (vector RAG):
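Independently of RAGU's own API, the idea behind vector RAG can be sketched in a few lines: embed the query and the chunks, then return the chunks closest to the query. The bag-of-words `embed` function below is a deliberately crude, hypothetical stand-in for a neural encoder.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real system would use a neural encoder."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def naive_search(query: str, chunks: list, top_k: int = 2) -> list:
    """Return the top_k chunks most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:top_k]
```

Unlike local and global search, this path ignores the graph entirely and works on raw chunks only.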
Query planning wrapper
Decomposes complex questions into dependent subqueries, executes them in order, and uses intermediate answers to produce a final response.
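The execution half of this idea can be sketched as follows. The plan is taken as given here (in practice the decomposition would presumably come from an LLM), and `SubQuery`, `run_plan`, and the `{id}` placeholder convention are hypothetical, not RAGU's API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SubQuery:
    id: str
    # The template may reference earlier answers as {id}.
    template: str
    depends_on: tuple = ()


def run_plan(plan: list, answer: Callable[[str], str]) -> dict:
    """Execute subqueries in order, substituting earlier answers into later ones."""
    answers = {}
    for step in plan:
        missing = [dep for dep in step.depends_on if dep not in answers]
        if missing:
            raise ValueError(f"{step.id} depends on unanswered steps: {missing}")
        # Fill the placeholders with the intermediate answers.
        question = step.template.format(**{dep: answers[dep] for dep in step.depends_on})
        answers[step.id] = answer(question)
    return answers
```

For example, a plan of "Who wrote Eugene Onegin?" followed by "Where was {author} born?" resolves the second question only after the first answer is known; the answer to the last step serves as the final response.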
Advanced Configuration
Builder Settings
Configure the knowledge graph building pipeline using the builder settings:
Knowledge Graph Construction
Each text in the corpus is processed to extract structured information, which consists of:
- Entities — textual representation, entity type, and a contextual description.
- Relations — textual description of the link between two entities (or a relation class), as well as its confidence/strength.
RAGU uses entity and relation classes from NEREL.
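The extracted artifacts described above can be modelled with two small record types. This is a sketch only: the class and field names are hypothetical, and the assumption that strength lies in [0, 1] is not stated by the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    name: str          # textual representation
    entity_type: str   # e.g. a NEREL class such as "PERSON"
    description: str   # contextual description


@dataclass(frozen=True)
class Relation:
    source: str        # name of the subject entity
    target: str        # name of the object entity
    description: str   # textual description or a relation class
    strength: float    # confidence/strength of the link

    def __post_init__(self) -> None:
        # Assumption: strength is normalised to the range [0, 1].
        if not 0.0 <= self.strength <= 1.0:
            raise ValueError("strength is expected to lie in [0, 1]")
```

A triple such as `Relation("Pushkin", "Moscow", "PLACE_OF_BIRTH", 0.9)` then links two `Entity` records by name.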
Entity types
| No. | Entity type | No. | Entity type | No. | Entity type |
|---|---|---|---|---|---|
| 1. | AGE | 11. | FAMILY | 21. | PENALTY |
| 2. | AWARD | 12. | IDEOLOGY | 22. | PERCENT |
| 3. | CITY | 13. | LANGUAGE | 23. | PERSON |
| 4. | COUNTRY | 14. | LAW | 24. | PRODUCT |
| 5. | CRIME | 15. | LOCATION | 25. | PROFESSION |
| 6. | DATE | 16. | MONEY | 26. | RELIGION |
| 7. | DISEASE | 17. | NATIONALITY | 27. | STATE_OR_PROV |
| 8. | DISTRICT | 18. | NUMBER | 28. | TIME |
| 9. | EVENT | 19. | ORDINAL | 29. | WORK_OF_ART |
| 10. | FACILITY | 20. | ORGANIZATION | | |
Relation types
| No. | Relation type | No. | Relation type | No. | Relation type |
|---|---|---|---|---|---|
| 1. | ABBREVIATION | 18. | HEADQUARTERED_IN | 35. | PLACE_RESIDES_IN |
| 2. | AGE_DIED_AT | 19. | IDEOLOGY_OF | 36. | POINT_IN_TIME |
| 3. | AGE_IS | 20. | INANIMATE_INVOLVED | 37. | PRICE_OF |
| 4. | AGENT | 21. | INCOME | 38. | PRODUCES |
| 5. | ALTERNATIVE_NAME | 22. | KNOWS | 39. | RELATIVE |
| 6. | AWARDED_WITH | 23. | LOCATED_IN | 40. | RELIGION_OF |
| 7. | CAUSE_OF_DEATH | 24. | MEDICAL_CONDITION | 41. | SCHOOLS_ATTENDED |
| 8. | CONVICTED_OF | 25. | MEMBER_OF | 42. | SIBLING |
| 9. | DATE_DEFUNCT_IN | 26. | ORGANIZES | 43. | SPOUSE |
| 10. | DATE_FOUNDED_IN | 27. | ORIGINS_FROM | 44. | START_TIME |
| 11. | DATE_OF_BIRTH | 28. | OWNER_OF | 45. | SUBEVENT_OF |
| 12. | DATE_OF_CREATION | 29. | PARENT_OF | 46. | SUBORDINATE_OF |
| 13. | DATE_OF_DEATH | 30. | PART_OF | 47. | TAKES_PLACE_IN |
| 14. | END_TIME | 31. | PARTICIPANT_IN | 48. | WORKPLACE |
| 15. | EXPENDITURE | 32. | PENALIZED_AS | 49. | WORKS_AS |
| 16. | FOUNDED_BY | 33. | PLACE_OF_BIRTH | | |
| 17. | HAS_CAUSE | 34. | PLACE_OF_DEATH | | |
How it is extracted:
1. Default Pipeline
File: ragu/triplet/llm_artifact_extractor.py. A baseline pipeline that uses an LLM to extract entities, relations, and their descriptions in a single step.
2. RAGU-lm (for the Russian language)
A compact model (Qwen-3-0.6B) fine-tuned on the NEREL dataset. The pipeline operates in several stages:
- Extract unnormalized entities from text.
- Normalize entities into canonical forms.
- Generate entity descriptions.
- Extract relations based on the inner product between entities.
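The last stage can be read as scoring entity pairs by the inner product of their vector representations. The sketch below shows that reading only; how RAGU-lm actually obtains the vectors and turns scores into relation classes is not described here, so `candidate_pairs`, the example vectors, and the threshold are all hypothetical.

```python
from itertools import combinations


def inner_product(a: list, b: list) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def candidate_pairs(vectors: dict, threshold: float = 0.5) -> list:
    """Score every unordered entity pair and keep those at or above the threshold."""
    scored = [
        (left, right, inner_product(vectors[left], vectors[right]))
        for left, right in combinations(vectors, 2)
    ]
    kept = [pair for pair in scored if pair[2] >= threshold]
    # Strongest candidates first.
    return sorted(kept, key=lambda pair: pair[2], reverse=True)
```

Pairs that pass the threshold would then be handed to a classifier or generator that assigns one of the relation types listed earlier.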
Comparison
| Model | Dataset | F1 (Entities) | F1 (Relations) |
|---|---|---|---|
| Qwen-2.5-14B-Instruct | NEREL | 0.32 | 0.69 |
| RAGU-lm (Qwen-3-0.6B) | NEREL | 0.6 | 0.71 |
| Small-model pipeline | NEREL | 0.74 | 0.75 |
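For readers unfamiliar with the metric, F1 is the harmonic mean of precision and recall. The sketch below computes it over exact matches between predicted and gold items; the matching criterion and averaging scheme behind the table are not stated in the source, so treat this as an illustration of the metric rather than the evaluation script.

```python
def f1_score(predicted: set, gold: set) -> float:
    """F1 over exact matches between predicted and gold items."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        # Covers empty inputs and the no-overlap case.
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)
```

With 3 predicted entities, 4 gold entities, and 2 in common, precision is 2/3, recall is 1/2, and F1 is 4/7.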
Prompt Customization
All RAGU components that use LLMs inherit from a common base class, which provides methods to view and update prompts.
Viewing Current Prompts
Updating Prompts
You can customize prompts by supplying a new prompt with your own messages:
Contributors
Main Idea & Inspiration
- Ivan Bondarenko - idea, smart_chunker, NER model, ragu-lm
Core Development
- Mikhail Komarov
Benchmarks & Evaluation
- Roman Shuvalov
- Yanya Dement'yeva
- Alexandr Kuleshevskiy
- Nikita Kukuzey
- Stanislav Shtuka
Small Models Pipeline
- Matvey Solovyev
- Ilya Myznikov