RAGU

Description

Codebase for graph-rag implementation

Languages

  • Python 100%

RAGU: Retrieval-Augmented Graph Utility



RAGU is released under the MIT license.

Install | Quickstart


Overview

RAGU provides a pipeline for building a knowledge graph and performing retrieval over the indexed data. It offers several approaches to extracting structured data from raw text, which enables efficient question answering over structured knowledge.

Partially based on nano-graphrag.

Our Hugging Face community is here


Install

The preferred way is a local build:

From PyPI:

If you want to use local models (via transformers, etc.), run:


Quickstart

A simple example of building a knowledge graph:

If you run the code with a storage folder that already contains a knowledge graph, RAGU will automatically load the existing graph.

Example of querying

Local search: search over entities retrieved for the query and their connected context (relations, summaries, and chunks).

Global search: answer the query using community summaries.

Naive search (vector RAG):

Query planning wrapper

Decomposes complex questions into dependent subqueries, executes them in order, and uses intermediate answers to produce a final response.
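
As a rough illustration of the idea, the sketch below runs a hand-written plan of dependent subqueries in order and substitutes earlier answers into later questions. All names here (`SubQuery`, `run_plan`, the `{q1}` placeholder syntax, and the toy lookup table) are hypothetical and are not part of RAGU's API. In practice the plan would presumably come from an LLM, and each subquery would be answered by one of the search engines above.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class SubQuery:
    """One step of a plan; `depends_on` lists ids whose answers it needs."""
    id: str
    template: str
    depends_on: list[str] = field(default_factory=list)


def run_plan(plan: list[SubQuery], answer: Callable[[str], str]) -> dict[str, str]:
    """Execute subqueries in order, filling placeholders with earlier answers."""
    answers: dict[str, str] = {}
    for step in plan:
        missing = [d for d in step.depends_on if d not in answers]
        if missing:
            raise ValueError(f"{step.id} depends on unanswered steps: {missing}")
        # Substitute intermediate answers into the question text.
        question = step.template.format(**{d: answers[d] for d in step.depends_on})
        answers[step.id] = answer(question)
    return answers


# Toy stand-in for a search engine: a lookup table keyed by the full question.
FACTS = {
    "Who wrote War and Peace?": "Leo Tolstoy",
    "Where was Leo Tolstoy born?": "Yasnaya Polyana",
}

plan = [
    SubQuery("q1", "Who wrote War and Peace?"),
    SubQuery("q2", "Where was {q1} born?", depends_on=["q1"]),
]
result = run_plan(plan, lambda q: FACTS.get(q, "unknown"))
print(result["q2"])
```

The second subquery cannot be phrased until the first is answered, which is why the steps run sequentially rather than in parallel.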


Advanced Configuration

Builder Settings

Configure the knowledge graph building pipeline using BuilderSettings:


Knowledge Graph Construction

Each text in the corpus is processed to extract structured information, which consists of:

  • Entities — textual representation, entity type, and a contextual description.
  • Relations — textual description of the link between two entities (or a relation class), as well as its confidence/strength.
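
A minimal sketch of how such extracted artifacts could be represented in Python. The class names, field names, and the 0 to 1 strength scale are illustrative assumptions, not RAGU's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    name: str          # textual representation
    entity_type: str   # entity class, e.g. "PERSON"
    description: str   # contextual description


@dataclass(frozen=True)
class Relation:
    source: Entity
    target: Entity
    description: str   # textual description of the link, or a relation class
    strength: float    # confidence/strength; the [0, 1] range is an assumption

    def __post_init__(self) -> None:
        if not 0.0 <= self.strength <= 1.0:
            raise ValueError("strength must be in [0, 1]")


tolstoy = Entity("Leo Tolstoy", "PERSON", "Russian novelist")
novel = Entity("War and Peace", "WORK_OF_ART", "Novel by Leo Tolstoy")
wrote = Relation(tolstoy, novel, "Leo Tolstoy wrote War and Peace", 0.9)
print(wrote.source.name, "->", wrote.target.name)
```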

RAGU uses entity and relation classes from NEREL.

Entity types

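| No. | Entity type | No. | Entity type | No. | Entity type |
|-----|-------------|-----|-------------|-----|-------------|
| 1. | AGE | 11. | FAMILY | 21. | PENALTY |
| 2. | AWARD | 12. | IDEOLOGY | 22. | PERCENT |
| 3. | CITY | 13. | LANGUAGE | 23. | PERSON |
| 4. | COUNTRY | 14. | LAW | 24. | PRODUCT |
| 5. | CRIME | 15. | LOCATION | 25. | PROFESSION |
| 6. | DATE | 16. | MONEY | 26. | RELIGION |
| 7. | DISEASE | 17. | NATIONALITY | 27. | STATE_OR_PROV |
| 8. | DISTRICT | 18. | NUMBER | 28. | TIME |
| 9. | EVENT | 19. | ORDINAL | 29. | WORK_OF_ART |
| 10. | FACILITY | 20. | ORGANIZATION | | |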

Relation types

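| No. | Relation type | No. | Relation type | No. | Relation type |
|-----|---------------|-----|---------------|-----|---------------|
| 1. | ABBREVIATION | 18. | HEADQUARTERED_IN | 35. | PLACE_RESIDES_IN |
| 2. | AGE_DIED_AT | 19. | IDEOLOGY_OF | 36. | POINT_IN_TIME |
| 3. | AGE_IS | 20. | INANIMATE_INVOLVED | 37. | PRICE_OF |
| 4. | AGENT | 21. | INCOME | 38. | PRODUCES |
| 5. | ALTERNATIVE_NAME | 22. | KNOWS | 39. | RELATIVE |
| 6. | AWARDED_WITH | 23. | LOCATED_IN | 40. | RELIGION_OF |
| 7. | CAUSE_OF_DEATH | 24. | MEDICAL_CONDITION | 41. | SCHOOLS_ATTENDED |
| 8. | CONVICTED_OF | 25. | MEMBER_OF | 42. | SIBLING |
| 9. | DATE_DEFUNCT_IN | 26. | ORGANIZES | 43. | SPOUSE |
| 10. | DATE_FOUNDED_IN | 27. | ORIGINS_FROM | 44. | START_TIME |
| 11. | DATE_OF_BIRTH | 28. | OWNER_OF | 45. | SUBEVENT_OF |
| 12. | DATE_OF_CREATION | 29. | PARENT_OF | 46. | SUBORDINATE_OF |
| 13. | DATE_OF_DEATH | 30. | PART_OF | 47. | TAKES_PLACE_IN |
| 14. | END_TIME | 31. | PARTICIPANT_IN | 48. | WORKPLACE |
| 15. | EXPENDITURE | 32. | PENALIZED_AS | 49. | WORKS_AS |
| 16. | FOUNDED_BY | 33. | PLACE_OF_BIRTH | | |
| 17. | HAS_CAUSE | 34. | PLACE_OF_DEATH | | |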

How the information is extracted:

1. Default Pipeline

File: ragu/triplet/llm_artifact_extractor.py. A baseline pipeline that uses an LLM to extract entities, relations, and their descriptions in a single step.

2. RAGU-lm (for the Russian language)

A compact model (Qwen-3-0.6B) fine-tuned on the NEREL dataset. The pipeline operates in several stages:

  1. Extract unnormalized entities from text.
  2. Normalize entities into canonical forms.
  3. Generate entity descriptions.
  4. Extract relations based on the inner product between entities.
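
The staged design above can be pictured as a chain of functions, each consuming the previous stage's output. Everything in this sketch is a toy stand-in: the stage functions are hypothetical rule-based placeholders rather than calls to the fine-tuned model, and stage 4 only enumerates entity pairs as relation candidates, which is an assumption about the last step and not a description of how RAGU-lm actually scores relations.

```python
from itertools import combinations


def extract_mentions(text: str) -> list[str]:
    """Stage 1 (toy): treat capitalized tokens as raw entity mentions."""
    return [tok.strip(".,") for tok in text.split() if tok[:1].isupper()]


def normalize(mentions: list[str]) -> list[str]:
    """Stage 2 (toy): map surface forms to canonical names, dropping duplicates."""
    canonical = {"Pushkin's": "Pushkin"}  # hypothetical normalization table
    seen: list[str] = []
    for mention in mentions:
        name = canonical.get(mention, mention)
        if name not in seen:
            seen.append(name)
    return seen


def describe(entities: list[str], text: str) -> dict[str, str]:
    """Stage 3 (toy): reuse the source text as each entity's description."""
    return {entity: text for entity in entities}


def candidate_pairs(entities: list[str]) -> list[tuple[str, str]]:
    """Stage 4 (toy, assumed): enumerate entity pairs as relation candidates."""
    return list(combinations(entities, 2))


text = "Pushkin was born in Moscow. Pushkin's poems are famous."
mentions = extract_mentions(text)
entities = normalize(mentions)
descriptions = describe(entities, text)
pairs = candidate_pairs(entities)
print(mentions, entities, pairs)
```

Splitting the work this way lets each stage be evaluated and replaced independently, which is one plausible reason to prefer it for a small model.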

Comparison

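| Model | Dataset | F1 (Entities) | F1 (Relations) |
|-------|---------|---------------|----------------|
| Qwen-2.5-14B-Instruct | NEREL | 0.32 | 0.69 |
| RAGU-lm (Qwen-3-0.6B) | NEREL | 0.6 | 0.71 |
| Small-model pipeline | NEREL | 0.74 | 0.75 |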
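
For reference, F1 is the harmonic mean of precision and recall. The sketch below computes it over sets of predicted and gold items using exact matching. The matching criterion and averaging scheme behind the numbers above are not stated here, so treat this as a generic illustration; the function name and the example data are made up.

```python
def f1_score(predicted: set, gold: set) -> float:
    """Exact-match F1 between a predicted set and a gold set."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Entities as (text, type) pairs; one predicted type is wrong.
gold = {("Pushkin", "PERSON"), ("Moscow", "CITY"), ("1799", "DATE")}
predicted = {("Pushkin", "PERSON"), ("Moscow", "LOCATION"), ("1799", "DATE")}
print(round(f1_score(predicted, gold), 3))  # 0.667
```

Here two of three predictions match, so precision and recall are both 2/3 and F1 is 2/3 as well.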

Prompt Customization

All RAGU components that use LLMs inherit from RaguGenerativeModule, which provides methods to view and update prompts.

Viewing Current Prompts

Updating Prompts

You can customize prompts by creating a new RAGUInstruction with your own messages:


Contributors

Main Idea & Inspiration

  • Ivan Bondarenko - idea, smart_chunker, NER model, ragu-lm

Core Development

  • Mikhail Komarov

Benchmarks & Evaluation

  • Roman Shuvalov
  • Yanya Dement'yeva
  • Alexandr Kuleshevskiy
  • Nikita Kukuzey
  • Stanislav Shtuka

Small Models Pipeline

  • Matvey Solovyev
  • Ilya Myznikov