LightRAG

HKUDS/LightRAG: an open-source RAG framework published at EMNLP 2025 (arXiv:2410.05779). Instead of naive vector search, LightRAG extracts entities and relations into a knowledge graph, then retrieves at two levels (low-level → specific entities, high-level → broad topics/entity groups). In the authors’ evaluations it wins most pairwise comparisons against NaiveRAG, RQ-RAG, HyDE, and GraphRAG across agriculture, CS, legal, and mixed-domain datasets.

For my context, this is a reference tool for AI Chatbots Architecture and a potential retrieval layer for LLM Knowledge Bases / Brain once they outgrow markdown indexes.

Download or use

# LightRAG Server (Web UI + REST API + Ollama-compatible interface)
uv tool install "lightrag-hku[api]"
cp env.example .env       # set your LLM + embedding config
lightrag-server
 
# LightRAG Core (for embedded use or research)
uv pip install lightrag-hku
 
# Docker Compose
git clone https://github.com/HKUDS/LightRAG && cd LightRAG
cp env.example .env && docker compose up

The setup wizard (make env-base, make env-storage, make env-server) generates .env interactively, so there is no need to hand-edit it.

🗒️ Description

🧩 How LightRAG differs from naive RAG

Naive RAG: chunk → embed → top-k cosine similarity → LLM. Weakness: chunks are atomic, with no relations between them, so the pipeline struggles on questions whose answer is spread across several chunks or needs a view of the whole document (cross-chunk reasoning).
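To make the weakness concrete, here is a minimal sketch of the naive pipeline's retrieval step. The bag-of-words `embed` function is a toy stand-in for a real embedding model, and the three chunks are invented for illustration; nothing here comes from LightRAG itself.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Rank chunks by similarity to the query and keep the best k."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


# Hypothetical corpus: answering the question needs all three facts chained together.
chunks = [
    "Alice founded Acme in 2010",
    "Acme acquired Beta Labs in 2020",
    "Beta Labs builds battery sensors",
]
query = "who founded the company that builds battery sensors"
hits = top_k(query, chunks, k=1)
bridge_score = cosine(embed(query), embed(chunks[1]))
```

The chunk about the acquisition shares no words with the question, so it scores zero even though it is the link between the other two facts. That is the cross-chunk gap a relation-aware index is meant to close.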

LightRAG adds an entity-relationship extraction phase during indexing: the LLM pulls entities (people, organizations, concepts) and relations out of each document into a knowledge graph. Queries then run at one of two levels, or in a mode that combines them:

  • Low-level: searches for specific entities
  • High-level: searches for broad topics / entity groups
  • Mix mode (recommended since 2025.08, with the reranker by default): combines both
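The three modes above can be sketched on a toy graph. This is only an illustration of the idea, not LightRAG's implementation: the `Relation` type, the `themes` field, and the hard-coded graph are all hypothetical, and in the real system the entities and keywords would come from LLM extraction and vector search rather than exact string matches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    source: str
    target: str
    description: str
    themes: frozenset  # hypothetical high-level keywords attached at indexing time


GRAPH = [
    Relation("Alice", "Acme", "Alice founded Acme in 2010", frozenset({"founding"})),
    Relation("Acme", "Beta Labs", "Acme acquired Beta Labs in 2020", frozenset({"acquisitions"})),
    Relation("Beta Labs", "battery sensors", "Beta Labs builds battery sensors", frozenset({"products"})),
]


def low_level(entities: set) -> list:
    """Low-level: relations that touch any specifically named entity."""
    return [r for r in GRAPH if r.source in entities or r.target in entities]


def high_level(themes: set) -> list:
    """High-level: relations tagged with any of the broad themes."""
    return [r for r in GRAPH if r.themes & themes]


def mix(entities: set, themes: set) -> list:
    """Mix: union of both result sets, de-duplicated, low-level hits first."""
    merged, seen = [], set()
    for r in low_level(entities) + high_level(themes):
        if r not in seen:
            seen.add(r)
            merged.append(r)
    return merged


context = [r.description for r in mix({"Beta Labs"}, {"founding"})]
```

Here the entity lookup on "Beta Labs" pulls in the acquisition and the product relation, and the theme lookup on "founding" adds the founder, so the combined context covers the whole chain.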

🧩 Model requirements

Much higher than naive RAG, because the LLM has to extract entity-relationships from documents:

  • LLM: ≥32B parameters, context ≥32KB (64KB recommended). Avoid reasoning models for indexing; at query time, use a model stronger than the one used for indexing
  • Embedding: must be multilingual, e.g. BAAI/bge-m3 or text-embedding-3-large. Critical: use the same model for indexing and querying, because switching means wiping the vector tables
  • Reranker: BAAI/bge-reranker-v2-m3 or Jina; enabling it materially improves retrieval
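The embedding lock-in is the requirement most worth guarding in code. A minimal sketch, assuming I record which embedder built the index and compare it before every query; `EmbeddingSpec`, `EmbeddingMismatch`, and `ensure_same_embedder` are hypothetical names of my own, not part of LightRAG, and the dimensions are the models' default output sizes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmbeddingSpec:
    model: str
    dim: int


class EmbeddingMismatch(RuntimeError):
    """Raised when the query-time embedder differs from the index-time one."""


def ensure_same_embedder(indexed: EmbeddingSpec, current: EmbeddingSpec) -> None:
    """Refuse to query an index with vectors from a different model or dimension."""
    if indexed != current:
        raise EmbeddingMismatch(
            f"index was built with {indexed.model} ({indexed.dim}d) but the query uses "
            f"{current.model} ({current.dim}d); rebuild the vector tables before querying"
        )


BGE_M3 = EmbeddingSpec("BAAI/bge-m3", 1024)
OPENAI_LARGE = EmbeddingSpec("text-embedding-3-large", 3072)

ensure_same_embedder(BGE_M3, BGE_M3)  # same model on both sides: passes silently
try:
    ensure_same_embedder(BGE_M3, OPENAI_LARGE)
    mismatch_detected = False
except EmbeddingMismatch:
    mismatch_detected = True
```

Failing loudly at startup is cheaper than discovering later that similarity scores across two vector spaces were meaningless.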

🧩 Storage backends

Supports unified storage for all four components (KV, vector, graph, doc-status):

  • MongoDB (since 2025.02)
  • PostgreSQL (since 2025.01)
  • OpenSearch (since 2026.03)
  • Neo4j (graph storage since 2024.11)

🧩 HKUDS family ecosystem

| Project | What it adds |
| --- | --- |
| LightRAG | Base text RAG with KG |
| RAG-Anything | Multimodal: PDF, Office docs, images, tables, formulas |
| VideoRAG | Extreme long-context video RAG |
| MiniRAG | Simplified RAG for small models |

Since 2025.06 LightRAG integrates RAG-Anything for multimodal pipelines.

🧩 Observability and evaluation

Since 2025.11:

  • Langfuse integration for tracing
  • RAGAS evaluation with context precision metrics
  • API returns retrieved contexts alongside query results
  • Token usage tracking, KG export, LLM cache management
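For reference, context precision as I understand the RAGAS definition is the mean of precision@k taken at each rank k where a relevant context appears. The sketch below assumes the per-context relevance verdicts are already available as booleans (RAGAS obtains them from an LLM judge); the function name is mine, not the library's API.

```python
def context_precision(relevance: list) -> float:
    """Mean of precision@k over the ranks k that hold a relevant context.

    `relevance` lists one boolean per retrieved context, in ranked order.
    """
    hits = 0
    total = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            total += hits / rank  # precision@rank, counted only at relevant ranks
    return total / hits if hits else 0.0


good_ranking = context_precision([True, True, False])
poor_ranking = context_precision([False, True, True])
```

The metric rewards putting relevant contexts first: the same two relevant contexts score higher when they lead the list than when an irrelevant one sits on top, which is the kind of difference a reranker should produce.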

🧩 Paper results (LightRAG vs baseline across 4 domains)

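| Baseline | Agriculture | CS | Legal | Mix |
| --- | --- | --- | --- | --- |
| vs NaiveRAG | 67.6% | 61.6% | 83.6% | 61.2% |
| vs RQ-RAG | 68.4% | 61.2% | 84.8% | 60.8% |
| vs HyDE | 74.0% | 58.4% | 73.2% | 59.6% |
| vs GraphRAG | 54.4% | 51.6% | 51.6% | 49.6% |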

(Comprehensiveness: LightRAG's win rate, in %, against each baseline.) Against GraphRAG the margin is slim, and on Mix LightRAG falls just under 50%; against the rest it is solid.
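A quick way to read the table is to average each row and flag any cell under 50%, where the baseline actually wins. The numbers below are copied from the table; the helper names are my own.

```python
WIN_RATES = {
    "NaiveRAG": {"Agriculture": 67.6, "CS": 61.6, "Legal": 83.6, "Mix": 61.2},
    "RQ-RAG": {"Agriculture": 68.4, "CS": 61.2, "Legal": 84.8, "Mix": 60.8},
    "HyDE": {"Agriculture": 74.0, "CS": 58.4, "Legal": 73.2, "Mix": 59.6},
    "GraphRAG": {"Agriculture": 54.4, "CS": 51.6, "Legal": 51.6, "Mix": 49.6},
}


def mean_win_rate(baseline: str) -> float:
    """Average LightRAG win rate against one baseline, rounded to one decimal."""
    rates = WIN_RATES[baseline].values()
    return round(sum(rates) / len(rates), 1)


def losses() -> list:
    """(baseline, domain) pairs where LightRAG wins less than half the time."""
    return [
        (baseline, domain)
        for baseline, row in WIN_RATES.items()
        for domain, rate in row.items()
        if rate < 50.0
    ]


averages = {baseline: mean_win_rate(baseline) for baseline in WIN_RATES}
```

The averages come out at 68.5, 68.8, and 66.3 against NaiveRAG, RQ-RAG, and HyDE, but only 51.8 against GraphRAG, with GraphRAG on Mix as the single cell below 50%.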

✍️ Reasoning for

My use case #1 is a potential retrieval layer for Brain once the content/ folder grows past the point where grep + markdown indexes are enough. Today the agent (i.e. me) uses progressive disclosure via _indexes/vault-map.md → catalog.md → graph.md, which works up to ~500 notes. Beyond that I’ll want semantic search with KG awareness, and LightRAG looks like a reasonable foundation.

Use case #2: Qamera AI / AI Chatbots Architecture, i.e. chatbots where context spans many docs and naive vector search misses relations. There, mix mode with the reranker enabled should noticeably improve quality.

Weak points:

  • Model requirements (≥32B, 32KB context) rule out cheap indexing on small OSS LLMs
  • Embedding model lock-in (changing it = full reindex) makes a wrong first choice expensive
  • Indexing time grows linearly with corpus size (the LLM extracts entities from every document)
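To get a feel for the indexing cost, here is a back-of-the-envelope sketch. It assumes one extraction call per chunk and a chunk size of 1200 tokens with a 100-token overlap; those defaults are my recollection of LightRAG's settings and should be checked against the repo, and the function itself is hypothetical.

```python
import math


def extraction_calls(total_tokens: int, chunk_tokens: int = 1200, overlap_tokens: int = 100) -> int:
    """Estimate LLM extraction calls, assuming one call per overlapping chunk."""
    if total_tokens <= 0:
        return 0
    stride = chunk_tokens - overlap_tokens  # new tokens covered by each extra chunk
    return max(1, math.ceil((total_tokens - overlap_tokens) / stride))


small_vault = extraction_calls(1_000_000)
doubled_vault = extraction_calls(2_000_000)
```

Under these assumptions a one-million-token vault needs about 909 extraction calls and doubling the vault roughly doubles that, which is why a 32B-class model on every chunk adds up quickly.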

Alternatives considered

  • GraphRAG (Microsoft): similar idea with KG; per the paper LightRAG is marginally better and lighter
  • HyDE: generate a hypothetical answer, embed it, retrieve. Works, but in the paper it tanks on agriculture/legal
  • Naive RAG (BAAI/bge-m3 + simple top-k): enough for 80% of use cases, simpler, cheaper
  • MiniRAG: same family, for small models
  • Graphify: code/docs → queryable KG, but that’s a skill, not a full RAG framework

🔗 Resources

  • Citation: @article{guo2024lightrag, eprint={2410.05779}, primaryClass={cs.IR}, year={2024}}
  • Setup wizard docs: docs/InteractiveSetup.md (in repo)
  • Programming guide: docs/ProgramingWithCore.md
  • Offline deployment guide: docs/OfflineDeployment.md
  • Reproduce findings: docs/Reproduce.md

Template: tool