LlamaIndex (formerly GPT Index) is a data framework for LLM applications to ingest, structure, and access private or domain-specific data.
🚀 Why LlamaIndex?
At their core, LLMs offer a natural language interface between humans and data. Widely available models come pre-trained on huge amounts of publicly available data, from Wikipedia and mailing lists to textbooks and source code.
Applications built on top of LLMs often require augmenting these models with private or domain-specific data. Unfortunately, that data can be distributed across siloed applications and data stores. It’s behind APIs, in SQL databases, or trapped in PDFs and slide decks.
That’s where LlamaIndex comes in.
🦙 How can LlamaIndex help?
LlamaIndex provides the following tools:
- Data connectors ingest your existing data from their native source and format. These could be APIs, PDFs, SQL, and (much) more.
- Data indexes structure your data in intermediate representations that are easy and performant for LLMs to consume.
- Engines provide natural language access to your data. For example:
  - Query engines are powerful retrieval interfaces for knowledge-augmented output.
  - Chat engines are conversational interfaces for multi-message, “back and forth” interactions with your data (see the sketch after this list).
- Data agents are LLM-powered knowledge workers augmented by tools, from simple helper functions to API integrations and more.
- Application integrations tie LlamaIndex back into the rest of your ecosystem. This could be LangChain, Flask, Docker, ChatGPT, or… anything else!
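As a quick illustration of the engine layer, here is a minimal sketch that ingests a local folder with a data connector, structures it into an index, and converses with it through a chat engine. It assumes the llama-index 0.10+ package layout, a `./data` directory of documents, and an `OPENAI_API_KEY` in the environment for the default LLM and embedding model; the questions are placeholders.

```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

# Data connector: ingest local files (assumes a ./data directory exists).
documents = SimpleDirectoryReader("data").load_data()

# Data index: structure the documents into an LLM-friendly representation.
index = VectorStoreIndex.from_documents(documents)

# Chat engine: keeps conversation history across turns.
chat_engine = index.as_chat_engine()
print(chat_engine.chat("What topics do these documents cover?"))
print(chat_engine.chat("Summarize the most important one in two sentences."))
```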
👨‍👩‍👧‍👦 Who is LlamaIndex for?
LlamaIndex provides tools for beginners, advanced users, and everyone in between.
Our high-level API allows beginner users to use LlamaIndex to ingest and query their data in 5 lines of code.
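For example, the starter pattern looks roughly like this (again assuming the llama-index 0.10+ package layout, a `./data` folder, and an `OPENAI_API_KEY` set for the default models):

```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
print(query_engine.query("Summarize the key points of these documents."))
```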
For more complex applications, our lower-level APIs allow advanced users to customize and extend any module—data connectors, indices, retrievers, query engines, reranking modules—to fit their needs.
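As one sketch of that kind of customization, the snippet below swaps in an explicitly configured retriever and adds a similarity-cutoff postprocessor instead of relying on the defaults. Class names reflect the llama-index 0.10+ core package; the top-k, cutoff value, and query are arbitrary placeholders.

```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.retrievers import VectorIndexRetriever
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.postprocessor import SimilarityPostprocessor

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)

# Retrieve more candidate nodes than the default, then drop weak matches.
retriever = VectorIndexRetriever(index=index, similarity_top_k=5)
query_engine = RetrieverQueryEngine.from_args(
    retriever=retriever,
    node_postprocessors=[SimilarityPostprocessor(similarity_cutoff=0.7)],
    response_mode="tree_summarize",
)
print(query_engine.query("Which documents discuss pricing?"))
```

The same composition pattern applies across modules: each default component can be constructed explicitly and passed in where the high-level API would otherwise choose one for you.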