LOTUS Makes LLM-Powered Data Processing Fast and Easy

LOTUS is an LLM-powered query engine for processing text, documents, structured and unstructured data with AI.

Get Started in a Few Lines of Code

LOTUS provides an intuitive Python package and familiar Pandas-like API with LLM-powered semantic operators for advanced document processing and data analytics.


        # papers_df is a pandas DataFrame with a `research_paper` column
        summary = (
            papers_df.sem_filter("the {research_paper} has an open source repo")
            .sem_topk("the {research_paper} has the most ground-breaking ideas", K=20)
            .sem_agg("summarize the papers based on their {research_paper}")
        )

The Power of Semantic Operators

LOTUS implements the semantic operator model, a powerful and declarative programming model for AI-based document processing and data transformations.

Declarative AI-Based Programming

Specify your data and document processing logic with declarative, high-level LLM-powered operators. Then leave the rest to the query engine!
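To make the declarative idea concrete, here is a minimal plain-Python sketch of what a semantic filter does conceptually: it fills a natural-language template with each row's fields and keeps the rows that a model judges to be true. The names `sem_filter_sketch` and `keyword_judge` are hypothetical, and the keyword check only stands in for a real LLM call. This is an illustration, not how LOTUS is implemented.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]


def keyword_judge(claim: str) -> bool:
    """Stand-in for an LLM call (assumption): a real judge would query a model."""
    return "github.com" in claim.lower()


def sem_filter_sketch(
    rows: List[Row], template: str, judge: Callable[[str], bool]
) -> List[Row]:
    """Keep the rows for which the filled-in template is judged true."""
    kept = []
    for row in rows:
        claim = template.format(**row)  # substitute {column} placeholders
        if judge(claim):
            kept.append(row)
    return kept


# Hypothetical sample data for illustration only.
papers = [
    {"research_paper": "Paper A. Code: https://github.com/example/a"},
    {"research_paper": "Paper B. No artifacts released."},
]
open_source = sem_filter_sketch(
    papers, "the {research_paper} has an open source repo", keyword_judge
)
print(len(open_source))
```

The caller states only the condition in natural language; how each row is judged (which model, how many calls, in what order) is left to whatever sits behind `judge`.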

Highly Optimized LLM Execution

LOTUS automatically optimizes your LLM-powered data processing programs, delivering speedups of up to 400x.
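This page does not say which optimizations are applied. As one general illustration of how a query engine can reduce LLM cost, the sketch below shows a model cascade: a cheap proxy scores every item, and only the uncertain ones are sent to the expensive model. The function names, the thresholds, and both toy scoring functions are assumptions made for this example.

```python
from typing import Callable, List, Tuple


def cascade_filter(
    items: List[str],
    cheap_score: Callable[[str], float],
    expensive_judge: Callable[[str], bool],
    low: float = 0.2,
    high: float = 0.8,
) -> Tuple[List[str], int]:
    """Return (kept items, number of expensive calls made)."""
    kept: List[str] = []
    expensive_calls = 0
    for item in items:
        score = cheap_score(item)
        if score >= high:
            kept.append(item)  # confident accept, no expensive call
        elif score > low:
            expensive_calls += 1  # uncertain, defer to the expensive judge
            if expensive_judge(item):
                kept.append(item)
        # score <= low: confident reject, no expensive call
    return kept, expensive_calls


def toy_proxy(text: str) -> float:
    """Toy stand-in for a small, cheap model: fraction of keywords present."""
    words = ("open", "source", "repo")
    return sum(word in text.lower() for word in words) / len(words)


def toy_oracle(text: str) -> bool:
    """Toy stand-in for a large, expensive model."""
    return "source" in text.lower()


docs = [
    "Open source repo available",
    "Source code on request",
    "Closed proprietary system",
]
kept, calls = cascade_filter(docs, toy_proxy, toy_oracle)
print(kept, calls)
```

In this toy run only one of the three items reaches the expensive judge; the other two are decided by the proxy alone.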

Seamless Data Integration

Plug into your existing database, vector database, or document store. LLM-powered semantic operators seamlessly extend the relational model, making it easy for you to leverage your structured and unstructured document data together.
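As a rough illustration of a semantic operator sitting beside a relational one, the sketch below applies an ordinary structured predicate (a year cutoff) and then a semantic join driven by a natural-language template. The names `sem_join_sketch` and `overlap_judge`, the sample rows, and the substring-based judge standing in for an LLM are all assumptions for this example.

```python
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]


def overlap_judge(claim: str) -> bool:
    """Stand-in for an LLM call (assumption): true if the topic appears in the subject."""
    subject, _, topic = claim.rpartition(" is about ")
    return topic.lower() in subject.lower()


def sem_join_sketch(
    left: List[Row],
    right: List[Row],
    template: str,
    judge: Callable[[str], bool],
) -> List[Tuple[Row, Row]]:
    """Pair rows whose filled-in template is judged true (column names must not clash)."""
    pairs = []
    for l_row in left:
        for r_row in right:
            claim = template.format(**{**l_row, **r_row})
            if judge(claim):
                pairs.append((l_row, r_row))
    return pairs


# Hypothetical sample data for illustration only.
papers = [
    {"title": "Fast Vector Search at Scale", "year": 2024},
    {"title": "Query Optimization for LLM Pipelines", "year": 2023},
    {"title": "Vector Search Revisited", "year": 2019},
]
topics = [{"topic": "vector search"}, {"topic": "query optimization"}]

recent = [p for p in papers if p["year"] >= 2023]  # structured predicate
matches = sem_join_sketch(
    recent, topics, "the paper {title} is about {topic}", overlap_judge
)
result = [(p["title"], t["topic"]) for p, t in matches]
print(result)
```

The structured filter runs first, so the pairwise semantic comparison only sees rows that already passed the cheap relational check.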

LLM-Powered Document Processing Use Cases

LOTUS serves a diverse array of applications that need to process documents and data with AI. Here are some examples, each written as a short, intuitive LOTUS program.

Document Fact-Checking

LOTUS LLM-powered document processing programs reproduce and improve upon the accuracy of state-of-the-art fact-checking pipelines on the FEVER dataset, while optimizing execution to achieve 28x speedups.

Document ETL and Classification

LOTUS achieves state-of-the-art accuracy with a single semantic operator on the BioDEX dataset, which presents a complex medical document classification task. Under the hood, the LOTUS query engine automatically explores feasible execution plans to achieve 400x faster performance than the default.

Document Search and Ranking

LOTUS LLM-powered programs achieve 200% higher accuracy than state-of-the-art retrieval and re-ranking methods, while running with up to 10x lower execution time than the LM-based methods used in prior work.

Research Document Insights

Simple LOTUS programs process large sets of recent arXiv papers, allowing you to produce summaries, group the data by topic, and answer complex research questions.