About Wren AI and Our Mission

Wren AI is dedicated to reimagining how businesses interact with and leverage their data through LLMs, bringing data comprehension capabilities to data teams both small and large.

Why Now?

In the rapidly evolving data landscape, data analysts are the bridge between the data and the diverse business contexts within an organization. Different business units, each with unique perspectives and requirements, often seek specific insights from data, which makes the analyst's role both critical and challenging. Their ability to interpret, translate, and communicate data in a way that aligns with the distinct needs of various stakeholders is indispensable.

The advent of advanced technologies such as Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) is changing this space by augmenting the capabilities of business data analysts. RAG integrates retrieved external information into the generation step, enabling LLMs to produce more comprehensive and accurate answers.

With their understanding of context and natural language processing abilities, LLMs with RAG empower analysts to navigate and interpret vast datasets efficiently and with nuance.

Challenges of Using RAG with LLMs to Query Databases

Using RAG coupled with LLMs to query databases is not a new concept. Many solutions have been proposed, but they still face challenges in four crucial phases: context collection, retrieval, SQL generation, and collaboration.

Phase 1: Context Collection Challenges

  • Interoperability Across Diverse Sources: Information must be generalized and normalized so that it can be searched and integrated seamlessly across varied sources, metadata services, and APIs.
  • Complex Linking of Data and Metadata: Data must be associated with its metadata in a document store. This means storing metadata, schema, and context, such as relationships, calculations, and aggregations.
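
As a rough illustration of what linking data to its metadata can look like, the sketch below flattens one table's schema, relationships, and calculations into a single record ready for a document store. All names here (`Column`, `TableContext`, `to_document`) and the record layout are assumptions made for this example; they are not Wren AI's internal format.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    name: str
    type: str
    description: str = ""


@dataclass
class TableContext:
    """Hypothetical container for everything we know about one table."""
    name: str
    description: str
    columns: list
    relationships: list = field(default_factory=list)  # e.g. "orders.customer_id -> customers.id"
    calculations: dict = field(default_factory=dict)   # metric name -> SQL expression


def to_document(table: TableContext) -> dict:
    """Flatten schema, relationships, and calculations into one indexable record."""
    lines = [f"Table {table.name}: {table.description}"]
    for col in table.columns:
        lines.append(f"- {col.name} ({col.type}): {col.description}")
    for rel in table.relationships:
        lines.append(f"Relationship: {rel}")
    for metric, expr in table.calculations.items():
        lines.append(f"Calculation {metric} = {expr}")
    return {
        "id": table.name,
        "text": "\n".join(lines),
        "metadata": {"kind": "table", "columns": [c.name for c in table.columns]},
    }


orders = TableContext(
    name="orders",
    description="One row per customer order",
    columns=[Column("id", "INTEGER", "order key"), Column("amount", "REAL", "order total")],
    relationships=["orders.customer_id -> customers.id"],
    calculations={"revenue": "SUM(amount)"},
)
doc = to_document(orders)
```

Keeping the relationships and calculations in the same record as the schema means a single retrieval hit carries the context needed to write a join or an aggregate.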

Phase 2: Retrieval Challenges

  • Optimization of Vector Stores: Optimization techniques for vector stores, such as indexing and chunking, are critical for search efficiency and precision.
  • Precision in Semantic Search: The challenge lies in understanding the nuances of a query in its context, which can significantly affect the accuracy of the results. This usually involves techniques such as query rewriting and re-ranking.
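
The retrieval step can be sketched in a few lines. The example below is a deliberately simplified stand-in: it scores documents with bag-of-words cosine similarity instead of a real embedding model and vector store, and the function names (`tokens`, `cosine`, `search`) are hypothetical. It shows only the shape of the step: vectorize the question, score each indexed document, and keep the top k.

```python
import math
import re
from collections import Counter


def tokens(text: str) -> Counter:
    """Lowercase bag-of-words vector; a real system would use embeddings."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search(query: str, docs: dict, k: int = 2) -> list:
    """Return the ids of the k best-matching documents, best first."""
    q = tokens(query)
    scored = [(cosine(q, tokens(text)), doc_id) for doc_id, text in docs.items()]
    # Highest score first; ties broken by id so the order is deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in scored[:k] if score > 0]


docs = {
    "orders": "orders table with order_id customer_id amount and order date",
    "customers": "customers table with customer id name and country",
    "products": "products table with sku and price",
}
top = search("country of each customer", docs, k=1)
```

Exact word overlap misses synonyms and paraphrases, which is the gap that embeddings, query rewriting, and re-ranking are meant to close.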

Phase 3: SQL Generation Challenges

  • Accuracy and Executability of SQL Queries: Generating SQL queries that are both accurate and executable poses a significant challenge. This requires the LLM to have an in-depth understanding of SQL syntax, database schemas, and the specific dialects of different database systems.
  • Adaptation to Query Engine Dialects: Databases often have unique dialects and nuances in SQL implementation. Designing LLMs that can adapt to these differences and generate compatible queries across various systems adds another layer of complexity to the challenge.
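
One common way to catch non-executable SQL before it reaches the user is to compile the generated query against the schema without running it. The sketch below uses Python's built-in `sqlite3` module and an in-memory database as a stand-in for the target engine; `check_sql` is a hypothetical helper, and a real deployment would need the same kind of check against each engine's own dialect.

```python
import sqlite3
from typing import Optional, Tuple


def check_sql(schema_ddl: str, sql: str) -> Tuple[bool, Optional[str]]:
    """Compile `sql` against an empty copy of the schema; return (ok, error message)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_ddl)
        # EXPLAIN makes SQLite parse and plan the statement without running it,
        # so syntax errors and unknown tables or columns surface here.
        conn.execute(f"EXPLAIN {sql}")
        return True, None
    except (sqlite3.Error, sqlite3.Warning) as exc:
        return False, str(exc)
    finally:
        conn.close()


ddl = "CREATE TABLE orders (id INTEGER, amount REAL);"
ok, err = check_sql(ddl, "SELECT SUM(amount) FROM orders")
bad_ok, bad_err = check_sql(ddl, "SELECT total FROM orders")
```

The error message from a failed check can be fed back to the LLM as context for a corrected attempt.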

Phase 4: Collaboration Challenges

  • Collective Knowledge Accumulation: The challenge lies in creating a mechanism that can effectively gather, integrate, and utilize the collective insights and feedback from a diverse user base to enhance the accuracy and relevance of the data retrieved by the LLM.
  • Access Control: Once data can be retrieved, the next challenge is ensuring that existing organizational data access policies and privacy regulations also apply to the new LLM and RAG architecture.
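
A simple way to picture access control in a RAG pipeline is a filter that runs between retrieval and prompt construction, so the LLM never sees context the user is not allowed to read. The sketch below assumes a role-based policy attached to each document; `Doc`, `allowed_roles`, and `filter_by_access` are illustrative names, not part of Wren AI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    id: str
    text: str
    allowed_roles: frozenset  # roles permitted to read this document


def filter_by_access(docs: list, user_roles: set) -> list:
    """Keep only documents that share at least one role with the user."""
    return [d for d in docs if d.allowed_roles & user_roles]


retrieved = [
    Doc("orders", "orders table ...", frozenset({"analyst", "finance"})),
    Doc("salaries", "salaries table ...", frozenset({"hr"})),
]
visible = filter_by_access(retrieved, {"analyst"})
```

Filtering the retrieved context does not replace enforcement in the database itself; the generated SQL should still run under the user's own permissions.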

Introducing Wren AI

Wren AI was developed around a few core design philosophies.

  1. Indexing With Semantics

Wren AI has implemented a semantic engine architecture to give the LLM context about your business. You can easily establish a logical presentation layer over your data schema that helps the LLM learn more about your business context.

  2. Augment LLM Prompts

With Wren AI, you can process metadata, schema, terminology, data relationships, and the logic behind calculations and aggregations with “Modeling Definition Language” (MDL), reducing duplicate coding and simplifying data joins.

  3. Generate Insights

When you start a new conversation in Wren AI, your question is used to find the most relevant tables. From these, the LLM generates three relevant questions for you to choose from. You can also ask follow-up questions to get deeper insights.
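
The second half of this flow, turning the retrieved tables into suggested questions, can be pictured as a prompt-building step. The sketch below is an assumption about how such a prompt might be assembled; the function name `build_suggestion_prompt` and the prompt wording are hypothetical and are not taken from Wren AI.

```python
def build_suggestion_prompt(question: str, tables: dict, n: int = 3) -> str:
    """Assemble an LLM prompt asking for `n` candidate questions.

    `tables` maps a table name to its list of column names.
    """
    schema_lines = [f"- {name}({', '.join(cols)})" for name, cols in tables.items()]
    return "\n".join([
        "You are a data analyst assistant.",
        "Relevant tables:",
        *schema_lines,
        f"User question: {question}",
        f"Propose {n} specific questions that these tables can answer.",
        "Return one question per line.",
    ])


prompt = build_suggestion_prompt(
    "How are sales doing?",
    {"orders": ["id", "customer_id", "amount"], "customers": ["id", "country"]},
)
```

Listing only the retrieved tables keeps the prompt short and steers the suggestions toward questions the data can actually answer.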

  4. Self-Learning Feedback Loop (Coming Soon)

The AI self-learning feedback loop is designed to refine SQL augmentation and generation by collecting data from various sources. These include user query history, revision intentions, feedback, schema patterns, semantics enhancement, and query frequency.

What's Next?

As we continue implementing Wren AI, we also have ideas for areas to explore and improve in the near future.

  • Proactive Collaboration:
    • AI agents not only report progress in real time but actively seek feedback.
    • Agents ask human advisors (e.g., senior data analysts and data engineers) for guidance to clarify complex definitions or knowledge gaps.
  • Collective Knowledge Sharing:
    • Facilitates an ecosystem where insights and learning are shared, not siloed, across the organization, enriching the AI's knowledge base.