About Wren AI and Our Mission
Wren AI is an open-source SQL AI agent that empowers data, product, and business teams to access insights through chat. It offers a well-designed, intuitive UI and UX and integrates seamlessly with tools like Excel and Google Sheets.
Why Text-to-SQL Now?
In the rapidly evolving data landscape, data analysts play a pivotal role as the vital bridge between the data and the diverse business contexts within an organization. Different business units, each with unique perspectives and requirements, often seek specific insights from data, making the role of data analysts both critical and challenging. Their ability to interpret, translate, and communicate data in a way that aligns with the distinct needs of various stakeholders is indispensable.
The advent of advanced technologies such as Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) is revolutionizing this space by augmenting the capabilities of business data analysts. RAG supplements an LLM's prompt with information retrieved from external sources, enabling the LLM to generate more comprehensive and accurate answers.
With their understanding of context and natural language processing abilities, LLMs with RAG empower analysts to navigate and interpret vast datasets efficiently and with nuance.
Challenges of Using RAG with LLMs to Query Databases
Using RAG coupled with LLMs to query databases is not a new concept. Many solutions have been proposed to tackle this problem, but challenges remain unresolved in each of four crucial phases: context collection, retrieval, SQL generation, and collaboration.
Phase 1: Context Collection Challenges
- Interoperability Across Diverse Sources: Information must be generalized and normalized so that it can be searched and integrated seamlessly across varied sources, metadata services, and APIs.
- Complex Linking of Data and Metadata: Data must be associated with its metadata in a document store, which means storing metadata, schema, and context such as relationships, calculations, and aggregations.
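To make the linking challenge concrete, here is a minimal sketch of one way a table's schema, relationships, and calculated fields could be bundled into a single document before it is stored. The class names (`ColumnDoc`, `TableDoc`), the field names, and the example table are all assumptions made for illustration; they are not Wren AI's actual data model.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Optional


@dataclass
class ColumnDoc:
    name: str
    data_type: str
    description: str = ""
    expression: Optional[str] = None  # set only for calculated columns


@dataclass
class TableDoc:
    source: str  # illustrative label for where the table lives
    name: str
    description: str
    columns: List[ColumnDoc] = field(default_factory=list)
    relationships: List[Dict[str, str]] = field(default_factory=list)

    def to_document(self) -> str:
        """Serialize schema and context together as one JSON document."""
        return json.dumps(asdict(self), sort_keys=True)


# Hypothetical example table with a calculated column and one relationship.
orders = TableDoc(
    source="postgres",
    name="orders",
    description="One row per customer order.",
    columns=[
        ColumnDoc("order_id", "INTEGER", "Primary key"),
        ColumnDoc("amount", "DECIMAL", "Order total"),
        ColumnDoc("is_large", "BOOLEAN", "Large orders", expression="amount > 1000"),
    ],
    relationships=[
        {"to": "customers", "on": "orders.customer_id = customers.id", "type": "many_to_one"}
    ],
)
doc = orders.to_document()
```

Keeping the calculation (`expression`) and the join condition next to the schema means a retriever can hand the LLM everything it needs about `orders` in one lookup.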
Phase 2: Retrieval Challenges
- Optimization of Vector Stores: Developing and implementing optimization techniques for vector stores, such as indexing and chunking, is critical for enhancing search efficiency and precision.
- Precision in Semantic Search: The challenge lies in understanding the nuances of a query in its context, which can significantly affect the accuracy of the results. Addressing it usually involves techniques such as query rewriting and re-ranking.
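The retrieval steps named above (chunking, query rewriting, search, re-ranking) can be sketched in a few lines. This is a toy illustration only: a bag-of-words cosine similarity stands in for a real embedding model and vector store, and the synonym table, the table descriptions, and all function names are assumptions made for the example.

```python
import math
import re
from collections import Counter

# Hypothetical business-term to schema-term mapping used for query rewriting.
SYNONYMS = {"revenue": "amount", "clients": "customers"}


def tokenize(text):
    return re.findall(r"[a-z0-9_]+", text.lower())


def chunk(text, size=12, overlap=4):
    """Split text into overlapping windows of `size` words."""
    words = text.split()
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def rewrite(query):
    """Append schema terms for any business terms found in the query."""
    extra = [SYNONYMS[t] for t in tokenize(query) if t in SYNONYMS]
    return " ".join([query] + extra)


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, docs, k=2):
    """First pass: rank documents by cosine similarity and keep the top k matches."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in docs]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
    return [d for score, d in ranked if score > 0]


def rerank(query, candidates):
    """Second pass: prefer candidates that cover more distinct query terms."""
    terms = set(tokenize(query))
    return sorted(candidates, key=lambda d: len(terms & set(tokenize(d))), reverse=True)


docs = [
    "table orders columns order_id customer_id amount order_date",
    "table customers columns id name country signup_date",
    "table web_events columns event_id session_id url ts",
]
query = rewrite("total revenue per country for clients")
result = rerank(query, retrieve(query, docs))
```

Without the rewrite step, the question shares no tokens with the `orders` description, so that table would never be retrieved. This is the kind of gap that query rewriting is meant to close.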
Phase 3: SQL Generation Challenges
- Accuracy and Executability of SQL Queries: Generating SQL queries that are both accurate and executable poses a significant challenge. This requires the LLM to have an in-depth understanding of SQL syntax, database schemas, and the specific dialects of different database systems.
- Adaptation to Query Engine Dialects: Databases often have unique dialects and nuances in SQL implementation. Designing LLMs that can adapt to these differences and generate compatible queries across various systems adds another layer of complexity to the challenge.
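Both challenges can be illustrated with Python's built-in `sqlite3` module. The sketch below checks whether a generated query is executable by asking SQLite to plan it without running it, and shows one small dialect difference (row limiting with `LIMIT` versus `TOP`). The helper names, the dialect labels, and the sample table are assumptions for illustration; a production system would validate against the actual target engine.

```python
import sqlite3


def is_executable(conn, sql):
    """Return (ok, error). EXPLAIN QUERY PLAN compiles the query without running it."""
    try:
        conn.execute(f"EXPLAIN QUERY PLAN {sql}")
        return True, None
    except sqlite3.Error as exc:
        return False, str(exc)


def apply_limit(select_body, n, dialect):
    """Render a row limit in the target dialect (labels are illustrative)."""
    if dialect == "tsql":
        return f"SELECT TOP {n} {select_body}"
    return f"SELECT {select_body} LIMIT {n}"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")

body = "name FROM customers ORDER BY name"
good = apply_limit(body, 5, "sqlite")
wrong_dialect = apply_limit(body, 5, "tsql")
typo = "SELECT nmae FROM customers"

good_result = is_executable(conn, good)
dialect_result = is_executable(conn, wrong_dialect)
typo_result = is_executable(conn, typo)
```

The error string returned for a failed query can be fed back to the LLM as a correction hint, which is one common way to raise the share of executable queries.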
Phase 4: Collaboration Challenges
- Collective Knowledge Accumulation: The challenge lies in creating a mechanism that can effectively gather, integrate, and utilize the collective insights and feedback from a diverse user base to enhance the accuracy and relevance of the data retrieved by the LLM.
- Access Control: Once data can be retrieved, the next most important challenge is ensuring that existing organizational data access policies and privacy regulations also apply to the new LLM and RAG architecture.
Introducing Wren AI - A Complete Text-to-SQL Solution for Data and Business Teams
Wren AI was developed around the following core design philosophies.
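One way to picture the access-control requirement is a filter that sits between retrieval and prompt construction, so the LLM never sees schema the user is not allowed to query. This is a minimal sketch under assumed names (`TablePolicy`, `authorized_context`) and made-up roles and tables; it does not describe how Wren AI enforces permissions.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class TablePolicy:
    table: str
    allowed_roles: FrozenSet[str]


def authorized_context(
    candidates: Iterable[str],
    policies: Iterable[TablePolicy],
    user_roles: Iterable[str],
) -> List[str]:
    """Keep only retrieved tables the user may read; unknown tables are denied."""
    by_table = {p.table: p for p in policies}
    roles = set(user_roles)
    allowed = []
    for table in candidates:
        policy = by_table.get(table)
        if policy is not None and policy.allowed_roles & roles:
            allowed.append(table)
    return allowed


# Hypothetical policies and retrieval output.
policies = [
    TablePolicy("orders", frozenset({"analyst", "finance"})),
    TablePolicy("salaries", frozenset({"hr"})),
]
retrieved = ["orders", "salaries", "audit_log"]
visible = authorized_context(retrieved, policies, ["analyst"])
```

Denying tables that have no policy at all (here, `audit_log`) is a deliberate default, so a newly added data source stays hidden until someone grants access.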
1. Talk to Your Data in Any Language
Wren AI speaks your language, including English, German, Spanish, French, Japanese, Korean, Portuguese, Chinese, and more. Unlock valuable insights by asking Wren AI your business questions. It goes beyond surface-level data analysis to reveal meaningful information, and it simplifies getting answers on topics ranging from lead scoring templates to customer segmentation.
2. Semantic Indexing with a Well-Crafted UI/UX
Wren AI implements a semantic engine architecture to give the LLM context about your business. You can easily establish a logical presentation layer on top of your data schema that helps the LLM learn more about your business context.
3. Generate SQL Queries with Context
With Wren AI, you can describe metadata, schema, terminology, data relationships, and the logic behind calculations and aggregations with the “Modeling Definition Language”, reducing duplicate coding and simplifying data joins.
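The idea of defining calculations and joins once and reusing them can be sketched as follows. The dictionary layout and the `render_sql` helper are hypothetical and are not the actual Modeling Definition Language syntax; they only illustrate how a single semantic definition can remove repeated SQL.

```python
# Hypothetical semantic model: each measure, dimension, and join is defined once.
MODEL = {
    "table": "orders",
    "dimensions": {"country": "customers.country"},
    "measures": {"revenue": "SUM(orders.amount)"},
    "joins": {"customers": "orders.customer_id = customers.id"},
}


def render_sql(model, measure, dimension):
    """Expand a (measure, dimension) pair into SQL using the shared definitions."""
    dim_expr = model["dimensions"][dimension]
    measure_expr = model["measures"][measure]
    base = model["table"]
    dim_table = dim_expr.split(".")[0]
    join = ""
    if dim_table != base:
        # Add the join only when the dimension lives in another table.
        join = f" JOIN {dim_table} ON {model['joins'][dim_table]}"
    return (
        f"SELECT {dim_expr} AS {dimension}, {measure_expr} AS {measure} "
        f"FROM {base}{join} GROUP BY {dim_expr}"
    )


sql = render_sql(MODEL, "revenue", "country")
```

Because the join condition and the `revenue` formula live in one place, every generated query uses the same definitions, and a change to either is made once.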
4. Get Insights without Writing Code
When you start a new conversation in Wren AI, your question is used to find the most relevant tables. From these, the LLM generates three relevant questions for you to choose from. You can also ask follow-up questions to get deeper insights.
5. Easily Export and Visualize Your Data
Wren AI provides a seamless end-to-end workflow, enabling you to connect your data effortlessly with popular analysis tools such as Excel and Google Sheets. This way, your insights remain accessible, allowing for further analysis using the tools you know best.
What's Next?
Moving forward, while we're working on implementations of Wren AI, we also have some ideas on areas we could explore and improve in the near future.
- Proactive Collaboration:
  - AI agents not only report progress in real time but actively seek feedback.
  - Agents ask human advisors (e.g., senior data analysts and data engineers) for guidance to clarify complex definitions or knowledge gaps.
- Collective Knowledge Sharing:
  - Facilitates an ecosystem where insights and learning are shared, not siloed, across the organization, enriching the AI's knowledge base.