目录

Vanna 工作原理

使用检索增强来帮助您使用 LLM 为数据库生成准确的 SQL 查询。

Vanna 的工作过程分为两个简单步骤 - 在您的数据上训练 RAG“模型”,然后提出问题,这些问题将返回 SQL 查询,这些查询可以设置为在您的数据库上自动运行。

  • vn.train(…)

    在您的数据上训练 RAG“模型”。这些方法将添加到参考语料库。

  • vn.ask(…)

    问问题。这将使用参考语料库生成可以在您的数据库上运行的 SQL 查询。

例子

与您的 SQL 数据库聊天 📊。通过 RAG 使用 LLM 实现准确的文本到 SQL 生成 🔄。

ChromaDB & Ollama

from vanna.ollama import Ollama
from vanna.chromadb import ChromaDB_VectorStore


class MyVanna(ChromaDB_VectorStore, Ollama):
    def __init__(self, config=None):
        ChromaDB_VectorStore.__init__(self, config=config)
        Ollama.__init__(self, config=config)


vn = MyVanna(config={'model': 'qwen2:7b'})
vn.connect_to_sqlite('https://vanna.ai/Chinook.sqlite')

# Training
df_ddl = vn.run_sql("SELECT type, sql FROM sqlite_master WHERE sql is not null")

for ddl in df_ddl['sql'].to_list():
  vn.train(ddl=ddl)

# The following are methods for adding training data. Make sure you modify the examples to match your database.

# DDL statements are powerful because they specify table names, colume names, types, and potentially relationships
vn.train(ddl="""
    CREATE TABLE IF NOT EXISTS my-table (
        id INT PRIMARY KEY,
        name VARCHAR(100),
        age INT
    )
""")

# Sometimes you may want to add documentation about your business terminology or definitions.
vn.train(documentation="Our business defines OTIF score as the percentage of orders that are delivered on time and in full")

# You can also add SQL queries to your training data. This is useful if you have some queries already laying around. You can just copy and paste those from your editor to begin generating new SQL.
vn.train(sql="SELECT * FROM my-table WHERE name = 'John Doe'")
# At any time you can inspect what training data the package is able to reference
training_data = vn.get_training_data()

# You can remove training data if there's obsolete/incorrect information. 
vn.remove_training_data(id='1-ddl')

# ```## Asking the AI
# Whenever you ask a new question, it will find the 10 most relevant pieces of training data and use it as part of the LLM prompt to generate the SQL.
# ```python
vn.ask(question=...)

启动 UI

Launch the User Interface
from vanna.flask import VannaFlaskApp
app = VannaFlaskApp(vn)
app.run()

Qdrant & Ollama

from vanna.ollama import Ollama
from vanna.qdrant import Qdrant_VectorStore
from qdrant_client import QdrantClient


class MyVanna(Qdrant_VectorStore, Ollama):
    def __init__(self, config=None):
        Qdrant_VectorStore.__init__(self, config=config)
        Ollama.__init__(self, config=config)


qdrant_memory_client = QdrantClient(":memory:")

vn = MyVanna(
    config={
        'client': qdrant_memory_client, 
        'fastembed_model': 'BAAI/bge-small-zh-v1.5',
        'model': 'qwen2:7b'
    }
)

vn.connect_to_sqlite('https://vanna.ai/Chinook.sqlite')

df_ddl = vn.run_sql("SELECT type, sql FROM sqlite_master WHERE sql is not null")
for ddl in df_ddl['sql'].to_list():
    vn.train(ddl=ddl)

from vanna.flask import VannaFlaskApp
app = VannaFlaskApp(vn)
app.run()