[LlamaIndex Tutorial] 2. Storage module: how to use a custom vector database in LlamaIndex (with code)

Published: 2025-03-07 13:33

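The previous article split documents, embedded them, and persisted the result in just two lines of code. But what if we want to use a vector database of our own choosing?

0. Background

As a reminder, these are the two lines from the previous article that build the index and persist it: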

index = VectorStoreIndex.from_documents(documents)
# store it for later
index.storage_context.persist(persist_dir=PERSIST_DIR)
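Persisting only pays off if a later run reads the data back. The following is a minimal sketch of that round trip, assuming the default on-disk storage from the previous article. The names needs_build, load_or_build_index, data_dir and persist_dir are made up for illustration; StorageContext.from_defaults(persist_dir=...) and load_index_from_storage come from llama_index.core.

```python
from pathlib import Path


def needs_build(persist_dir: str) -> bool:
    """Return True when no persisted index directory exists yet."""
    return not Path(persist_dir).is_dir()


def load_or_build_index(data_dir: str, persist_dir: str):
    """Build and persist the index on the first run; reload it afterwards.

    Hypothetical helper: the function name and both parameters are
    illustrative and do not appear in the original article.
    """
    # Imported lazily so needs_build() stays usable without LlamaIndex installed.
    from llama_index.core import (
        SimpleDirectoryReader,
        StorageContext,
        VectorStoreIndex,
        load_index_from_storage,
    )

    if needs_build(persist_dir):
        # First run: read the documents, embed them, and write the index to disk.
        documents = SimpleDirectoryReader(data_dir).load_data()
        index = VectorStoreIndex.from_documents(documents)
        index.storage_context.persist(persist_dir=persist_dir)
        return index

    # Later runs: rebuild the storage context from disk and load the index.
    storage_context = StorageContext.from_defaults(persist_dir=persist_dir)
    return load_index_from_storage(storage_context)
```

Checking only whether the directory exists is a deliberately simple test; a partially written directory would still be treated as a valid index.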

Sometimes, though, we would rather use the vector database and embedding method we already rely on. The steps below use chromadb as the example.

1. Using a custom vector database in LlamaIndex

(1) Environment setup

Before writing any code, install the Chroma vector store integration for LlamaIndex.

pip install -U llama-index-vector-stores-chroma -i https://pypi.tuna.tsinghua.edu.cn/simple

(2) Create a chromadb client instance that persists its data to a local path

db = chromadb.PersistentClient(path="D:\\GitHub\\LEARN_LLM\\LlamaIndex\\vector_store\\chroma_db")

(3) Create a collection in the chroma database

chroma_collection = db.get_or_create_collection("quickstart")
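Steps (2) and (3) can be wrapped into one small helper. This is only a sketch: chroma_dir, open_collection and project_root are hypothetical names, and building the path with pathlib (instead of hand-escaped backslashes) is a convenience choice of this sketch, not something the original code does. PersistentClient, get_or_create_collection and count are existing chromadb calls.

```python
from pathlib import Path


def chroma_dir(project_root: str) -> Path:
    """Return <project_root>/vector_store/chroma_db without hand-written separators."""
    return Path(project_root) / "vector_store" / "chroma_db"


def open_collection(project_root: str, name: str = "quickstart"):
    """Open (or create) a persistent Chroma collection under the project root.

    Hypothetical helper: the function and parameter names are illustrative.
    """
    import chromadb  # imported lazily; requires the chromadb package

    path = chroma_dir(project_root)
    path.mkdir(parents=True, exist_ok=True)

    # The client stores its data under `path` and reuses it on later runs.
    db = chromadb.PersistentClient(path=str(path))

    # get_or_create_collection returns the existing collection if there is one.
    collection = db.get_or_create_collection(name)

    # count() reports how many embeddings the collection already holds.
    print(f"{name}: {collection.count()} stored embeddings")
    return collection
```

Because get_or_create_collection returns an existing collection when the name is already taken, calling the helper repeatedly keeps whatever was stored earlier.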

(4) Wrap chroma_collection in LlamaIndex's ChromaVectorStore, which turns it into a LlamaIndex VectorStore.

vector_store = ChromaVectorStore(chroma_collection=chroma_collection)

(5) Wrap the VectorStore in a StorageContext

storage_context = StorageContext.from_defaults(vector_store=vector_store)

(6) When creating the VectorStoreIndex, pass the custom storage_context from the previous step through the storage_context parameter of from_documents.

index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

The complete code is as follows:

import chromadb
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.vector_stores.chroma import ChromaVectorStore
from llama_index.core import StorageContext

# load some documents
documents = SimpleDirectoryReader("D:\\GitHub\\LEARN_LLM\\LlamaIndex\\data").load_data()

# initialize client, setting path to save data
db = chromadb.PersistentClient(path="D:\\GitHub\\LEARN_LLM\\LlamaIndex\\vector_store\\chroma_db")

# create collection
chroma_collection = db.get_or_create_collection("quickstart")

# assign chroma as the vector_store to the context
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# create your index
index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context
)

# create a query engine and query
query_engine = index.as_query_engine()
response = query_engine.query("什么是角色提示?")  # "What is role prompting?"
print(response)
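One thing to watch: as far as I can tell, every run of the script above reads and embeds the documents again and writes them into the same collection. A possible way around this, sketched below under that assumption, is to build from documents only when the collection is empty and otherwise wrap the stored embeddings with VectorStoreIndex.from_vector_store. The names should_reuse, build_or_reuse_index and data_dir are hypothetical and not part of the original code.

```python
def should_reuse(stored_count: int) -> bool:
    """Reuse the collection when it already holds at least one embedding."""
    return stored_count > 0


def build_or_reuse_index(chroma_collection, data_dir: str):
    """Return an index over the collection, embedding documents only when needed.

    Hypothetical helper: the function and parameter names are illustrative.
    """
    # Imported lazily so should_reuse() stays usable without LlamaIndex installed.
    from llama_index.core import (
        SimpleDirectoryReader,
        StorageContext,
        VectorStoreIndex,
    )
    from llama_index.vector_stores.chroma import ChromaVectorStore

    vector_store = ChromaVectorStore(chroma_collection=chroma_collection)

    if should_reuse(chroma_collection.count()):
        # Wrap the existing embeddings; no documents are read or embedded again.
        return VectorStoreIndex.from_vector_store(vector_store)

    # Empty collection: load the documents and store their embeddings in Chroma.
    documents = SimpleDirectoryReader(data_dir).load_data()
    storage_context = StorageContext.from_defaults(vector_store=vector_store)
    return VectorStoreIndex.from_documents(
        documents, storage_context=storage_context
    )
```

The reuse path only gives sensible results if queries are embedded with the same embedding model that was used when the documents were first stored.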

2. Summary

In this article we walked through the steps for using a custom vector database in LlamaIndex. To recap, the key points are: create a LlamaIndex VectorStore, wrap it in a StorageContext, and pass that StorageContext to the from_documents function of VectorStoreIndex.

3. References

· https://docs.llamaindex.ai/en/stable/understanding/storing/storing/


· Hi everyone, I'm 同学小张. I regularly share AI knowledge and hands-on case studies.
