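Building AI Agents with Retrieval Superpowers Using LlamaIndex, Claude-3.5 Sonnet, and MongoDB

Imagine an AI assistant that interacts with you seamlessly, retrieving information and completing tasks dynamically according to your needs. With the rise of agentic retrieval-augmented generation (RAG), that vision is becoming a reality.

In this article, we explore how to combine three tools, LlamaIndex, Claude-3.5 Sonnet, and MongoDB, to build an AI agent with strong retrieval capabilities.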

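Tool Integration

Here is how these tools fit together:

LlamaIndex: a data framework for LLM applications whose vector retrieval finds similar information by meaning rather than by keyword. It acts as the agent's "eyes", locating the most relevant data in a large body of information.

Claude-3.5 Sonnet: the large language model that processes the information retrieved through LlamaIndex and generates responses.

MongoDB: a NoSQL document database that stores and manages the knowledge base behind the agent. Its flexible document model accommodates many data types, which suits complex retrieval tasks. In this walkthrough it also serves as the vector store, through MongoDB Atlas Vector Search.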

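Benefits of the Integration

This combination offers several benefits:

Better information retrieval: LlamaIndex's vector search helps the agent retrieve the most relevant information, even for nuanced queries.

Dynamic task completion: Claude-3.5 Sonnet lets the agent analyze the retrieved data and take appropriate actions, making it an agent that can act on its own.

Scalability and flexibility: MongoDB's ability to handle large datasets lets the system grow as information needs increase.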
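To make "search by meaning" concrete, here is a minimal sketch of ranking by cosine similarity, the kind of comparison a vector search performs. The three-dimensional vectors and listing names are made up for illustration; real embeddings come from an embedding model and have hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, docs, k=2):
    """Return the k document names most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in docs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]

# Hypothetical toy embeddings (not produced by any real model).
toy_docs = {
    "beach house": [0.9, 0.1, 0.0],
    "city loft": [0.1, 0.9, 0.1],
    "seaside cabin": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]  # stands in for an embedded query such as "near the ocean"

print(top_k(query, toy_docs))
```

The two coastal listings rank above the city loft even though no keyword is shared with the query; only the direction of the vectors matters.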

Code Implementation

Let's walk through agentic RAG with LlamaIndex, Claude-3.5 Sonnet, and MongoDB.

Step 1: Install the libraries

!pip install --quiet llama-index  # main llamaindex library

!pip install --quiet llama-index-vector-stores-mongodb # mongodb vector database

!pip install --quiet llama-index-llms-anthropic # anthropic LLM provider

!pip install --quiet llama-index-embeddings-openai # openai embedding provider

!pip install --quiet pymongo pandas datasets # others

Step 2: Set environment variables

import os

os.environ["ANTHROPIC_API_KEY"] = ""

os.environ["HF_TOKEN"] = ""

os.environ["OPENAI_API_KEY"] = ""

# WARNING: Never commit API keys or sensitive information to public repositories
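Empty strings are fine as placeholders in a notebook, but a small check can catch missing keys before any API call is made. A minimal sketch, assuming the three variable names used above; `missing_keys` is a hypothetical helper, not part of any library.

```python
import os

REQUIRED_KEYS = ["ANTHROPIC_API_KEY", "HF_TOKEN", "OPENAI_API_KEY"]

def missing_keys(required, env):
    """Return the names in `required` that are unset or empty in `env`."""
    return [name for name in required if not env.get(name)]

# Example with a stand-in mapping (placeholder values, not real credentials).
example_env = {"ANTHROPIC_API_KEY": "placeholder-value", "HF_TOKEN": ""}
print(missing_keys(REQUIRED_KEYS, example_env))

# Checking the real environment works the same way.
unset = missing_keys(REQUIRED_KEYS, os.environ)
if unset:
    print("Set these before continuing:", ", ".join(unset))
```

In an interactive session, `getpass.getpass` from the standard library can prompt for each value without echoing it to the screen.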

LLM and embedding model configuration

from llama_index.embeddings.openai import OpenAIEmbedding

from llama_index.llms.anthropic import Anthropic

from llama_index.core import Settings

llm = Anthropic(model="claude-3-5-sonnet-20240620")

embed_model = OpenAIEmbedding(

    model="text-embedding-3-small", 

    dimensions=256,

    embed_batch_size=10, 

    api_key=os.environ["OPENAI_API_KEY"]

)

Settings.embed_model = embed_model

Settings.llm = llm

Step 3: Data loading and processing

from datasets import load_dataset

import pandas as pd

# https://huggingface.co/datasets/MongoDB/airbnb_embeddings

dataset = load_dataset("MongoDB/airbnb_embeddings", split="train", streaming=True)

dataset = dataset.take(4000)

# Convert the dataset to a pandas dataframe

dataset_df = pd.DataFrame(dataset)

# Dataset comes with embeddings created with OpenAI, but we will recreate new ones

dataset_df = dataset_df.drop(columns=['text_embeddings'])

dataset_df.head(5)
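With `streaming=True`, records are read lazily, so `take(4000)` avoids downloading the full dataset. The same idea in plain Python, as a sketch: `fake_stream` is a hypothetical generator standing in for the remote dataset, and the field names mirror the ones used above.

```python
from itertools import islice

def fake_stream():
    """Hypothetical stand-in for a streamed dataset: yields records one at a time."""
    n = 0
    while True:  # unbounded, like a source too large to load at once
        yield {"_id": n, "description": f"listing {n}", "text_embeddings": [0.0, 0.0]}
        n += 1

def take_and_strip(stream, n, drop_key):
    """Read only the first n records and drop one field from each."""
    return [
        {k: v for k, v in record.items() if k != drop_key}
        for record in islice(stream, n)
    ]

rows = take_and_strip(fake_stream(), 3, "text_embeddings")
print(rows)
```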

Step 4: Generate embeddings

import json

from llama_index.core import Document

from llama_index.core.schema import MetadataMode

documents_json = dataset_df.to_json(orient='records')

documents_list = json.loads(documents_json)

llama_documents = []

for document in documents_list:

    # Convert complex objects to JSON strings

    for field in ["amenities", "images", "host", "address", "availability", "review_scores", "reviews", "image_embeddings"]:

        document[field] = json.dumps(document[field])

    # Create a Document object

    llama_document = Document(

        text=document["description"],

        metadata=document,

        excluded_llm_metadata_keys=["_id", "transit", "minimum_nights", "maximum_nights", "cancellation_policy", "last_scraped", "calendar_last_scraped", "first_review", "last_review", "security_deposit", "cleaning_fee", "guests_included", "host", "availability", "reviews", "image_embeddings"],

        excluded_embed_metadata_keys=["_id", "transit", "minimum_nights", "maximum_nights", "cancellation_policy", "last_scraped", "calendar_last_scraped", "first_review", "last_review", "security_deposit", "cleaning_fee", "guests_included", "host", "availability", "reviews", "image_embeddings"],

        metadata_template="{key}=>{value}",

        text_template="Metadata: {metadata_str}\n-----\nContent: {content}",

    )

    llama_documents.append(llama_document)

# Observing input examples

print("\nThe LLM sees this: \n", llama_documents[0].get_content(metadata_mode=MetadataMode.LLM))

print("\nThe Embedding model sees this: \n", llama_documents[0].get_content(metadata_mode=MetadataMode.EMBED))
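The two `print` calls differ only in which metadata keys are excluded before the templates are applied. The sketch below approximates that assembly in plain Python; it is a simplification rather than LlamaIndex's own code, `render_document` is a hypothetical helper, and the newline separator between metadata entries is an assumption about the library default.

```python
def render_document(
    text,
    metadata,
    excluded_keys,
    metadata_template="{key}=>{value}",
    text_template="Metadata: {metadata_str}\n-----\nContent: {content}",
    separator="\n",  # assumed default separator between metadata entries
):
    """Format visible metadata with one template, then wrap it around the text."""
    visible = [
        metadata_template.format(key=key, value=value)
        for key, value in metadata.items()
        if key not in excluded_keys
    ]
    return text_template.format(metadata_str=separator.join(visible), content=text)

# Hypothetical record; "_id" is excluded, so it never reaches the model.
rendered = render_document(
    text="Cozy loft",
    metadata={"_id": 1, "name": "Loft", "price": 120},
    excluded_keys=["_id"],
)
print(rendered)
```

Excluding bulky or irrelevant fields this way keeps them in storage while keeping them out of the embedding input and the LLM prompt.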

from llama_index.core.node_parser import SentenceSplitter, SemanticSplitterNodeParser

from llama_index.core.schema import MetadataMode

from tqdm import tqdm

# semantic_splitter = SemanticSplitterNodeParser(

#     buffer_size=10, breakpoint_percentile_threshold=95, embed_model=embed_model

# )

base_splitter = SentenceSplitter(chunk_size=5000, chunk_overlap=200)

nodes = base_splitter.get_nodes_from_documents(llama_documents)

# Progress bar

pbar = tqdm(total=len(nodes), desc="Embedding Progress", unit="node")

for node in nodes:

    node_embedding = embed_model.get_text_embedding(

        node.get_content(metadata_mode=MetadataMode.EMBED)

    )

    node.embedding = node_embedding

    

    # Update the progress bar

    pbar.update(1)

# Close the progress bar

pbar.close()

print("Embedding process completed!")
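The loop above sends one embedding request per node. Grouping nodes into batches reduces the number of round trips, and LlamaIndex embedding classes provide a `get_text_embedding_batch` method that could be used for this. The sketch below shows only the batching logic in plain Python; `fake_embed_batch` is a hypothetical stand-in for a real embedding call.

```python
def batched(items, batch_size):
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

def fake_embed_batch(texts):
    """Hypothetical embedder: returns one tiny vector per input text."""
    return [[float(len(t))] for t in texts]

def embed_all(texts, batch_size=10):
    """Embed every text, calling the embedder once per batch."""
    embeddings = []
    calls = 0
    for batch in batched(texts, batch_size):
        embeddings.extend(fake_embed_batch(batch))
        calls += 1
    return embeddings, calls

texts = [f"node {i}" for i in range(25)]
vectors, calls = embed_all(texts, batch_size=10)
print(len(vectors), calls)
```

Here 25 texts need three calls instead of 25, and the output order still matches the input order, so each vector can be assigned back to its node by position.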

Step 5: MongoDB setup

import pymongo

os.environ["MONGO_URI"] = ""

def get_mongo_client(mongo_uri):

    """Establish and validate connection to the MongoDB."""

    

    client = pymongo.MongoClient(mongo_uri, appname="devrel.showcase.python")

    # Validate the connection

    ping_result = client.admin.command('ping')

    if ping_result.get('ok') == 1.0:

        # Connection successful

        print("Connection to MongoDB successful")

        return client

    else:

        print("Connection to MongoDB failed")

    return None

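# Read the connection string set above; `mongo_uri` was previously undefined
mongo_uri = os.environ["MONGO_URI"]

mongo_client = get_mongo_client(mongo_uri)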

DB_NAME = "airbnb"

COLLECTION_NAME = "listings_reviews"

db = mongo_client.get_database(DB_NAME)

collection = db.get_collection(COLLECTION_NAME)

Step 6: Vector database integration

from llama_index.vector_stores.mongodb import MongoDBAtlasVectorSearch

vector_store = MongoDBAtlasVectorSearch(

    mongo_client, 

    db_name=DB_NAME, 

    collection_name=COLLECTION_NAME, 

    index_name="vector_index"

)

vector_store.add(nodes)
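The vector store refers to an Atlas Vector Search index named `vector_index`, which has to exist on the collection before queries can return results. A sketch of a matching index definition follows, under two assumptions: that embeddings are stored in a field named `embedding` (believed to be the vector store's default, but worth checking against the installed version), and that cosine similarity is wanted. `numDimensions` has to equal the `dimensions=256` configured for the embedding model.

```python
import json

def vector_index_definition(path="embedding", num_dimensions=256, similarity="cosine"):
    """Build an Atlas Vector Search index definition as a plain dict."""
    return {
        "fields": [
            {
                "type": "vector",
                "path": path,                     # assumed field holding the embedding
                "numDimensions": num_dimensions,  # must match the embedding model output
                "similarity": similarity,         # "cosine", "euclidean", or "dotProduct"
            }
        ]
    }

definition = vector_index_definition()
print(json.dumps(definition, indent=2))
```

The printed JSON can be pasted into the Atlas UI's index editor. Recent PyMongo versions can also create the index programmatically with `collection.create_search_index` and a `SearchIndexModel` of type `vectorSearch`.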

Step 7: Create the retrieval tool and agent

from llama_index.core import VectorStoreIndex

from llama_index.core.tools import QueryEngineTool, ToolMetadata

from llama_index.core.agent import FunctionCallingAgentWorker

index = VectorStoreIndex.from_vector_store(vector_store)

query_engine = index.as_query_engine(similarity_top_k=5, llm=llm)

query_engine_tool = QueryEngineTool(

    query_engine=query_engine,

    metadata=ToolMetadata(

        name="knowledge_base",

        description=(

            "Provides information about Airbnb listings and reviews. "

            "Use a detailed plain text question as input to the tool."

        ),

    ),

)

agent_worker = FunctionCallingAgentWorker.from_tools(

    [query_engine_tool], llm=llm, verbose=True

)

agent = agent_worker.as_agent()

response = agent.chat("Tell me the best listing for a place in New York")

print(str(response))
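Conceptually, a function-calling agent lets the model choose a tool by name and then runs that tool with the model's arguments. The toy loop below illustrates only that dispatch step; it is not LlamaIndex's implementation, and `fake_model_decision` and `fake_knowledge_base` are hypothetical stand-ins for the LLM and the query engine.

```python
def fake_knowledge_base(question):
    """Hypothetical tool: pretends to query the listings index."""
    return f"Top listing for: {question}"

# Registry mapping tool names (as the model would see them) to callables.
TOOLS = {"knowledge_base": fake_knowledge_base}

def fake_model_decision(user_message):
    """Hypothetical stand-in for the LLM's tool-call output."""
    return {"tool": "knowledge_base", "input": user_message}

def run_agent_step(user_message, tools):
    """Ask the 'model' which tool to call, then dispatch to it by name."""
    decision = fake_model_decision(user_message)
    tool = tools.get(decision["tool"])
    if tool is None:
        raise KeyError(f"Unknown tool: {decision['tool']}")
    return tool(decision["input"])

answer = run_agent_step("a place in New York", TOOLS)
print(answer)
```

This is why the tool's `name` and `description` matter: they are what the model reads when deciding whether and how to call the tool.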

Conclusion

Agentic RAG built on LlamaIndex, Claude-3.5 Sonnet, and MongoDB opens up many possibilities.
