MongoDB Developer Day Belo Horizonte
Well, this article will not be very technical, just some comments about an event I was invited to participate in. As the title of the event already says, it's an entire day working with MongoDB, with a mix of AI and MongoDB.
The good news is that it was a hands-on experience. 🥳
🤪 This "article" will not have a good structure; I think it's better to call it my notes. Some free-form notes that I took during the presentations.
Vector Search: Beginner to Pro
What is vector search? It's used to do semantic search, which means searching by context and meaning, not only matching the text itself.
The differences between the searches:
| Lexical search | Vector search |
|---|---|
| Keyword search | Semantic similarities |
| Corpus closely matches how users search | "Vocabulary gap" between corpus and how users search |
| First pass at text-based relevancy | Text, image, audio, video search |
💡 Embeddings: numeric, multi-dimensional representation of a piece of information.
The data is passed through an embedding model (a machine learning model) to generate a vector for this information.
We used Voyage AI as the embedding tool at this event. It's owned by MongoDB.
You need to choose the function used to compare vectors in the search. Some options are: Euclidean distance between the endpoints of the vectors, dot product (vector multiplication as a measure of alignment), or cosine similarity. Cosine similarity is the most used in vector search because it measures the angle between vectors, regardless of their magnitude.
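To make the three measures concrete, here is a minimal pure-Python sketch. The function names and the tiny 2-dimensional vectors are my own illustration, not something shown at the event; real embeddings have hundreds or thousands of dimensions.

```python
import math

def dot(a, b):
    """Dot product: sum of the pairwise products."""
    return sum(x * y for x, y in zip(a, b))

def euclidean(a, b):
    """Straight-line distance between the points where the two vectors end."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine(a, b):
    """Cosine of the angle between the vectors; their lengths cancel out."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

query = [1.0, 0.0]
same_direction = [3.0, 0.0]   # same direction as the query, larger magnitude
orthogonal = [0.0, 2.0]       # perpendicular to the query

print(cosine(query, same_direction))     # 1.0
print(cosine(query, orthogonal))         # 0.0
print(euclidean(query, same_direction))  # 2.0
print(dot(query, same_direction))        # 3.0
```

Note how the cosine says the first two vectors are identical in direction, while the Euclidean distance and the dot product are both affected by the magnitude.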
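One way to improve performance is pre-filtering: the filter runs first, so the vector search works on a smaller dataset. There is another way, post-filtering, which filters only after the vector search has run, but it's not a good technique, since the filter can throw away most of the results the search just returned.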
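As a rough sketch of what pre-filtering can look like, the function below builds an aggregation pipeline with the `$vectorSearch` stage and its `filter` option. The index name, the field names (`embedding`, `genre`, `title`) and the `numCandidates` multiplier are assumptions made up for this example, and the filtered field would also need to be declared as a filter field in the vector search index.

```python
def build_prefiltered_search(query_vector, genre, limit=5):
    """Build a pipeline that narrows the documents first, then runs vector search."""
    return [
        {
            "$vectorSearch": {
                "index": "vector_index",       # hypothetical index name
                "path": "embedding",           # hypothetical field holding the vectors
                "queryVector": query_vector,
                "numCandidates": limit * 20,   # assumed multiplier, tune for your data
                "limit": limit,
                # Pre-filter: only documents matching this condition are searched.
                "filter": {"genre": {"$eq": genre}},
            }
        },
        # Keep only the title and the similarity score in the output.
        {"$project": {"_id": 0, "title": 1, "score": {"$meta": "vectorSearchScore"}}},
    ]

pipeline = build_prefiltered_search([0.1, 0.2, 0.3], "drama", limit=3)
print(pipeline[0]["$vectorSearch"]["filter"])
```

With a driver such as PyMongo, the resulting list would be passed to the collection's `aggregate` method.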
It is also possible to shrink the size of the embedding data on MongoDB by choosing a quantization. Scalar quantization converts each value from float32 to int8. Binary quantization shrinks it even more, using a single bit per dimension.
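To get a feeling for what quantization does to a vector, here is a small pure-Python sketch. It only illustrates the idea: the min/max scaling and the "positive means 1" rule are my assumptions, not a description of how MongoDB implements it internally.

```python
def scalar_quantize(vector):
    """Map each float onto an integer in [-128, 127] using the vector's own min and max."""
    lo, hi = min(vector), max(vector)
    span = (hi - lo) or 1.0   # avoid dividing by zero when all values are equal
    return [round((x - lo) / span * 255) - 128 for x in vector]

def binary_quantize(vector):
    """Keep a single bit per dimension: 1 for positive values, 0 otherwise."""
    return [1 if x > 0 else 0 for x in vector]

embedding = [-1.0, -0.2, 0.6, 1.0]
print(scalar_quantize(embedding))  # [-128, -26, 76, 127]
print(binary_quantize(embedding))  # [0, 0, 1, 1]
```

Each step loses precision, which is the price paid for the smaller storage size.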
Links for vector search:
- How to Choose the Best Embedding Model for Your LLM Application
- HNSW Deepdive
- Asymmetric retrieval using Voyage 4
- Quantization
Building RAG Applications with MongoDB
Retrieval-augmented generation (RAG) is a technique to enhance the quality of pre-trained LLM generation using information retrieved from external sources. It's a way to use AI backed by a knowledge base.
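The retrieve-then-generate flow can be sketched in a few lines. This is a toy version under loud assumptions: the word-overlap ranking stands in for a real vector search, the knowledge base is three made-up sentences, and the final call to an LLM is left out, so the sketch stops at the prompt.

```python
def retrieve(question, knowledge_base, k=2):
    """Rank documents by words shared with the question (a stand-in for vector search)."""
    question_words = set(question.lower().split())
    scored = [
        (len(question_words & set(doc.lower().split())), doc)
        for doc in knowledge_base
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]

def build_prompt(question, context_docs):
    """Place the retrieved context in front of the question before sending it to an LLM."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

knowledge_base = [
    "Vector search finds documents by semantic similarity.",
    "Chunking splits large texts into smaller segments.",
    "Belo Horizonte is the capital of Minas Gerais.",
]
question = "What does chunking do to large texts?"
docs = retrieve(question, knowledge_base, k=1)
prompt = build_prompt(question, docs)
print(prompt)
```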
The current top tier is AI agents: they have the knowledge base, but also have access to tools. The agent can execute some actions, for example: call ls on my terminal.
Simple tasks don't require RAG; most pre-trained models can handle those kinds of tasks. RAG becomes useful for more specific tasks, where the model needs to understand my data better.
Chunking: the process of breaking down large pieces of text into smaller segments or chunks.
Chunking strategies:
- Fixed tokens: each chunk is limited to a fixed number of tokens;
  - it's also possible to use it with overlap;
- Recursive with overlap: it also considers paragraphs, punctuation, etc.;
- Semantic chunking: reads the whole text to make sure each chunk makes sense (not the most used).
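The first strategy, fixed tokens with overlap, is simple enough to sketch by hand. In this example a whitespace split stands in for a real tokenizer, and the function name and default sizes are my own choices.

```python
def chunk_with_overlap(text, chunk_size=5, overlap=2):
    """Split text into chunks of chunk_size tokens, repeating overlap tokens between neighbors."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    tokens = text.split()            # whitespace split stands in for a real tokenizer
    step = chunk_size - overlap      # how far the window moves each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break                    # this chunk already reaches the end of the text
    return chunks

text = "one two three four five six seven eight"
chunks = chunk_with_overlap(text)
for chunk in chunks:
    print(chunk)
# one two three four five
# four five six seven eight
```

The repeated "four five" is the overlap: it keeps a bit of shared context between neighboring chunks, so a sentence cut in the middle is not completely lost.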
Contextualized embeddings create a relationship between the chunks, making it possible for the LLM to get the full context.
‼️ Always, always check that you are blocking anything the application is not prepared to answer. Keep a guardrail so the answer does not go off track.
To continue building a RAG application, it's important to persist this context. You need to understand which level of persistence you will need.
Links:
The A to Z of Building AI Agents
The idea is to rebuild the MongoDB chatbot. It has some features: answering questions based on the knowledge base and summarizing the content.
First things first, what is an AI agent? An AI agent is a system that uses an LLM to reason through a problem, create a plan to solve it, and execute and iterate on the plan with the help of a set of tools. The agent chooses which tool to use, when to call it, and how to call it.
Components of an agent:
- Perception: collects information about the environment, as an input;
- Plan and reason: works out how to solve a problem and elaborates a plan to solve it; this is the LLM, also known as the agent's brain;
  - Plan without feedback (Chain-of-Thought): the LLM does not modify its execution plan;
  - Plan with feedback: the LLM makes adjustments to the execution plan;
    - ReAct (Reason + Act)
    - Reflection
- Tools (actions): external interfaces to act and solve a problem, for example: getting info from Wikipedia, database vector search;
- Memory: helps the agent understand and learn from the past.
  - Short term: a specific chat;
  - Long term: across multiple chats.
LangGraph was the library used to orchestrate the actions between the agents. A graph has only three components:
- Nodes (logic): two nodes, one for the agent and another for the tools;
- Edges: define the workflow;
  - Fixed:
  - Conditional: like a decision tree;
  - Non-conditional:
  - Loops: return to a specific node using custom logic.
- State: data structure shared between the nodes, holding the relevant data to execute the task.
[START] -> [Agent Node] -> (conditional) -> [Tool Node] -> [Agent Node] -> ... -> [END]
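To show how nodes, edges and state fit together in the flow above, here is a plain-Python imitation of that graph. It does not use the LangGraph API at all: the fake agent logic, the `search` tool and every name in it are hypothetical, and a real agent node would call an LLM instead of the hard-coded `if`.

```python
def agent_node(state):
    """Pretend LLM: ask for a tool until a tool result is in the state, then answer."""
    if state["tool_result"] is None:
        state["next_action"] = "search"    # the agent decides it needs a tool
    else:
        state["answer"] = f"Found: {state['tool_result']}"
        state["next_action"] = None        # nothing else to do
    return state

def tool_node(state):
    """Run the requested tool and store its output in the shared state."""
    tools = {"search": lambda query: f"docs about {query}"}   # hypothetical tool
    state["tool_result"] = tools[state["next_action"]](state["question"])
    return state

def route(state):
    """Conditional edge: go to the tool node, or finish."""
    return "tools" if state["next_action"] else "END"

def run_graph(question, max_steps=10):
    """Walk the graph: START -> agent -> (conditional) -> tools -> agent -> ... -> END."""
    state = {"question": question, "tool_result": None, "next_action": None, "answer": None}
    node = "agent"
    for _ in range(max_steps):             # guard against endless loops
        if node == "agent":
            state = agent_node(state)
            node = route(state)            # conditional edge
        elif node == "tools":
            state = tool_node(state)
            node = "agent"                 # fixed edge back to the agent
        if node == "END":
            break
    return state

result = run_graph("vector search")
print(result["answer"])  # Found: docs about vector search
```

The state dictionary is the only thing the nodes share, which is the same idea as the State component described above.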
Links:
That's all folks! Just to keep one more link here, you can check the MongoDB AI Learning HUB.