Building a RAG system just got a lot easier. Google's File Search tool for the Gemini API now handles the heavy lifting of connecting LLMs to your data. Chunking, embedding, and indexing are all managed for you. And with the latest update, it has gone multimodal: you can search through both text and images in a single pipeline, with custom metadata filtering and page-level citations built in. In this guide, we'll walk through how File Search works and implement it with practical examples.
What Does File Search Do?
File Search helps Gemini access and use information from your data sources, such as reports, documents, research papers, code, and private knowledge bases.
When you upload a file, Gemini breaks it into smaller pieces called "chunks" and creates embeddings for them. These embeddings are numerical representations that capture the meaning of the content, helping Gemini understand the context. They are then stored in a File Search Store for easy retrieval.
When you ask a question, Gemini searches the stored embeddings for the most relevant chunks and uses them as context to generate answers. This is the essence of Retrieval Augmented Generation (RAG).
Gemini File Search goes beyond just text. It also supports multimodal RAG, allowing text and images to be indexed and searched together. This means you can retrieve information from PDFs, images, charts, screenshots, and more using natural language queries.
For multimodal tasks, Gemini uses gemini-embedding-2 for image and multimodal embeddings, while gemini-embedding-001 handles text embeddings. Note that audio and video formats are not supported yet.
How Does File Search Work?
File Search is powered by semantic vector search. Instead of matching on words directly, it finds information based on meaning and context. This means File Search can find relevant information even when the wording of the query differs from the wording of the document.
Here's how it works, step by step:
- Upload a file
The file is broken up into smaller sections called "chunks."
- Embedding generation
Each chunk is transformed into a numerical vector that represents the meaning of that chunk.
- Storage
The embeddings are stored in a File Search Store, an embedding store designed specifically for retrieval.
- Query
When a user asks a question, File Search transforms that question into an embedding.
- Retrieval
The retrieval step compares the question embedding with the stored embeddings and finds the most similar chunks (if any).
- Grounding
Relevant chunks are added to the prompt sent to the Gemini model so that the answer is grounded in the factual data from the documents.
This entire process is handled by the Gemini API. The developer doesn't have to manage any additional infrastructure or databases.
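To make the six steps concrete, here is a minimal local sketch of the same flow. It is not how File Search is implemented: the embed function is a toy word-count stand-in for a real embedding model, so it only matches on shared words, whereas real embeddings capture meaning. All function names here are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_store(document):
    """Steps 1-3: split into sentence-sized chunks, embed each, keep both."""
    chunks = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    return [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(store, question, top_k=1):
    """Steps 4-5: embed the question and rank chunks by similarity."""
    q = embed(question)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


def grounded_prompt(store, question):
    """Step 6: prepend the retrieved chunks to the question."""
    context = "\n".join(retrieve(store, question))
    return f"Context:\n{context}\n\nQuestion: {question}"


document = (
    "The store keeps embeddings until you delete them. "
    "Images must be PNG or JPEG. "
    "Each file can be up to 100 MB."
)
store = build_store(document)
print(grounded_prompt(store, "Can images be JPEG?"))
```

With File Search, every one of these functions is replaced by the managed service; the sketch only shows where each step sits in the flow.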
Setup Requirements
To use the File Search tool, developers need a few basic components: Python 3.9 or newer, the google-genai client library, and a valid Gemini API key that has access to either gemini-2.5-pro or gemini-2.5-flash.
Install the client library by running:
pip install google-genai -U
Then, set the environment variable for your API key:
export GOOGLE_API_KEY="your_api_key_here"
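Inside a script, it can help to fail early with a clear message when the variable is missing. A minimal sketch, where load_api_key is a hypothetical helper and not part of google-genai:

```python
import os


def load_api_key(env=None, var_name="GOOGLE_API_KEY"):
    """Return the API key from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set; export it before running the script.")
    return key


# Example with an explicit mapping instead of the real environment.
print(load_api_key({"GOOGLE_API_KEY": "your_api_key_here"}))
```

Passing a plain dictionary makes the helper easy to test without touching the real environment.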
Creating a File Search Store
A File Search Store is where Gemini stores and indexes embeddings created from your uploaded files. Once a file is uploaded and indexed, the indexed data stays available for retrieval until you manually delete it.
For text-only RAG, you can create a regular File Search Store. For multimodal RAG, where you want to upload and search both documents and images, create the store with models/gemini-embedding-2.
from google import genai
from google.genai import types
import time
import os
from pathlib import Path

# Don't hardcode your API key in the notebook.
# Read it from the GOOGLE_API_KEY environment variable instead.
client = genai.Client(api_key=os.environ["GOOGLE_API_KEY"])

file_search_store = client.file_search_stores.create(
    config={
        "display_name": "my_multimodal_rag_store",
        "embedding_model": "models/gemini-embedding-2"
    }
)
print("File Search Store created:", file_search_store.name)
The embedding_model setting matters because the official docs show embedding_model set to models/gemini-embedding-2 when creating a File Search Store for multimodal use.
Upload a File
After the File Search Store is created, you can upload files to it. When a file is uploaded, Gemini File Search automatically chunks the content, generates embeddings, and indexes it for fast retrieval.
For text-based RAG, File Search supports documents such as PDF, DOCX, TXT, and JSON, as well as programming files like .py and .js.
For multimodal RAG, File Search also supports image files. This means you can upload documents and images into the same File Search Store and ask questions that require both textual and visual context. For example, you can upload a research paper, a product image, and a chart, then ask Gemini to summarize the paper and explain the related visual information.
For image uploads, make sure the File Search Store is created with models/gemini-embedding-2. According to the official documentation, supported image formats are PNG and JPEG. Image files must be at most 4K x 4K pixels, and a request can include a maximum of 6 images.
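A quick local check before uploading can catch images that would break these limits. The sketch below is illustrative only: the helper names are hypothetical, the caller supplies the pixel dimensions, and "4K" is assumed to mean 4096 pixels, which the text above does not confirm.

```python
from pathlib import Path

SUPPORTED_IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}
MAX_IMAGE_SIDE = 4096  # assumption: "4K" read as 4096 pixels
MAX_IMAGES_PER_REQUEST = 6


def image_problems(path, width, height):
    """Return a list of reasons the image would violate the stated limits."""
    problems = []
    if Path(path).suffix.lower() not in SUPPORTED_IMAGE_SUFFIXES:
        problems.append("format must be PNG or JPEG")
    if width > MAX_IMAGE_SIDE or height > MAX_IMAGE_SIDE:
        problems.append(f"dimensions exceed {MAX_IMAGE_SIDE} x {MAX_IMAGE_SIDE}")
    return problems


def batches(paths, size=MAX_IMAGES_PER_REQUEST):
    """Split image paths into groups no larger than the per-request limit."""
    return [paths[i:i + size] for i in range(0, len(paths), size)]


print(image_problems("/content/product_image.jpg", 1920, 1080))
print(image_problems("/content/diagram.gif", 5000, 300))
```

An empty list means the image passes both checks; the batches helper is one way to keep any single request at six images or fewer.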
Upload a Document File
# Upload and import a document into the File Search Store.
# The display name will be visible in citations.
operation = client.file_search_stores.upload_to_file_search_store(
    file="/content/Paper2Agent.pdf",
    file_search_store_name=file_search_store.name,
    config={
        "display_name": "Paper2Agent.pdf",
    }
)

# Wait until the import is complete.
while not operation.done:
    time.sleep(5)
    operation = client.operations.get(operation)

print("Document successfully uploaded and indexed.")
After this step, the document is chunked, embedded, indexed, and ready for retrieval.
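The polling loop in the upload snippet runs until the import finishes, with no upper bound. One possible variation adds a timeout. This is a sketch with a hypothetical helper name: refresh is whatever call returns the latest operation state (the polling call used in the upload snippet would be passed in here), and the default timeout is an arbitrary choice.

```python
import time


def wait_for_operation(operation, refresh, poll_seconds=5.0, timeout_seconds=600.0):
    """Poll until operation.done is true, or raise TimeoutError.

    `refresh` takes the current operation and returns its latest state.
    """
    deadline = time.monotonic() + timeout_seconds
    while not operation.done:
        if time.monotonic() >= deadline:
            raise TimeoutError("import did not finish before the timeout")
        time.sleep(poll_seconds)
        operation = refresh(operation)
    return operation
```

Because the helper only relies on a done attribute and a refresh callable, it can be exercised with a fake operation object before being pointed at the real API.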
Upload an Image File for Multimodal Retrieval
You can also upload an image file to the same File Search Store. This is useful when your application needs to retrieve information from product images, screenshots, charts, diagrams, or other visual content.
# Upload an image file for multimodal retrieval.
operation = client.file_search_stores.upload_to_file_search_store(
    file="/content/product_image.jpg",
    file_search_store_name=file_search_store.name,
    config={
        "display_name": "product_image.jpg",
    }
)

# Wait until the import is complete.
while not operation.done:
    time.sleep(5)
    operation = client.operations.get(operation)

print("Image successfully uploaded and indexed.")
Once the image is indexed, Gemini can retrieve it during File Search when the user's query is relevant to the image.
Upload Multiple Documents and Images
In real-world applications, you may want to upload several files at once. These files can include both text documents and images.
from pathlib import Path
import time

files_to_upload = [
    "/content/Paper2Agent.pdf",
    "/content/product_image.jpg",
    "/content/sales_chart.png"
]

for file_path in files_to_upload:
    operation = client.file_search_stores.upload_to_file_search_store(
        file=file_path,
        file_search_store_name=file_search_store.name,
        config={
            "display_name": Path(file_path).name,
        }
    )

    # Wait for each import to finish before starting the next one.
    while not operation.done:
        time.sleep(5)
        operation = client.operations.get(operation)

    print(f"Uploaded and indexed: {file_path}")
After the upload step, all files are chunked, embedded, indexed, and ready for retrieval. If the File Search Store contains both documents and images, Gemini can retrieve relevant context from both sources while answering user questions.
Ask Questions About the File
Once your files are indexed, Gemini can answer questions using the uploaded documents and images as context. It searches the File Search Store, retrieves the most relevant chunks, and uses them to generate a grounded response.
For a text-only use case, you can ask a question about the uploaded PDF:
response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="Summarize what is in the research paper.",
    config=types.GenerateContentConfig(
        tools=[
            types.Tool(
                file_search=types.FileSearch(
                    file_search_store_names=[file_search_store.name]
                )
            )
        ]
    )
)
print("Model Response:\n")
print(response.text)
Here, File Search is used as a tool inside generate_content(). The model first searches your stored embeddings, pulls the most relevant sections, and then generates an answer based on that context.
For a multimodal use case, you can ask a question that draws on both the document and the images:
response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="""
    Based on the uploaded research paper and the images,
    summarize the key idea from the paper and explain what the images show.
    """,
    config=types.GenerateContentConfig(
        tools=[
            types.Tool(
                file_search=types.FileSearch(
                    file_search_store_names=[file_search_store.name]
                )
            )
        ]
    )
)
print("Multimodal Response:\n")
print(response.text)
As in the text-only example, File Search runs as a tool inside generate_content(). This time the model can retrieve text or image context, whichever is most relevant, before generating an answer from the retrieved information.
Customize Chunking
By default, File Search decides how to split files into chunks, but you can control this behavior for better search precision.
operation = client.file_search_stores.upload_to_file_search_store(
    file_search_store_name=file_search_store.name,
    file="path/to/your/file.txt",
    config={
        "chunking_config": {
            "white_space_config": {
                "max_tokens_per_chunk": 200,
                "max_overlap_tokens": 20
            }
        }
    }
)
This configuration caps each chunk at 200 tokens and allows up to 20 overlapping tokens between neighboring chunks for smoother context continuity. Shorter chunks give finer-grained search results, while larger ones retain more overall meaning, which is useful for research papers and code files.
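To see what a chunk size and an overlap mean in practice, the sketch below applies the same two numbers to whitespace-separated words. It is only an illustration: chunk_tokens is a hypothetical local function, and treating each word as one token is an assumption, so the real chunk boundaries produced by File Search will differ.

```python
def chunk_tokens(text, max_tokens_per_chunk=200, max_overlap_tokens=20):
    """Split whitespace-separated tokens into overlapping windows."""
    if not 0 <= max_overlap_tokens < max_tokens_per_chunk:
        raise ValueError("overlap must be non-negative and smaller than the chunk size")
    tokens = text.split()
    step = max_tokens_per_chunk - max_overlap_tokens
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_tokens_per_chunk])
        if start + max_tokens_per_chunk >= len(tokens):
            break  # the last window already reached the end of the text
    return chunks


words = " ".join(f"w{i}" for i in range(500))
chunks = chunk_tokens(words)
print([len(c) for c in chunks])  # [200, 200, 140]
print(chunks[0][-20:] == chunks[1][:20])  # True: neighbors share 20 tokens
```

Each window starts 180 tokens after the previous one, so the last 20 tokens of one chunk reappear at the start of the next.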
Show Citations for Retrieved Context
You can also print citation information to check which files or chunks Gemini used while generating the response. The official docs say citation information is available through grounding_metadata, and image references may include media citation details.
grounding_metadata = response.candidates[0].grounding_metadata
print("\nRetrieved Context:\n")
if grounding_metadata and grounding_metadata.grounding_chunks:
    for chunk in grounding_metadata.grounding_chunks:
        context = chunk.retrieved_context
        if context:
            print("Source:", getattr(context, "title", "Unknown"))
            print("Text:", getattr(context, "text", "No text available"))
            # Page numbers and media IDs are only present for some chunks.
            if getattr(context, "page_number", None):
                print("Page Number:", context.page_number)
            if getattr(context, "media_id", None):
                print("Media ID:", context.media_id)
            print("-" * 50)
else:
    print("No grounding metadata found.")
Printing the citations lets readers see not only the answer but also the source context Gemini used to produce it.
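When a response cites many chunks, a per-source summary can be easier to read than a chunk-by-chunk dump. The sketch below groups page numbers by source title. It assumes the same attribute names the snippet above reads defensively (retrieved_context, title, page_number), and it uses SimpleNamespace objects as stand-ins for real response data; pages_by_source is a hypothetical helper.

```python
from collections import defaultdict
from types import SimpleNamespace


def pages_by_source(grounding_chunks):
    """Map each cited title to the sorted page numbers it was cited on."""
    pages = defaultdict(set)
    for chunk in grounding_chunks or []:
        context = getattr(chunk, "retrieved_context", None)
        if context is None:
            continue
        title = getattr(context, "title", None) or "Unknown"
        entry = pages[title]  # registers the source even when no page is reported
        page = getattr(context, "page_number", None)
        if page is not None:
            entry.add(page)
    return {title: sorted(nums) for title, nums in pages.items()}


# Stand-in objects shaped like the fields read above (not real API output).
sample_chunks = [
    SimpleNamespace(retrieved_context=SimpleNamespace(title="Paper2Agent.pdf", page_number=3)),
    SimpleNamespace(retrieved_context=SimpleNamespace(title="Paper2Agent.pdf", page_number=2)),
    SimpleNamespace(retrieved_context=SimpleNamespace(title="product_image.jpg")),
    SimpleNamespace(retrieved_context=None),
]
print(pages_by_source(sample_chunks))
```

With a real response, the list passed in would be grounding_metadata.grounding_chunks.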
Manage Your File Search Stores
You can easily list, view, and delete File Search Stores using the API.
print("\nAvailable File Search Stores:")
for s in client.file_search_stores.list():
    print(" -", s.name)

# Get detailed info
details = client.file_search_stores.get(name=file_search_store.name)
print("\nStore Details:\n", details)

# Delete the store (optional cleanup)
client.file_search_stores.delete(name=file_search_store.name, config={"force": True})
print("File Search Store deleted.")
These management options help keep your environment organized. Indexed data stays stored until you manually delete it, while files uploaded through the temporary Files API are automatically removed after 48 hours.
File Search Support and Limits
File Search is available with the following Gemini models: Gemini 3.1 Pro Preview, Gemini 3.1 Flash-Lite Preview, Gemini 3 Flash Preview, Gemini 2.5 Pro, and Gemini 2.5 Flash-Lite.
Gemini 3 models allow you to combine File Search with custom tools via function calling. However, File Search is not yet supported in the Live API and cannot be used with certain built-in tools like Grounding with Google Search or URL Context.
File Search supports a wide range of file formats, including PDFs, Word documents, spreadsheets, presentations, JSON, CSV, HTML, XML, Markdown, YAML, code files, ZIP files, and Jupyter notebooks. For multimodal RAG, it also supports PNG and JPEG images when the store is created with models/gemini-embedding-2.
File Size and Storage Limits
User Tier | File Size Limit | Store Capacity Limit
Free | 100 MB per file | 1 GB
Tier 1 | 100 MB per file | 10 GB
Tier 2 | 100 MB per file | 100 GB
Tier 3 | 100 MB per file | 1 TB
Recommended: Keep each store under 20 GB for better retrieval performance and lower latency.
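These limits can be checked locally before starting a batch of uploads. The sketch below is a rough pre-flight check under two assumptions that the limits above do not spell out: sizes are counted in binary units (1 GB = 1024 MB), and raw file size is a fair proxy for how the service counts stored data. The check_upload helper is hypothetical.

```python
MB = 1024 ** 2
GB = 1024 ** 3
MAX_FILE_BYTES = 100 * MB
STORE_CAPACITY_BYTES = {
    "Free": 1 * GB,
    "Tier 1": 10 * GB,
    "Tier 2": 100 * GB,
    "Tier 3": 1024 * GB,  # 1 TB
}


def check_upload(tier, file_sizes, used_bytes=0):
    """Return a list of limit violations for a planned batch of uploads."""
    problems = [
        f"file {i} is larger than 100 MB"
        for i, size in enumerate(file_sizes)
        if size > MAX_FILE_BYTES
    ]
    if used_bytes + sum(file_sizes) > STORE_CAPACITY_BYTES[tier]:
        problems.append(f"batch would exceed the {tier} store capacity")
    return problems


print(check_upload("Free", [50 * MB, 150 * MB]))
print(check_upload("Free", [100 * MB] * 11))
```

An empty list means the batch fits within both the per-file and the per-store limit for that tier.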
Regarding pricing, embeddings are charged at indexing time. Storage and query-time embeddings are free, and retrieved document tokens are billed as regular context tokens.
Conclusion
File Search takes the infrastructure work out of building RAG systems. No external vector databases, no custom embedding pipelines. Just upload your files and start querying. With the new multimodal support, you can now search across documents and images together. Metadata filtering helps you scope results to exactly what is relevant, and page-level citations make every answer traceable back to its source. Whether you're prototyping or building for production, File Search gives you a solid, managed foundation to build on. Get started at Google AI Studio or through the Gemini API docs linked in the article.
Hi, I'm Janvi, a passionate data science enthusiast currently working at Analytics Vidhya. My journey into the world of data began with a deep curiosity about how we can extract meaningful insights from complex datasets.