Agentic generative AI assistants represent a major advance in artificial intelligence: dynamic systems powered by large language models (LLMs) that engage in open-ended dialogue and tackle complex tasks. Unlike basic chatbots, these implementations possess broad intelligence, sustaining multi-step conversations while adapting to user needs and executing significant backend tasks.
These systems retrieve business-specific data in real time through API calls and database lookups, incorporating this information into LLM-generated responses or providing it alongside them using predefined standards. This combination of LLM capabilities with dynamic data retrieval is known as Retrieval-Augmented Generation (RAG).
For example, an agentic assistant handling hotel booking would first query a database to find properties that match the guest's specific requirements. The assistant would then make API calls to retrieve real-time information about room availability and current rates. This retrieved data can be handled in two ways: either the LLM can process it to generate a comprehensive response, or it can be displayed alongside an LLM-generated summary. Both approaches let guests receive precise, current information that is integrated into their ongoing conversation with the assistant.
In this post, we show how to implement a generative AI agentic assistant that uses both semantic and text-based search, using Amazon Bedrock, Amazon Bedrock AgentCore, Strands Agents, and Amazon OpenSearch.
Information retrieval approaches in RAG systems
Generally speaking, information retrieval supporting RAG capabilities in agentic generative AI implementations revolves around real-time querying of backend data sources or communicating with an API. The responses are then factored into the next steps performed by the implementation. From a high-level system design and implementation perspective, this step is not specific to generative AI-based solutions: databases, APIs, and systems relying on integration with them have been around for a long time.
There are, however, specific information retrieval approaches that have emerged alongside agentic AI implementations, most notably semantic search-based data lookups. They retrieve data based on the meaning of the search phrase rather than keyword or pattern lexical similarity. Vector embeddings are precomputed and stored in vector databases, enabling efficient similarity calculations at query time. The core principle of vector similarity search (VSS) involves finding the closest matches between these numerical representations using mathematical distance metrics such as cosine similarity or Euclidean distance. These mathematical functions are particularly efficient when searching through large corpora of data because the vector representations are precomputed. Bi-encoder models are commonly used in this process. They independently encode the query and the documents into vectors, enabling efficient similarity comparisons at scale without requiring the model to process query-document pairs together. When a user submits a query, the system converts it into a vector and searches for content vectors positioned closest to it in the high-dimensional space. This means that even when exact keywords don't match, the search can find relevant results based on conceptual semantic similarity.
Moreover, in situations where search terms are lexically, but not semantically, close to entries in the dataset, semantic similarity search will still "pick" the semantically similar entries.
For example, given the vectorized dataset ["building materials", "plumbing supplies", "2×2 multiplication result"], the search string "2×4 lumber board" will most likely produce "building materials" as the top matching candidate. Combining semantic search with LLM-driven agents supports natural language alignment across the user-facing and backend data retrieval components of the solution. LLMs process natural language input provided by the user, while semantic search capabilities allow for data retrieval based on the natural language input formulated by LLMs, depending on the end user to agent communication cadence.
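To make the distance calculation concrete, the following pure-Python sketch ranks precomputed document vectors against a query vector using cosine similarity. The three-dimensional "embeddings" are hand-made toy values for illustration only; real embeddings come from a bi-encoder model and have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings": in a real system these are produced by an embedding
# model, precomputed, and stored in the vector database.
documents = {
    "building materials":        [0.9, 0.1, 0.0],
    "plumbing supplies":         [0.4, 0.8, 0.0],
    "2x2 multiplication result": [0.0, 0.1, 0.9],
}
query = [0.8, 0.2, 0.1]  # hypothetical embedding of "2x4 lumber board"

# At query time, rank the precomputed document vectors by similarity.
ranked = sorted(documents,
                key=lambda d: cosine_similarity(query, documents[d]),
                reverse=True)
print(ranked[0])  # building materials
```

Despite sharing digits with "2×2 multiplication result", the lumber query lands closest to "building materials" in the toy vector space, mirroring the behavior described above.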
The challenge: When semantic search alone isn't enough
Consider a real-world scenario: a customer is searching for a hotel and wants to find "a luxury hotel with ocean views in Miami, Florida." While semantic search excels at understanding concepts like "luxury" and "ocean views," it can struggle with precise location matching. The search might return highly relevant luxury oceanfront properties based on semantic similarity, but these could be in California, the Caribbean, or anywhere else with ocean access, not specifically in Miami as requested. This limitation arises because semantic search prioritizes conceptual similarity over exact attribute matching. In cases where users need both semantic understanding (luxury, ocean views) and precise filtering (Miami, Florida), relying solely on semantic search produces suboptimal results. This is where hybrid search becomes essential. It combines the semantic understanding of natural language descriptions with the precision of text-based filtering on structured attributes like location, dates, or other metadata. To address this, we introduce a hybrid search approach that performs both:
- Semantic search to understand natural language descriptions and find semantically similar content
- Text-based search to facilitate precise matching on structured attributes like locations, dates, or identifiers
When a user provides a search phrase, an LLM first analyzes the query to identify specific attributes (such as location) and maps them to searchable values (for example, "Northern Michigan" → "MI"). These extracted attributes are then used as filters in conjunction with semantic similarity scoring, ensuring that results are both conceptually relevant and precisely matched to the user's requirements. The following tables show a simplified view of the semantic search flow, with clear-text hotel descriptions provided for context:
Vector store data:
hotel-1
Description: The Artisan Loft hotel anchors the corner of Green and Randolph Streets in Big City's bustling Southwest Loop, occupying a thoughtfully renovated 1920s brick warehouse that celebrates the neighborhood's industrial heritage. Guests find themselves mere steps from the famed Restaurant Row, with acclaimed dining spots and stylish boutiques dotting the surrounding blocks.
Description Vector: […]
Location: Big City, USA
hotel-2
Description: Perched on a rugged cliff overlooking the dramatic coastline of Big Sur, The Cypress Haven emerges from the landscape as if it were carved from the earth itself. This intimate 42-room sanctuary seamlessly integrates into its surroundings with living roof gardens, floor-to-ceiling windows, and natural materials including local stone and reclaimed redwood. Each spacious suite features a private terrace suspended over the Pacific, where guests can spot migrating whales while soaking in Japanese cedar ofuro tubs.
Description Vector: […]
Location: Beach City, USA
hotel-3
Description: Nestled in a centuries-old maple forest just outside the Berkshires, Woodland Haven Lodge offers an intimate escape where luxury meets mindful simplicity. This converted 19th-century property features 28 thoughtfully appointed rooms spread across the main house and four separate cottages, each with wraparound porches and floor-to-ceiling windows that frame the surrounding woodlands.
Description Vector: […]
Location: Quiet City, USA
hotel-4
Description: Nestled in the heart of Central City's bustling downtown district, the Skyline Oasis hotel stands as a beacon of luxury and modernity. This 45-story glass and steel tower offers breathtaking panoramic views of the city's iconic skyline and the nearby Central River. With 500 elegantly appointed rooms and suites, the Skyline Oasis caters to both business travelers and vacationers seeking a premium urban experience. The hotel boasts a rooftop infinity pool, a Michelin-starred restaurant, and a state-of-the-art fitness center. Its prime location puts guests within walking distance of Central City's major attractions, including the Museum of Modern Art, the Central City Opera House, and the vibrant Riverfront District.
Description Vector: […]
Location: Central City, USA
Search phrase
Looking for a hotel by the ocean
Search results
hotel-2
Search example:
- Search phrase: "Looking for a hotel by the ocean"
- Semantic search result: hotel-2 (The Cypress Haven)
Hybrid search example:
- Search phrase: "Looking for a hotel with a nice restaurant in downtown Central City"
- Hybrid search result: hotel-4 (best match considering both semantic relevance and precise location)
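The hybrid example above can be expressed as an OpenSearch-style query body that pairs a k-NN clause with an exact term filter. The following sketch builds such a body in Python; the field names (`description_vector`, `city`) and the toy query vector are illustrative assumptions, not an exact schema.

```python
def build_hybrid_query(query_vector, city=None, k=10):
    """Combine k-NN semantic matching with exact term filters in one bool query."""
    query = {
        "bool": {
            "must": [
                # Semantic part: nearest neighbors in embedding space
                {"knn": {"description_vector": {"vector": query_vector, "k": k}}}
            ],
            "filter": [],  # Exact-match constraints are added below
        }
    }
    if city:
        # Text part: precise location matching on a keyword field
        query["bool"]["filter"].append({"term": {"city": city}})
    return query

# "Hotel with a nice restaurant in downtown Central City":
# semantic vector for the description, exact filter for the city.
body = {"query": build_hybrid_query([0.1, 0.2, 0.3], city="Central City")}
```

The filter clause restricts candidates to Central City before relevance ranking, so a semantically strong match in another city (like hotel-2's oceanfront description) cannot win.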
For more details on hybrid search implementations, refer to the Amazon Bedrock Knowledge Bases hybrid search blog post.
Introducing an agent-based solution
Consider a hotel search scenario where users have varying needs. One user might ask "find me a cozy hotel," requiring semantic understanding of "cozy." Another might request "find hotels in Miami," needing precise location filtering. A third might want "a luxury beachfront hotel in Miami," requiring both approaches simultaneously. Traditional RAG implementations with fixed workflows cannot adapt dynamically to these varying requirements. Our scenario demands custom search logic that can combine multiple data sources and dynamically adapt retrieval strategies based on query characteristics. An agent-based approach provides this flexibility. The LLM itself determines the optimal search strategy by analyzing each query and selecting the appropriate tools.
Why agents?
Agent-based systems offer superior adaptability because the LLM determines the sequence of actions needed to solve problems, enabling dynamic decision routing, intelligent tool selection, and quality control through self-evaluation. The following sections show how to implement a generative AI agentic assistant that uses both semantic and text-based search, using Amazon Bedrock, Amazon Bedrock AgentCore, Strands Agents, and Amazon OpenSearch.
Architecture overview
Figure 1 shows a modern, serverless architecture that you can use for an intelligent search assistant. It combines the foundation models in Amazon Bedrock, Amazon Bedrock AgentCore (for agent orchestration), and Amazon OpenSearch Serverless (for hybrid search capabilities).
User interaction layer
Client applications interact with the system through Amazon API Gateway, which provides a secure, scalable entry point for user requests. When a user asks a question like "Find me a beachfront hotel in Northern Michigan," the request flows through API Gateway to Amazon Bedrock AgentCore.
Agent orchestration with Amazon Bedrock AgentCore
Amazon Bedrock AgentCore serves as the orchestration engine, managing the entire agent lifecycle and coordinating interactions between the user, the LLM, and available tools. AgentCore implements the agentic loop, a continuous cycle of reasoning, action, and observation, in which the agent:
- Analyzes the user's query using Bedrock's foundation models
- Decides which tools to invoke based on the query requirements
- Executes the appropriate hybrid search tool with extracted parameters
- Evaluates the results and determines whether additional actions are needed
- Responds to the user with synthesized information
Throughout this process, Amazon Bedrock Guardrails enforce content safety and policy adherence, maintaining appropriate responses.
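The five steps of the agentic loop can be sketched as a framework-agnostic control flow. Everything in this sketch (the deciding function, the tool registry, the stopping rule) is a simplified stand-in for what AgentCore and the foundation model actually provide, shown only to make the reason-act-observe cycle concrete.

```python
def run_agentic_loop(user_query, llm_decide, tools, llm_respond, max_steps=5):
    """Minimal reason-act-observe loop: the model picks tools until it can answer."""
    observations = []
    for _ in range(max_steps):
        # Steps 1-2: analyze the query and decide which tool (if any) to invoke
        decision = llm_decide(user_query, observations)
        if decision["action"] == "respond":
            break
        # Step 3: execute the selected tool with extracted parameters
        result = tools[decision["tool"]](**decision["args"])
        # Step 4: record the observation so the model can evaluate it next turn
        observations.append(result)
    # Step 5: synthesize the final answer from accumulated observations
    return llm_respond(user_query, observations)

# Toy stand-ins for the model and a search tool:
def fake_decide(query, obs):
    if not obs:
        return {"action": "tool", "tool": "hybrid_search", "args": {"city": "Miami"}}
    return {"action": "respond"}

tools = {"hybrid_search": lambda city: f"3 hotels found in {city}"}
answer = run_agentic_loop("beachfront hotel in Miami", fake_decide, tools,
                          llm_respond=lambda q, obs: obs[-1])
print(answer)  # 3 hotels found in Miami
```

The `max_steps` bound is a common safeguard so that a model that never decides to respond cannot loop indefinitely.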
Hybrid search with OpenSearch Serverless
The architecture integrates Amazon OpenSearch Serverless as the vector store and search engine. OpenSearch stores both vectorized embeddings (for semantic understanding) and structured text fields (for precise filtering), supporting our hybrid search approach. When the agent invokes the hybrid search tool, OpenSearch executes queries that combine:
- Semantic matching using vector similarity for conceptual understanding
- Text-based filtering for precise constraints like location or amenities
Monitoring and security
The architecture includes Amazon CloudWatch for monitoring system performance and usage patterns. AWS IAM manages access control and security policies across components.
Why this architecture?
This serverless design provides several key advantages:
- Low-latency responses for real-time conversational interactions
- Auto-scaling to handle varying workloads without manual intervention
- Cost-effectiveness through pay-as-you-go pricing with no idle infrastructure
- Production readiness with built-in monitoring, logging, and security features
The combination of AgentCore's orchestration capabilities with the hybrid search functionality of OpenSearch allows our assistant to dynamically adapt its search strategy based on user intent, something that rigid RAG pipelines cannot achieve.
Determine 1
Figure note: The code samples and architecture artifacts provided in this document are intended for demonstration and reference purposes only and are not production-ready.
Implementation with Strands and Amazon Bedrock AgentCore
To build our hybrid search agent, we use Strands, an open-source AI agent framework that simplifies creating LLM-powered applications with tool-calling capabilities. Strands allows us to define our hybrid search function as a "tool" that the agent can intelligently invoke based on user queries. For comprehensive details on Strands architecture and patterns, see the Strands documentation.
Here's how we define our hybrid search tool:
from strands import tool

@tool
def hybrid_search(query_text: str, country: str = None, city: str = None):
    """
    Performs hybrid search combining semantic understanding with location filtering.
    The agent calls this when users provide both descriptive preferences and location.

    Args:
        query_text: Natural language description of what to search for
        country: Optional country filter
        city: Optional city filter
    """
    # Generate embeddings for semantic search
    vector = generate_embeddings(query_text)

    # Build hybrid query combining vector similarity and text filters
    query = {
        "bool": {
            "must": [
                {"knn": {"embedding_field": {"vector": vector, "k": 10}}}
            ],
            "filter": []
        }
    }

    # Add location filters if provided
    if country:
        query["bool"]["filter"].append({"term": {"country": country}})
    if city:
        query["bool"]["filter"].append({"term": {"city": city}})

    # Execute search in OpenSearch
    response = opensearch_client.search(index="hotels", body={"query": query})
    return format_results(response)
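Conceptually, a tool decorator of this kind records the function, its signature, and its docstring so the agent runtime can advertise callable tools to the LLM and dispatch the model's choices back to Python. The following framework-agnostic sketch illustrates that mechanism; it is not the actual Strands implementation, and the registry and tool here are hypothetical.

```python
import inspect

TOOL_REGISTRY = {}

def tool(fn):
    """Register a function plus its signature and docstring as an agent-callable tool."""
    TOOL_REGISTRY[fn.__name__] = {
        "fn": fn,
        "signature": str(inspect.signature(fn)),
        "description": (fn.__doc__ or "").strip(),
    }
    return fn

@tool
def hybrid_search(query_text: str, city: str = None):
    """Search hotels by natural language description, optionally filtered by city."""
    return f"searching '{query_text}' in {city or 'anywhere'}"

# The runtime exposes the registry to the LLM and dispatches its tool choices:
spec = TOOL_REGISTRY["hybrid_search"]
print(spec["signature"])                  # (query_text: str, city: str = None)
print(spec["fn"]("cozy hotel", "Miami"))  # searching 'cozy hotel' in Miami
```

The signature and docstring are exactly what the model needs to decide when to call the tool and with which arguments, which is why descriptive docstrings matter in real tool definitions.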
Once we've defined our tools, we integrate them with Amazon Bedrock AgentCore for deployment and runtime orchestration. Amazon Bedrock AgentCore lets you deploy and operate highly effective agents securely at scale using any framework and model. It provides purpose-built infrastructure to securely scale agents and controls to operate trustworthy agents.
For detailed information about integrating Strands with Amazon Bedrock AgentCore, see the AgentCore-Strands integration tutorial.
Hybrid search implementation deep dive
A key differentiator of our AI assistant solution is its advanced hybrid search capability. While many RAG implementations rely solely on semantic search, our architecture extends beyond this. We use the full potential of OpenSearch, enabling semantic, text-based, and hybrid searches, all within a single, efficient query. The following sections explore the technical details of this implementation.
The two-pronged implementation
Our hybrid search implementation is built on two fundamental components: optimized data storage and flexible query handling.
1. Optimized data storage
The approach to data storage is important for efficient hybrid search.
- Data categorization: We systematically categorize our data into two main types:
- Semantic search candidates: This includes detailed descriptions, contexts, and explanations, content that benefits from understanding meaning beyond keywords.
- Text search candidates: This encompasses metadata, product identifiers, dates, and other structured fields.
- Vector embedding: For our semantic data, we use Amazon Bedrock's embedding models. These transform text into high-dimensional vectors that capture semantic meaning effectively.
- Text data optimization: Text data is stored in its original format, optimized for fast traditional queries.
- Unified index structure: Our OpenSearch index is designed to accommodate both vector embeddings and text fields simultaneously, enabling flexible querying capabilities.
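A unified index of this kind can be sketched as an OpenSearch index mapping that holds a vector field next to keyword and text fields. The field names, dimension, and settings below are illustrative assumptions, not the exact index definition; the dimension must match the output size of whichever embedding model is used.

```python
# Illustrative OpenSearch index mapping combining a knn_vector field
# (for semantic search) with text/keyword fields (for exact filtering).
index_body = {
    "settings": {"index": {"knn": True}},  # enable k-NN search on this index
    "mappings": {
        "properties": {
            "description":        {"type": "text"},       # full-text search
            "description_vector": {"type": "knn_vector",  # semantic search
                                   "dimension": 1024},    # must match the embedding model
            "city":               {"type": "keyword"},    # exact term filters
            "country":            {"type": "keyword"},
        }
    },
}
# With an opensearch-py client, the index could then be created with:
# opensearch_client.indices.create(index="hotels", body=index_body)
```

Keeping the clear-text description alongside its vector lets a single document serve semantic ranking, keyword matching, and result display without cross-index joins.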
2. Flexible search functionality
Building on our optimized data storage, we've developed a comprehensive search function that our AI agent can utilize effectively:
- Adaptive search types: Our search function is designed to perform semantic, text, or hybrid searches as required by the agent.
- Semantic search implementation: For meaning-focused queries, we generate query embeddings using Amazon Bedrock and perform a k-NN (k-nearest neighbors) search in the vector space.
- Text search capabilities: When precise matching is necessary, we use OpenSearch's robust text query functionality, including exact and fuzzy matching options.
- Hybrid search execution: This is where we combine vector similarity with text matching in a unified query. Using OpenSearch's bool query, we can adjust the balance between semantic and text relevance as needed.
- Result integration: Regardless of the search type, our system consolidates and ranks results based on overall relevance, combining semantic understanding with precise text matching.
Reference pseudo code for the hybrid search implementation:
def hybrid_search(query_text, country, city, search_type="hybrid"):
    """
    Hybrid search combining semantic and text-based search with location filtering
    """
    # 1. Generate embeddings for semantic search
    if search_type in ["semantic", "hybrid"]:
        vector = generate_embeddings(query_text)

    # 2. Build search query based on type
    if search_type == "semantic":
        query = build_semantic_query(vector)
    elif search_type == "text":
        query = build_text_query(country, city)
    else:  # hybrid search
        query = build_hybrid_query(vector, country, city)

    # 3. Execute search
    response = search_opensearch(query)

    # 4. Process and return results
    return format_results(response)

# Example usage:
results = hybrid_search(
    query_text="luxury hotel",
    country="USA",
    city="Miami"
)
OpenSearch supports multiple query types, including text-based search, vector search (k-NN), and hybrid approaches that combine both methods. For detailed information about available query types and their implementations, refer to the OpenSearch query documentation.
Importance of the hybrid approach
The hybrid approach significantly enhances our AI assistant's capabilities:
- It supports highly accurate information retrieval, considering both context and content.
- It adapts to various query types, maintaining consistent performance.
- It provides more relevant and comprehensive responses to user inquiries.
In the field of AI-powered search, our hybrid approach represents a significant advancement. It offers a level of flexibility and accuracy that markedly improves our assistant's ability to retrieve and process information effectively.
Real-life use cases
Some of the use cases where hybrid search can be applicable include:
- Real estate and property: Property search combining lifestyle preference understanding ("family-friendly") with exact location and amenity filtering.
- Legal and professional services: Case law research combining conceptual legal similarity with precise jurisdiction and date filtering for comprehensive legal research.
- Healthcare and medical: Care teams ask for "patients with chronic conditions requiring similar treatment protocols as John Doe," combining semantic understanding of treatment complexity with exact medical record matching.
- Media and entertainment: Content discovery systems combining exact genre filtering with semantic plot understanding.
- E-commerce and retail: Natural language product discovery with filter precision; "comfortable winter shoes" finds semantic matches while applying exact size, price, or brand filters.
These use cases demonstrate how hybrid search bridges the gap between natural language understanding and precise data filtering, enabling more intuitive and accurate information retrieval.
Conclusion
The integration of Amazon Bedrock, Amazon Bedrock AgentCore, Strands Agents, and Amazon OpenSearch Serverless represents a significant advancement in building intelligent search applications that combine the power of LLMs with sophisticated information retrieval strategies. This architecture blends semantic, text-based, and hybrid search capabilities to deliver more accurate and contextually relevant results than traditional approaches. By implementing an agent-based system using Amazon Bedrock AgentCore, its state management, and Strands tool abstractions, developers can create dynamic, conversational AI assistants that intelligently determine the most appropriate search strategies based on user queries. The hybrid search approach, which combines vector similarity with precise text matching, offers flexibility and accuracy in information retrieval, enabling AI systems to better understand user intent and deliver more comprehensive responses. As organizations continue to build AI solutions, this architecture provides a scalable, secure foundation that uses the full potential of AWS services while maintaining the adaptability needed for complex, real-world applications.
About the authors
Arpit Gupta
Arpit Gupta is a Data Architect at AWS Professional Services with a focus on data analytics. He specializes in building data lakes, analytics solutions, and generative AI applications in the cloud, helping organizations transform their data into actionable business insights. His passions extend from the digital to the physical realm, from tennis courts to the kitchen and exploring new places with family.
Ashish Bhagam
Ashish Bhagam is a Data Architect with the AWS Professional Services Analytics Practice. He helps customers design and implement scalable data solutions and modernize their data architectures. Outside of work, he enjoys watching cricket matches and spending quality time with his family.
Ross Gabay
Ross Gabay was a Principal Data Architect at AWS Professional Services with a focus on graph databases and GenAI data analytics. He specializes in building graph-database-centric and GenAI solutions.

