Imagine asking your AI model, "What's the weather in Tokyo right now?" and instead of hallucinating an answer, it calls your actual Python function, fetches live data, and responds correctly. That is the power of the tool-calling capabilities in Google's Gemma 4. It is an exciting addition to open-weight AI: function calling that is structured, reliable, and built directly into the model.
Coupled with Ollama for local inference, it lets you develop AI agents that do not depend on the cloud. The best part: these agents have access to real-world APIs and services locally, without any subscription. In this guide, we will cover the concept and the implementation architecture, as well as three tasks that you can experiment with immediately.
Also read: Running Claude Code for Free with Gemma 4 and Ollama
Conversational language models have knowledge limited to their training cutoff. Ask them for current market prices or current weather conditions and they can offer only an approximation. Function calling addresses this gap: you wrap external APIs as functions, and the model answers these kinds of questions through tool calls.
By enabling tool calling, the model can:
- Recognize when outside information needs to be retrieved
- Identify the correct function based on the provided API definitions
- Compose correctly formatted function calls (with arguments)
It then waits for the execution of that code to return output, and composes a grounded answer based on the result it receives.
To clarify: the model never executes the function calls it produces. It only determines which functions to call and how to structure their arguments. Your own code executes the functions the model requested. In this scenario, the model is the brain, while the functions being called are the hands.
Before you begin writing code, it helps to understand how everything fits together. Here is the loop that every tool call in Gemma 4 follows:
- Define Python functions that perform the actual tasks (e.g., retrieve weather data from an external source, query a database, convert money from one currency to another).
- Create a JSON schema for each of the functions you have created. The schema should contain the name of the function and its parameters (along with their types).
- When the user sends a message, you send both the tool schemas and the user's message to the Ollama API.
- The Ollama API returns data in a tool_calls block rather than plain text.
- You execute the function with the arguments the Ollama API returned.
- You send the result back to the Ollama API as a "role": "tool" message.
- The Ollama API receives the result and returns the answer to you in natural language.
This two-pass pattern is the foundation of every function-calling AI agent, including the examples shown below.
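To make the second step of the loop concrete, it helps to see the shape of a tool-call reply. The dict below is a hand-written illustration (not captured output); the field names follow Ollama's chat API, but treat the exact values as assumptions. Extracting the function name and arguments is plain dictionary access:

```python
# Illustrative shape of an Ollama /api/chat reply that requests a tool call.
# (Hand-written example; field names follow the Ollama chat API.)
response = {
    "model": "gemma4:e2b",
    "message": {
        "role": "assistant",
        "content": "",
        "tool_calls": [
            {
                "function": {
                    "name": "get_current_weather",
                    "arguments": {"city": "Tokyo", "unit": "celsius"},
                }
            }
        ],
    },
}

# Your code (not the model) pulls out the requested call and dispatches it:
call = response["message"]["tool_calls"][0]["function"]
print(call["name"], call["arguments"])
```

Note that `message.content` is empty on this first pass: the model's entire answer is the structured request, which is what makes the pattern machine-dispatchable.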
To run these tasks, you need two components: Ollama installed locally on your machine, and the Gemma 4 Edge 2B model downloaded. There are no dependencies beyond the Python standard library, so you don't need to worry about installing any pip packages at all.
1. Install Ollama (macOS/Linux):
# Install Ollama (macOS/Linux)
curl -fsSL https://ollama.com/install.sh | sh
2. Download the model (roughly 2.5 GB):
# Download the Gemma 4 Edge model (E2B)
ollama pull gemma4:e2b
After downloading the model, run ollama list to confirm it appears among your installed models. You can now connect to the running API at http://localhost:11434 and send requests to it using the helper function we will create:
import json, urllib.request, urllib.parse

def call_ollama(payload: dict) -> dict:
    data = json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(
        "http://localhost:11434/api/chat",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
No third-party libraries are needed, so the agent runs independently and is fully transparent.
Also read: How to Run Gemma 4 on Your Phone: A Hands-On Guide
Hands-on Task 01: Live Weather Lookup
Our first tool uses Open-Meteo, a free weather API that requires no key, to pull live data for any location from its latitude/longitude coordinates. To use this API, you'll need to perform a series of steps:
1. Write your function in Python
def get_current_weather(city: str, unit: str = "celsius") -> str:
    # Resolve the city name to coordinates via Open-Meteo's geocoding API
    geo_url = f"https://geocoding-api.open-meteo.com/v1/search?name={urllib.parse.quote(city)}&count=1"
    with urllib.request.urlopen(geo_url) as r:
        geo = json.loads(r.read())
    loc = geo["results"][0]
    lat, lon = loc["latitude"], loc["longitude"]
    url = (f"https://api.open-meteo.com/v1/forecast"
           f"?latitude={lat}&longitude={lon}"
           f"&current=temperature_2m,wind_speed_10m"
           f"&temperature_unit={unit}")
    with urllib.request.urlopen(url) as r:
        data = json.loads(r.read())
    c = data["current"]
    return f"{city}: {c['temperature_2m']}°, wind {c['wind_speed_10m']} km/h"
2. Define your JSON schema
This gives the model the information it needs, so Gemma 4 knows exactly what the function does and what arguments it expects when called.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get live temperature and wind speed for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name, e.g. Mumbai"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["city"]
        }
    }
}
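Since the arguments JSON the model produces is the only contract between it and your code, it can be worth validating arguments against the schema before dispatching. Below is an optional, illustrative helper (not part of the article's pipeline) that checks the required list and flags undeclared keys; the demo schema is a minimal stand-in so the snippet runs on its own:

```python
# Minimal stand-in for a tool schema so this demo is self-contained.
demo_tool = {
    "function": {
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "unit": {"type": "string"},
            },
            "required": ["city"],
        }
    }
}

def validate_args(tool: dict, args: dict) -> list:
    """Return a list of problems with a model-produced arguments dict."""
    params = tool["function"]["parameters"]
    problems = []
    # Every key in 'required' must be present.
    for key in params.get("required", []):
        if key not in args:
            problems.append(f"missing required argument: {key}")
    # Keys the schema does not declare are suspicious.
    for key in args:
        if key not in params["properties"]:
            problems.append(f"unexpected argument: {key}")
    return problems

problems = validate_args(demo_tool, {"unit": "celsius", "zip": "100-0001"})
print(problems)  # ['missing required argument: city', 'unexpected argument: zip']
```

Rejecting a bad call early lets you send the model a corrective "role": "tool" message instead of crashing on a TypeError.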
3. Send a query that triggers the tool call (and handle and process the response)
messages = [{"role": "user", "content": "What's the weather in Mumbai right now?"}]
response = call_ollama({"model": "gemma4:e2b", "messages": messages, "tools": [weather_tool], "stream": False})
msg = response["message"]

if "tool_calls" in msg:
    tc = msg["tool_calls"][0]
    fn = tc["function"]["name"]
    args = tc["function"]["arguments"]
    result = get_current_weather(**args)  # executed locally
    messages.append(msg)
    messages.append({"role": "tool", "content": result, "name": fn})
    final = call_ollama({"model": "gemma4:e2b", "messages": messages, "tools": [weather_tool], "stream": False})
    print(final["message"]["content"])
Output
Hands-on Task 02: Live Currency Converter
A vanilla LLM fails here: it hallucinates currency values and cannot provide accurate, up-to-date conversions. With the help of ExchangeRate-API, the converter fetches the latest foreign-exchange rates and converts accurately between two currencies.
Once you complete steps 1-3 below, you will have a fully functioning converter in Gemma 4:
1. Write your Python function
def convert_currency(amount: float, from_curr: str, to_curr: str) -> str:
    url = f"https://open.er-api.com/v6/latest/{from_curr.upper()}"
    with urllib.request.urlopen(url) as r:
        data = json.loads(r.read())
    rate = data["rates"].get(to_curr.upper())
    if not rate:
        return f"Currency {to_curr} not found."
    converted = round(amount * rate, 2)
    return f"{amount} {from_curr.upper()} = {converted} {to_curr.upper()} (rate: {rate})"
2. Define your JSON schema
currency_tool = {
    "type": "function",
    "function": {
        "name": "convert_currency",
        "description": "Convert an amount between two currencies at live rates.",
        "parameters": {
            "type": "object",
            "properties": {
                "amount": {"type": "number", "description": "Amount to convert"},
                "from_curr": {"type": "string", "description": "Source currency, e.g. USD"},
                "to_curr": {"type": "string", "description": "Target currency, e.g. EUR"}
            },
            "required": ["amount", "from_curr", "to_curr"]
        }
    }
}
3. Test your solution with a natural-language query
response = call_ollama({
    "model": "gemma4:e2b",
    "messages": [{"role": "user", "content": "How much is 5000 INR in USD today?"}],
    "tools": [currency_tool],
    "stream": False
})
Gemma 4 will parse the natural-language query and format a proper call with amount=5000, from_curr='INR', to_curr='USD'. The resulting tool call is then handled with the same feedback loop described in Task 01.
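The `**args` unpacking step is worth seeing in isolation. Assuming the model returned an arguments dict like the one below (a hypothetical response for illustration), dispatch is just keyword expansion; a stub with a made-up rate stands in for the network-bound `convert_currency` so the snippet runs offline:

```python
# Hypothetical arguments as Gemma 4 might produce them for the query above.
args = {"amount": 5000, "from_curr": "INR", "to_curr": "USD"}

def convert_currency_stub(amount: float, from_curr: str, to_curr: str) -> str:
    # Stands in for the real convert_currency so the demo runs offline;
    # the exchange rate here is invented for illustration only.
    rate = 0.012
    return f"{amount} {from_curr.upper()} = {round(amount * rate, 2)} {to_curr.upper()}"

# Same call shape as the real function: keys must match parameter names.
result = convert_currency_stub(**args)
print(result)  # 5000 INR = 60.0 USD
```

This is also why the schema's property names must match the Python parameter names exactly: the unpacking fails with a TypeError if they drift apart.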
Output
Hands-on Task 03: Multi-Tool Agent
Gemma 4 excels at this task. You can give the model several tools at once and submit a compound query. The model coordinates all the required calls in a single pass; manual chaining is unnecessary.
1. Add the timezone tool
def get_current_time(city: str) -> str:
    # Note: this simple example hard-codes the Asia/ timezone prefix
    url = f"https://timeapi.io/api/Time/current/zone?timeZone=Asia/{city}"
    with urllib.request.urlopen(url) as r:
        data = json.loads(r.read())
    return f"Current time in {city}: {data['time']}, {data['dayOfWeek']} {data['date']}"
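Because the function above depends on timeapi.io and hard-codes the `Asia/` prefix, it only covers Asian cities and needs the network. If you prefer a fully offline variant, Python's standard-library `zoneinfo` can produce the same information; the city-to-zone lookup table below is a small assumed mapping for illustration, not a complete solution:

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# Assumed lookup table; extend it with the cities your agent needs.
CITY_ZONES = {"Tokyo": "Asia/Tokyo", "Mumbai": "Asia/Kolkata", "London": "Europe/London"}

def get_current_time_offline(city: str) -> str:
    zone = CITY_ZONES.get(city)
    if zone is None:
        return f"Timezone for {city} not known."
    now = datetime.now(ZoneInfo(zone))  # local wall-clock time in that zone
    return f"Current time in {city}: {now:%H:%M}, {now:%A} {now:%Y-%m-%d}"

print(get_current_time_offline("Tokyo"))
```

Swapping this in requires no schema change, since the function name and its single `city` parameter stay the same.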
time_tool = {
    "type": "function",
    "function": {
        "name": "get_current_time",
        "description": "Get the current local time in a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name for timezone, e.g. Tokyo"}
            },
            "required": ["city"]
        }
    }
}
2. Build the multi-tool agent loop
TOOL_FUNCTIONS = {
    "get_current_weather": get_current_weather,
    "convert_currency": convert_currency,
    "get_current_time": get_current_time,
}

def run_agent(user_query: str):
    all_tools = [weather_tool, currency_tool, time_tool]
    messages = [{"role": "user", "content": user_query}]
    response = call_ollama({"model": "gemma4:e2b", "messages": messages, "tools": all_tools, "stream": False})
    msg = response["message"]
    messages.append(msg)
    if "tool_calls" in msg:
        for tc in msg["tool_calls"]:
            fn = tc["function"]["name"]
            args = tc["function"]["arguments"]
            result = TOOL_FUNCTIONS[fn](**args)
            messages.append({"role": "tool", "content": result, "name": fn})
        final = call_ollama({"model": "gemma4:e2b", "messages": messages, "tools": all_tools, "stream": False})
        return final["message"]["content"]
    return msg.get("content", "")
3. Run a compound, multi-intent query
print(run_agent(
    "I'm flying to Tokyo tomorrow. What's the current time there, "
    "the weather, and how much is 10000 INR in JPY?"
))
Output
Here, we drove three distinct functions, backed by three separate real-time APIs, through natural language using one common pattern. Everything executes locally against the Gemma 4 instance; none of these components uses any remote or cloud resources.
What Makes Gemma 4 Different for Agentic AI?
Other open-weight models can call tools, but they do not do so reliably, and that is what differentiates Gemma 4. The model consistently provides valid JSON arguments, processes optional parameters correctly, and determines when to answer directly rather than call a tool. As you keep using it, keep the following in mind:
- Schema quality is critically important. If your description field is vague, the model will have a hard time determining arguments for your tool. Be specific with units, formats, and examples.
- Gemma 4 validates the required array and respects the required/optional distinction.
- Once a tool returns a result, that result becomes context through the "role": "tool" messages you send during your final pass. The richer the result from the tool, the richer the response will be.
- A common mistake is to return the tool result as "role": "user" instead of "role": "tool"; the model will not attribute it correctly and will attempt to re-request the call.
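These pitfalls suggest wrapping dispatch defensively: an unknown tool name or malformed arguments should come back to the model as an error string inside the "role": "tool" message rather than crash the agent. Below is a sketch using stub tools so it runs standalone; the error-text wording is my own convention, not anything Ollama requires:

```python
def safe_dispatch(tool_functions: dict, name: str, args: dict) -> str:
    """Run a model-requested tool call, returning errors as tool output."""
    fn = tool_functions.get(name)
    if fn is None:
        # Hallucinated tool name: tell the model instead of raising KeyError.
        return f"Error: unknown tool '{name}'."
    try:
        return fn(**args)
    except TypeError as exc:
        # Wrong or missing arguments from the model.
        return f"Error calling {name}: {exc}"

# Stub tool so the demo is self-contained.
tools = {"echo": lambda text: f"echo: {text}"}

print(safe_dispatch(tools, "echo", {"text": "hi"}))    # echo: hi
print(safe_dispatch(tools, "missing", {}))             # Error: unknown tool 'missing'.
print(safe_dispatch(tools, "echo", {"wrong": "key"}))  # Error calling echo: ...
```

Feeding the error string back in the second pass often lets the model self-correct and retry the call with fixed arguments.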
Also read: Top 10 Gemma 4 Projects That Will Blow Your Mind
Conclusion
You have created a real AI agent that uses Gemma 4's function-calling feature, and it runs entirely locally. The agent uses all the components of a production architecture. Possible next steps include:
- adding a file-system tool that allows reading and writing local files on demand;
- using a SQL database to answer natural-language data queries;
- creating a memory tool that writes session summaries to disk, giving the agent the ability to recall past conversations.
The open-weight AI agent ecosystem is evolving quickly. Gemma 4's native support for structured function calling gives you substantial autonomous functionality without any reliance on the cloud. Start small, get a working system running, and the building blocks for your next projects will be ready for you to chain together.
Technical content strategist and communicator with a decade of experience in content creation and distribution across national media, the Government of India, and private platforms.

