# Agent Gemma: Gemma 3n E2B Fine-Tuned for Function Calling

A fine-tuned version of `google/gemma-3n-E2B-it` trained for on-device function calling using Google's FunctionGemma technique.
## What's Different from Stock Gemma 3n

### Fixed: `format_function_declaration` Template Error

The stock Gemma 3n chat template calls `format_function_declaration()`, a custom Jinja2 function available in Google's Python tokenizer but not supported by LiteRT-LM's on-device template engine. This causes:
```
Failed to apply template: unknown function: format_function_declaration is unknown (in template:21)
```
This model replaces the stock template with a LiteRT-LM compatible template that uses only standard Jinja2 features (the `tojson` filter and the `<start_function_declaration>` / `<end_function_declaration>` markers). The template is embedded in both `tokenizer_config.json` and `chat_template.jinja`.
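To illustrate the constraint, the declaration loop in such a template can rely on nothing beyond standard Jinja2. The fragment below is a minimal sketch of the idea (variable names assumed), not the shipped template:

```jinja
{#- Sketch only: emits one declaration block per tool using the
    standard Jinja2 tojson filter. -#}
{%- for tool in tools -%}
<start_function_declaration>{{ tool | tojson }}<end_function_declaration>
{%- endfor -%}
```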
## Function Calling Format

The model uses the FunctionGemma markup format:
```
<start_function_call>call:function_name{param:<escape>value<escape>}<end_function_call>
```
Tool declarations are formatted as:
```
<start_function_declaration>{"name": "get_weather", "parameters": {...}}<end_function_declaration>
```
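The call markup is simple enough to extract with regular expressions. The sketch below is an illustration based on the single example format shown above, not an official parser; multi-parameter handling is an assumption:

```python
import re

# Matches <start_function_call>call:name{...}<end_function_call> spans
CALL_RE = re.compile(
    r"<start_function_call>call:(\w+)\{(.*?)\}<end_function_call>", re.DOTALL
)
# Matches param:<escape>value<escape> pairs inside the braces
PARAM_RE = re.compile(r"(\w+):<escape>(.*?)<escape>", re.DOTALL)

def parse_calls(text: str) -> list[dict]:
    """Extract function calls from FunctionGemma-style model output."""
    calls = []
    for name, body in CALL_RE.findall(text):
        params = {key: value for key, value in PARAM_RE.findall(body)}
        calls.append({"name": name, "params": params})
    return calls

sample = ("<start_function_call>call:get_weather"
          "{location:<escape>Tokyo<escape>}<end_function_call>")
print(parse_calls(sample))
# prints [{'name': 'get_weather', 'params': {'location': 'Tokyo'}}]
```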
## Training Details

- Base model: google/gemma-3n-E2B-it (5.4B parameters)
- Method: QLoRA (rank=16, alpha=32), 22.9M trainable parameters (0.42%)
- Dataset: google/mobile-actions (8,693 training samples)
- Training: 500 steps, batch_size=1, max_seq_length=512, learning_rate=2e-4
- Precision: bfloat16
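The quoted trainable fraction follows directly from the parameter counts above:

```python
# Sanity check of the trainable-parameter percentage quoted in this card.
total_params = 5.4e9       # base model size
trainable_params = 22.9e6  # LoRA adapters (rank=16, alpha=32)

pct = 100 * trainable_params / total_params
print(f"{pct:.2f}%")  # prints 0.42%
```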
## Usage

### With LiteRT-LM on Android (Kotlin)
```kotlin
// After converting to .litertlm format
val engine = Engine(EngineConfig(modelPath = "agent-gemma.litertlm"))
engine.initialize()

val conversation = engine.createConversation(
    ConversationConfig(
        systemMessage = Message.of("You are a helpful assistant."),
        tools = listOf(MyToolSet()) // @Tool-annotated class
    )
)

// No format_function_declaration error!
conversation.sendMessageAsync(Message.of("What's the weather?"))
    .collect { print(it) }
```
### With Transformers (Python)
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("kontextdev/agent-gemma")
tokenizer = AutoTokenizer.from_pretrained("kontextdev/agent-gemma")

messages = [
    {"role": "developer", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the weather in Tokyo?"},
]
tools = [{
    "function": {
        "name": "get_weather",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
        },
    }
}]

text = tokenizer.apply_chat_template(
    messages, tools=tools, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(text, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0]))
```
## Chat Template

The custom chat template (in `tokenizer_config.json` and `chat_template.jinja`) supports these roles:

- `developer` / `system`: system instructions + tool declarations
- `user`: user messages
- `model` / `assistant`: model responses, including tool calls
- `tool`: tool execution results
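A full tool round-trip touches every role in this list. The message list below is a hypothetical illustration of that flow; the exact content shape of the `tool` result is an assumption, not taken from this card:

```python
# Hypothetical multi-turn exchange covering all four supported roles.
messages = [
    {"role": "developer", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the weather in Tokyo?"},
    # Model emits a FunctionGemma-format call
    {"role": "model",
     "content": "<start_function_call>call:get_weather"
                "{location:<escape>Tokyo<escape>}<end_function_call>"},
    # Tool result is fed back for the final answer turn
    {"role": "tool", "content": '{"temperature_c": 21, "condition": "sunny"}'},
]

roles = [m["role"] for m in messages]
print(roles)  # prints ['developer', 'user', 'model', 'tool']
```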
## Converting to `.litertlm`
Use the LiteRT-LM conversion tools to package for on-device deployment:
```shell
# The chat_template.jinja is included in this repo
python scripts/convert-to-litertlm.py \
    --model_dir kontextdev/agent-gemma \
    --output agent-gemma.litertlm
```
## Files

- `model-*.safetensors`: merged model weights (bfloat16)
- `tokenizer_config.json`: tokenizer config with embedded chat template
- `chat_template.jinja`: standalone chat template file
- `config.json`: model architecture config
- `checkpoint-*`: training checkpoints (LoRA)
## License
This model inherits the Gemma license from the base model.