Agent Gemma: Gemma 3n E2B Fine-Tuned for Function Calling

A fine-tuned version of google/gemma-3n-E2B-it trained for on-device function calling using Google's FunctionGemma technique.

What's Different from Stock Gemma 3n

Fixed: format_function_declaration Template Error

The stock Gemma 3n chat template uses format_function_declaration(), a custom Jinja2 function available in Google's Python tokenizer but not supported by LiteRT-LM's on-device template engine. This causes:

Failed to apply template: unknown function: format_function_declaration is unknown (in template:21)

This model replaces the stock template with a LiteRT-LM compatible template that uses only standard Jinja2 features (tojson filter, <start_function_declaration> / <end_function_declaration> markers). The template is embedded in both tokenizer_config.json and chat_template.jinja.
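As an illustration of what the replacement template produces (this is an emulation in Python, not the template's actual Jinja source), each tool declaration is rendered as plain JSON between the declaration markers, which any standard Jinja2 engine can do with the built-in tojson filter:

```python
import json

# Emulates what the LiteRT-LM-compatible template emits for one tool:
# standard JSON serialization wrapped in the declaration markers.
def render_declaration(tool: dict) -> str:
    return ("<start_function_declaration>"
            + json.dumps(tool)
            + "<end_function_declaration>")

decl = render_declaration(
    {"name": "get_weather",
     "parameters": {"type": "object",
                    "properties": {"location": {"type": "string"}}}})
```

No custom template functions are involved, which is exactly why the template survives LiteRT-LM's stricter engine.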

Function Calling Format

The model uses the FunctionGemma markup format:

<start_function_call>call:function_name{param:<escape>value<escape>}<end_function_call>

Tool declarations are formatted as:

<start_function_declaration>{"name": "get_weather", "parameters": {...}}<end_function_declaration>
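A small Python sketch of building a call in this markup (assuming parameter values are wrapped verbatim in the <escape> markers; escaping of nested markers is not handled):

```python
# Hedged sketch of the FunctionGemma call markup; the exact escaping rules
# for values containing markers are an open assumption here.
def format_call(name: str, params: dict) -> str:
    body = "".join(f"{k}:<escape>{v}<escape>" for k, v in params.items())
    return f"<start_function_call>call:{name}{{{body}}}<end_function_call>"

format_call("get_weather", {"location": "Tokyo"})
# -> '<start_function_call>call:get_weather{location:<escape>Tokyo<escape>}<end_function_call>'
```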

Training Details

  • Base model: google/gemma-3n-E2B-it (5.4B parameters)
  • Method: QLoRA (rank=16, alpha=32), 22.9M trainable parameters (0.42%)
  • Dataset: google/mobile-actions (8,693 training samples)
  • Training: 500 steps, batch_size=1, max_seq_length=512, learning_rate=2e-4
  • Precision: bfloat16
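The hyperparameters above map onto a typical transformers + peft QLoRA setup. This is a hedged sketch of what such a configuration could look like; target_modules and dropout are illustrative assumptions, not the actual training script:

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization of the frozen base model (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # matches the bfloat16 precision above
)

# LoRA adapters with the rank/alpha reported above; which projection
# layers were targeted is an assumption.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```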

Usage

With LiteRT-LM on Android (Kotlin)

// After converting to .litertlm format
val engine = Engine(EngineConfig(modelPath = "agent-gemma.litertlm"))
engine.initialize()

val conversation = engine.createConversation(
    ConversationConfig(
        systemMessage = Message.of("You are a helpful assistant."),
        tools = listOf(MyToolSet())  // @Tool annotated class
    )
)

// No format_function_declaration error!
conversation.sendMessageAsync(Message.of("What's the weather?"))
    .collect { print(it) }

With Transformers (Python)

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("kontextdev/agent-gemma")
tokenizer = AutoTokenizer.from_pretrained("kontextdev/agent-gemma")

messages = [
    {"role": "developer", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the weather in Tokyo?"}
]

tools = [{"function": {"name": "get_weather", "parameters": {"type": "object", "properties": {"location": {"type": "string"}}}}}]

text = tokenizer.apply_chat_template(messages, tools=tools, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0]))
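The decoded text will contain a FunctionGemma call span in the markup described above; one minimal way to extract it (the regex and escaping assumptions are mine, not from the model card):

```python
import re

CALL_RE = re.compile(
    r"<start_function_call>call:(\w+)\{(.*?)\}<end_function_call>", re.S)
PARAM_RE = re.compile(r"(\w+):<escape>(.*?)<escape>", re.S)

def parse_call(text):
    """Return (name, params) for the first function call found, else None."""
    m = CALL_RE.search(text)
    if not m:
        return None
    return m.group(1), dict(PARAM_RE.findall(m.group(2)))

sample = ("<start_function_call>call:get_weather"
          "{location:<escape>Tokyo<escape>}<end_function_call>")
parse_call(sample)  # -> ("get_weather", {"location": "Tokyo"})
```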

Chat Template

The custom chat template (in tokenizer_config.json and chat_template.jinja) supports these roles:

  • developer / system – system instructions + tool declarations
  • user – user messages
  • model / assistant – model responses, including tool_calls
  • tool – tool execution results

Converting to .litertlm

Use the LiteRT-LM conversion tools to package for on-device deployment:

# The chat_template.jinja is included in this repo
python scripts/convert-to-litertlm.py \
  --model_dir kontextdev/agent-gemma \
  --output agent-gemma.litertlm

Files

  • model-*.safetensors – Merged model weights (bfloat16)
  • tokenizer_config.json – Tokenizer config with embedded chat template
  • chat_template.jinja – Standalone chat template file
  • config.json – Model architecture config
  • checkpoint-* – Training checkpoints (LoRA)

License

This model inherits the Gemma license from the base model.
