ABC-Bench collection: Evaluating Agentic Backend Coding Capabilities in Real-World Development Scenarios (4 items).
How to use OpenMOSS-Team/Qwen3-32B-ABC with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="OpenMOSS-Team/Qwen3-32B-ABC")
messages = [
{"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("OpenMOSS-Team/Qwen3-32B-ABC")
model = AutoModelForCausalLM.from_pretrained("OpenMOSS-Team/Qwen3-32B-ABC")
messages = [
{"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
messages,
add_generation_prompt=True,
tokenize=True,
return_dict=True,
return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

How to use OpenMOSS-Team/Qwen3-32B-ABC with vLLM:
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "OpenMOSS-Team/Qwen3-32B-ABC"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "OpenMOSS-Team/Qwen3-32B-ABC",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
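Once the server is up, you can also call it from Python with the official openai client. A minimal sketch, assuming vLLM's default OpenAI-compatible endpoint on port 8000 (vLLM does not check API keys by default, so the api_key is a placeholder):

# Query the vLLM server through its OpenAI-compatible API
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="OpenMOSS-Team/Qwen3-32B-ABC",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(response.choices[0].message.content)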
How to use OpenMOSS-Team/Qwen3-32B-ABC with SGLang:
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
--model-path "OpenMOSS-Team/Qwen3-32B-ABC" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "OpenMOSS-Team/Qwen3-32B-ABC",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'

# Alternatively, start the SGLang server with Docker:
docker run --gpus all \
--shm-size 32g \
-p 30000:30000 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_TOKEN=<secret>" \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server \
--model-path "OpenMOSS-Team/Qwen3-32B-ABC" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl as shown above.

How to use OpenMOSS-Team/Qwen3-32B-ABC with Docker Model Runner:
docker model run hf.co/OpenMOSS-Team/Qwen3-32B-ABC
Qwen3-32B-ABC is a supervised fine-tuned (SFT) variant of Qwen/Qwen3-32B, trained for agentic backend coding and tool-using / instruction-following behaviors.
This model was fine-tuned on the nex-agi/agent-sft dataset.
Please refer to the dataset card for detailed documentation, licensing, and usage constraints.
Following the ABC-Bench paper’s evaluation protocol:
| Model | Setting | Average Pass@1 over 3 attempts (%) |
|---|---|---|
| Qwen3-32B-ABC | w/ SFT | 33.8 |
| Qwen3-32B | w/o SFT | 8.9 |
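For reference, a minimal sketch of how such a number can be computed, assuming each benchmark task is attempted 3 times independently and per-attempt pass rates are averaged across tasks (the result structure below is illustrative, not the paper's evaluation harness):

# Average Pass@1 over 3 attempts: mean pass rate across tasks and attempts
def average_pass_at_1(results: list[list[bool]]) -> float:
    """results[i][j] is True if attempt j on task i passed all tests."""
    per_task = [sum(attempts) / len(attempts) for attempts in results]
    return 100.0 * sum(per_task) / len(per_task)

# Example: 3 tasks, 3 attempts each
print(average_pass_at_1([[True, False, False], [False, False, False], [True, True, False]]))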
Qwen3-32B-ABC is intended for agentic backend coding tasks and for tool-using / instruction-following workloads. Quickstart example:
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
model_id = "OpenMOSS-Team/Qwen3-32B-ABC"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
model_id,
torch_dtype=torch.bfloat16,
device_map="auto",
trust_remote_code=True,
)
# Build a chat-formatted prompt, mirroring the Transformers example above
messages = [
    {"role": "user", "content": "Write a FastAPI endpoint that returns health status as JSON."}
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
    )

# Decode only the newly generated tokens
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
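Since the model is also trained for tool use, tool schemas can be passed through the chat template. A minimal sketch, assuming the model's chat template accepts transformers' tools argument (the get_weather function below is a hypothetical example, not part of the model or dataset):

# Hypothetical tool definition; transformers derives a JSON schema
# from the type hints and the Google-style docstring
def get_weather(city: str) -> str:
    """Get the current weather for a city.

    Args:
        city: Name of the city.
    """
    ...

messages = [{"role": "user", "content": "What's the weather in Paris?"}]
inputs = tokenizer.apply_chat_template(
    messages,
    tools=[get_weather],
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))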
If you use this model or benchmark, please cite the ABC-Bench paper:

@misc{yang2026abcbenchbenchmarkingagenticbackend,
title={ABC-Bench: Benchmarking Agentic Backend Coding in Real-World Development},
author={Jie Yang and Honglin Guo and Li Ji and Jiazheng Zhou and Rui Zheng and Zhikai Lei and Shuo Zhang and Zhiheng Xi and Shichun Liu and Yuxin Wang and Bo Wang and Yining Zheng and Tao Gui and Xipeng Qiu},
year={2026},
eprint={2601.11077},
archivePrefix={arXiv},
primaryClass={cs.SE},
url={https://arxiv.org/abs/2601.11077},
}