Roblox Luau Mistral 7B – SFT
Recommended: use the improved RFT version instead. The RFT model scores higher across all dimensions (+5% composite), thanks to reinforcement fine-tuning with Claude-as-judge reward scoring.
A supervised fine-tuned LoRA adapter for generating production-ready Roblox Luau scripts. Built on Mistral-7B-Instruct-v0.3 and trained on curated Luau code from the-luau-stack + Claude-generated gold examples.
Part of the Roblox Luau Code Generator project for the W&B Fine-Tuning Hackathon.
What it does
Given a natural language description of a Roblox game feature, this model generates complete, runnable Luau scripts that follow Roblox best practices – proper service access, modern API usage, error handling, and clean code structure.
Training
Data Pipeline
- the-luau-stack (300 examples) – High-quality Luau scripts filtered from TorpedoSoftware/the-luau-stack using quality heuristics (minimum length, function/comment density, Roblox API usage signals)
- Reverse labeling – Claude Sonnet 4.5 generated task descriptions for each filtered script, creating (task, code) pairs
- Gold examples (50 tasks) – Claude Sonnet 4.5 generated reference implementations for 50 hand-written Roblox development tasks spanning NPC behavior, game mechanics, data persistence, UI systems, physics, and networking
- Quality filtering – All examples scored by 4 deterministic scorers (syntax, API correctness, bug detection, code quality). Only examples scoring ≥0.75 (stack) / ≥0.80 (gold) were kept
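The filtering step above can be sketched as follows. This is a minimal illustration, not the published pipeline code: the exact aggregation of the four scorer outputs is not documented here, so this assumes the composite is their mean, and the function and key names are hypothetical.

```python
# Hedged sketch of the quality-filtering step (assumed mean aggregation).

def composite_score(scores: dict) -> float:
    """Average the four deterministic scorer outputs (assumed aggregation)."""
    return sum(scores.values()) / len(scores)

def keep_example(scores: dict, source: str) -> bool:
    """Keep stack examples scoring >= 0.75 and gold examples scoring >= 0.80."""
    threshold = 0.80 if source == "gold" else 0.75
    return composite_score(scores) >= threshold

example = {"syntax": 0.9, "api": 0.8, "bugs": 0.7, "quality": 0.8}
print(keep_example(example, "stack"))  # mean 0.80 clears the 0.75 bar
```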
Training Config
| Parameter | Value |
|---|---|
| Base model | mistralai/Mistral-7B-Instruct-v0.3 |
| Method | QLoRA (4-bit NF4) |
| LoRA rank | 64 |
| LoRA alpha | 128 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Dropout | 0.05 |
| Epochs | 3 |
| Batch size | 1 (Γ8 gradient accumulation) |
| Learning rate | 2e-4 |
| Max sequence length | 8192 |
| Precision | bf16 |
| Gradient checkpointing | Yes |
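The table translates into the standard transformers/peft QLoRA objects roughly as below. This is a hedged configuration sketch, not the actual training script (which is not published here); hyperparameter values mirror the table.

```python
# QLoRA setup implied by the training table (sketch, assuming transformers/peft).
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # 4-bit NF4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # bf16 compute
)

lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

# Remaining table entries map onto trainer arguments: per-device batch size 1
# with 8 gradient-accumulation steps (effective batch 8), lr 2e-4, 3 epochs,
# max sequence length 8192, gradient checkpointing enabled.
```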
Scorers
The model was evaluated with 4 deterministic scorers:
- Syntax – Bracket/block balance, Python-ism detection
- API Correctness – `game:GetService()` usage, deprecated API detection, valid service names
- Bug Detection – Unchecked `FindFirstChild`, infinite loops without a yield, DataStore calls without `pcall`, global variables
- Code Quality – Comment density, section organization, naming conventions, completeness
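To make "deterministic scorer" concrete, here is a toy version of the Syntax scorer: bracket-balance plus Python-ism detection. The real scorers are not published here; the penalty weights and the list of Python-isms below are illustrative assumptions.

```python
# Toy deterministic syntax scorer for Luau source (illustrative, not the real one).

def syntax_score(code: str) -> float:
    score = 1.0
    # Bracket balance: each unbalanced pair costs 0.5 (assumed weight).
    for open_ch, close_ch in (("(", ")"), ("{", "}"), ("[", "]")):
        if code.count(open_ch) != code.count(close_ch):
            score -= 0.5
    # Python-isms that do not occur in valid Luau (Luau uses elseif/nil, not elif/None).
    for pythonism in ("elif", "def ", "None", "pass"):
        if pythonism in code:
            score -= 0.25
    return max(score, 0.0)

luau = 'local coins = {}\nif coins then print("ok") end'
python_like = "def add(a, b):\n    return None"
print(syntax_score(luau), syntax_score(python_like))
```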
Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load the base model and attach the SFT LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.3",
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "squaredcuber/roblox-luau-mistral-7b")
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.3")

messages = [
    {"role": "system", "content": "You are an expert Roblox Luau programmer. Generate complete, production-ready Luau scripts. Output only code, no markdown."},
    {"role": "user", "content": "Create a coin collection system with leaderstats, sound effects, and respawning coins"},
]
inputs = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt").to("cuda")

with torch.no_grad():
    output = model.generate(inputs, max_new_tokens=4096, temperature=0.7, do_sample=True)

# Decode only the newly generated tokens
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```
With vLLM (recommended for serving)
```shell
python -m vllm.entrypoints.openai.api_server \
    --model mistralai/Mistral-7B-Instruct-v0.3 \
    --enable-lora \
    --lora-modules sft=squaredcuber/roblox-luau-mistral-7b \
    --max-lora-rank 64
```
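Once the server is up, the adapter is addressed by the name given to `--lora-modules` ("sft") rather than the base model name. A minimal client sketch, assuming vLLM's default endpoint at localhost:8000:

```python
# Query the vLLM OpenAI-compatible endpoint (sketch; assumes default host/port).
import json
import urllib.request

payload = {
    "model": "sft",  # LoRA module name from --lora-modules, not the base model
    "messages": [
        {"role": "system", "content": "You are an expert Roblox Luau programmer. Output only code, no markdown."},
        {"role": "user", "content": "Create a sprint system bound to Left Shift"},
    ],
    "temperature": 0.7,
    "max_tokens": 4096,
}

request = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# With the server running:
# response = urllib.request.urlopen(request)
# print(json.load(response)["choices"][0]["message"]["content"])
```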
Intended Use
- Generating Roblox Luau scripts from natural language descriptions
- Rapid prototyping of game features in Roblox Studio
- Educational tool for learning Roblox development patterns
Limitations
- Trained primarily on server-side scripts – client-side LocalScript patterns may be weaker
- May occasionally use deprecated API patterns despite training emphasis on modern APIs
- Complex multi-script architectures (ModuleScript dependencies) may not always be coherent
- Generated code should be reviewed before use in production games
Related Models
- RFT version: squaredcuber/roblox-luau-mistral-7b-rft – Reinforcement fine-tuned with Claude-as-judge scoring, generally produces higher quality output
Evaluation results
| Metric | Score (self-reported) |
|---|---|
| Syntax Score | 0.920 |
| API Correctness | 0.880 |
| Bug-Free Score | 0.850 |
| Quality Score | 0.820 |
| Composite Score | 0.870 |