Roblox Luau Mistral 7B (SFT)

Recommended: Use the improved RFT version instead. The RFT model scores higher across all dimensions (+5% composite) thanks to reinforcement fine-tuning with Claude-as-judge reward scoring.

A supervised fine-tuned LoRA adapter for generating production-ready Roblox Luau scripts. Built on Mistral-7B-Instruct-v0.3 and trained on curated Luau code from the-luau-stack + Claude-generated gold examples.

Part of the Roblox Luau Code Generator project for the W&B Fine-Tuning Hackathon.

What it does

Given a natural language description of a Roblox game feature, this model generates complete, runnable Luau scripts that follow Roblox best practices: proper service access, modern API usage, error handling, and clean code structure.

Training

Data Pipeline

  1. the-luau-stack (300 examples) - High-quality Luau scripts filtered from TorpedoSoftware/the-luau-stack using quality heuristics (minimum length, function/comment density, Roblox API usage signals)
  2. Reverse labeling - Claude Sonnet 4.5 generated task descriptions for each filtered script, creating (task, code) pairs
  3. Gold examples (50 tasks) - Claude Sonnet 4.5 generated reference implementations for 50 hand-written Roblox development tasks spanning NPC behavior, game mechanics, data persistence, UI systems, physics, and networking
  4. Quality filtering - All examples were scored by 4 deterministic scorers (syntax, API correctness, bug detection, code quality); only examples scoring ≥0.75 (stack) / ≥0.80 (gold) were kept
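The exact filter heuristics are not published; below is a minimal sketch of what the length/density pre-filter in step 1 might look like. All thresholds and the `ROBLOX_API_HINTS` list are illustrative assumptions, not the actual values used.

```python
import re

# Hypothetical thresholds -- the real filter values are not published.
MIN_LINES = 20
MIN_COMMENT_DENSITY = 0.05  # comment lines / non-empty lines
ROBLOX_API_HINTS = ("game:GetService", "Instance.new", "workspace", "task.")

def passes_quality_filter(source: str) -> bool:
    """Heuristic pre-filter for Luau scripts, in the spirit of the pipeline above."""
    lines = [ln for ln in source.splitlines() if ln.strip()]
    if len(lines) < MIN_LINES:
        return False
    # Luau comments start with "--"
    comments = sum(1 for ln in lines if ln.lstrip().startswith("--"))
    if comments / len(lines) < MIN_COMMENT_DENSITY:
        return False
    has_function = bool(re.search(r"\bfunction\b", source))
    uses_roblox_api = any(hint in source for hint in ROBLOX_API_HINTS)
    return has_function and uses_roblox_api
```

Scripts passing this pre-filter would then go on to reverse labeling and the scorer-based filtering in step 4.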

Training Config

| Parameter | Value |
| --- | --- |
| Base model | mistralai/Mistral-7B-Instruct-v0.3 |
| Method | QLoRA (4-bit NF4) |
| LoRA rank | 64 |
| LoRA alpha | 128 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Dropout | 0.05 |
| Epochs | 3 |
| Batch size | 1 (×8 gradient accumulation) |
| Learning rate | 2e-4 |
| Max sequence length | 8192 |
| Precision | bf16 |
| Gradient checkpointing | Yes |
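The table above maps directly onto a peft/bitsandbytes QLoRA setup. A hedged sketch of the corresponding config objects (the actual training script is not included in this card):

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization for the frozen base model (QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # matches the bf16 precision row
)

# LoRA adapter hyperparameters from the table above.
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
```

These configs would be passed to `AutoModelForCausalLM.from_pretrained(..., quantization_config=bnb_config)` and `get_peft_model(model, lora_config)` respectively.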

Scorers

The model was evaluated with 4 deterministic scorers:

  • Syntax - Bracket/block balance, Python-ism detection
  • API Correctness - game:GetService() usage, deprecated API detection, valid service names
  • Bug Detection - Unchecked FindFirstChild, infinite loops without a yield, DataStore calls without pcall, global variables
  • Code Quality - Comment density, section organization, naming conventions, completeness
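The scorer implementations are not reproduced here; as an illustration, a regex-based bug-detection scorer along the lines of the checks above might look like the following. The per-issue penalty weights are invented for the sketch.

```python
import re

def bug_detection_score(source: str) -> float:
    """Illustrative bug-detection scorer: starts at 1.0, subtracts per issue."""
    score = 1.0
    # DataStore calls should be wrapped in pcall; flag scripts that use
    # DataStore APIs but never call pcall anywhere.
    if "GetDataStore" in source and "pcall" not in source:
        score -= 0.3
    # `while true do` loops with no wait/task.wait yield will freeze the server.
    for body in re.findall(r"while\s+true\s+do(.*?)end", source, flags=re.DOTALL):
        if not re.search(r"\b(task\.wait|wait)\s*\(", body):
            score -= 0.3
    # FindFirstChild can return nil; flag results that are indexed immediately.
    if re.search(r"FindFirstChild\([^)]*\)\.", source):
        score -= 0.2
    return max(score, 0.0)
```

A composite score would then average this with the syntax, API-correctness, and code-quality scorers before applying the ≥0.75 / ≥0.80 thresholds.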

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.3",
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "squaredcuber/roblox-luau-mistral-7b")
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.3")

messages = [
    {"role": "system", "content": "You are an expert Roblox Luau programmer. Generate complete, production-ready Luau scripts. Output only code, no markdown."},
    {"role": "user", "content": "Create a coin collection system with leaderstats, sound effects, and respawning coins"},
]

inputs = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt").to("cuda")
with torch.no_grad():
    output = model.generate(inputs, max_new_tokens=4096, temperature=0.7, do_sample=True)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))

With vLLM (recommended for serving)

python -m vllm.entrypoints.openai.api_server \
    --model mistralai/Mistral-7B-Instruct-v0.3 \
    --enable-lora \
    --lora-modules sft=squaredcuber/roblox-luau-mistral-7b \
    --max-lora-rank 64
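Once the server is up, the adapter is addressed by the LoRA module name registered above (`sft`). A sketch of the request body for vLLM's OpenAI-compatible chat endpoint (host and port are the vLLM defaults; the prompt is an example):

```python
import json

# Body for POST http://localhost:8000/v1/chat/completions
# "sft" is the adapter name registered via --lora-modules above.
payload = {
    "model": "sft",
    "messages": [
        {"role": "system", "content": "You are an expert Roblox Luau programmer. "
                                      "Output only code, no markdown."},
        {"role": "user", "content": "Create a sprint system bound to Left Shift"},
    ],
    "max_tokens": 4096,
    "temperature": 0.7,
}
print(json.dumps(payload, indent=2))
```

Send it with any OpenAI-compatible client, e.g. `requests.post(url, json=payload)` or the `openai` Python package pointed at `base_url="http://localhost:8000/v1"`.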

Intended Use

  • Generating Roblox Luau scripts from natural language descriptions
  • Rapid prototyping of game features in Roblox Studio
  • Educational tool for learning Roblox development patterns

Limitations

  • Trained primarily on server-side scripts; client-side LocalScript patterns may be weaker
  • May occasionally use deprecated API patterns despite training emphasis on modern APIs
  • Complex multi-script architectures (ModuleScript dependencies) may not always be coherent
  • Generated code should be reviewed before use in production games
