LaaLM-exp-v1-GGUF: Linux Terminal Emulation via Language Model
Quantized GGUF builds of LaaLM-exp-v1, an experimental 3B-parameter model that emulates a Linux terminal entirely through conversation.
What is LaaLM?
LaaLM (Linux as a Language Model) is an experimental AI model that learned to behave like a Linux terminal without any external code or state management. Unlike traditional terminal emulators that track files and directories with actual data structures, LaaLM maintains the entire filesystem state purely in its "memory" as a language model.
Think of it as teaching an AI to simulate a computer's filesystem by learning patterns from thousands of terminal sessions. The model learned:
- Where files are located in the directory tree
- What content each file contains
- How commands modify the filesystem
- When to show error messages for invalid operations
The Innovation: This demonstrates that language models can learn to maintain complex, stateful systems through conversation context alone - no external state code required, just learning from examples.
What Can It Do?
LaaLM supports 12 common Linux commands with 95.4% accuracy on benchmark tests:
| Command | What It Does | Example |
|---|---|---|
| `pwd` | Shows current directory | `pwd` → `/home/user` |
| `ls` | Lists files | `ls` → `file.txt folder/` |
| `cd` | Changes directory | `cd folder` |
| `mkdir` | Creates directory | `mkdir newfolder` |
| `touch` | Creates empty file | `touch document.txt` |
| `echo` | Prints text | `echo hello world` |
| `echo >` | Writes text to file | `echo "content" > file.txt` |
| `cat` | Shows file contents | `cat file.txt` → `content` |
| `grep` | Searches text in files | `grep "word" file.txt` |
| `cp` | Copies files | `cp source.txt dest.txt` |
| `mv` | Moves/renames files | `mv old.txt new.txt` |
| `rm` | Deletes files | `rm file.txt` |
Example Session:
```
$ pwd
/home/user
$ touch myfile.txt
$ ls
myfile.txt
$ echo "Hello, Linux!" > myfile.txt
$ cat myfile.txt
Hello, Linux!
$ mkdir documents
$ mv myfile.txt documents/
$ cd documents
$ pwd
/home/user/documents
$ ls
myfile.txt
```
The model remembers all these changes throughout the conversation - which files exist, where they are, and what's inside them.
Performance Benchmarks
Tested on 130 diverse scenarios across 6 categories:
| Category | Accuracy | Tests Passed |
|---|---|---|
| Basic Commands (pwd, ls, echo) | 100% | 20/20 |
| File Creation (touch, echo >) | 100% | 20/20 |
| File Operations (rm, mv, cp) | 100% | 30/30 |
| File Content (cat, grep) | 100% | 20/20 |
| Error Handling | 75% | 15/20 |
| State Persistence (multi-step) | 95% | 19/20 |
Overall: 95.4% (124/130 tests passed)
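The evaluation above used exact output matching. A harness in that style can be sketched as below; the test cases, the `dummy_session` executor, and the category names here are illustrative assumptions, not the project's actual test suite.

```python
# Minimal sketch of an exact-output-matching harness (illustrative only).
# Each case is (category, command sequence, expected output of final command).
def evaluate(cases, run_session):
    per_category = {}
    for category, commands, expected in cases:
        outputs = run_session(commands)    # run the whole sequence fresh
        passed = outputs[-1] == expected   # exact match on the final output
        stats = per_category.setdefault(category, [0, 0])
        stats[0] += int(passed)
        stats[1] += 1
    total_passed = sum(p for p, _ in per_category.values())
    total = sum(n for _, n in per_category.values())
    return per_category, total_passed / total

# Dummy executor standing in for the model: tracks files with real code,
# which is exactly what LaaLM itself does NOT do.
def dummy_session(commands):
    files, outputs = set(), []
    for cmd in commands:
        if cmd.startswith("touch "):
            files.add(cmd.split()[1])
            outputs.append("")
        elif cmd == "ls":
            outputs.append(" ".join(sorted(files)))
        else:
            outputs.append("")
    return outputs

cases = [
    ("Basic", ["ls"], ""),
    ("State Persistence", ["touch a.txt", "touch b.txt", "ls"], "a.txt b.txt"),
]
per_cat, accuracy = evaluate(cases, dummy_session)
print(accuracy)  # 1.0
```

Swapping `dummy_session` for a function that drives the model (as in the Python examples below) turns this into a real regression check.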
Available Quantizations
GGUF quantization compresses the model for faster CPU inference and lower memory usage. Choose based on your hardware and quality needs:
| File | Quant Level | Size | RAM Needed | Best For |
|---|---|---|---|---|
| exp-v1-Q2_K.gguf | Q2_K | 1.27 GB | ~2 GB | Minimum resources, acceptable quality |
| exp-v1-Q3_K_S.gguf | Q3_K_S | 1.45 GB | ~2 GB | Small footprint |
| exp-v1-Q3_K_M.gguf | Q3_K_M | 1.59 GB | ~2.5 GB | Balanced small size |
| exp-v1-Q3_K_L.gguf | Q3_K_L | 1.71 GB | ~2.5 GB | Better Q3 quality |
| exp-v1-IQ4_XS.gguf | IQ4_XS | 1.75 GB | ~3 GB | High quality at small size |
| exp-v1-Q4_K_S.gguf | Q4_K_S | 1.83 GB | ~3 GB | Good balance |
| exp-v1-Q4_K_M.gguf | Q4_K_M | 1.93 GB | ~3 GB | โญ Recommended |
| exp-v1-Q5_K_S.gguf | Q5_K_S | 2.17 GB | ~3.5 GB | High quality |
| exp-v1-Q5_K_M.gguf | Q5_K_M | 2.22 GB | ~3.5 GB | Higher quality |
| exp-v1-Q6_K.gguf | Q6_K | 2.54 GB | ~4 GB | Near-original quality |
| exp-v1-Q8_0.gguf | Q8_0 | 3.29 GB | ~5 GB | Maximum quality |
| exp-v1-fp16.gguf | fp16 | 6.18 GB | ~8 GB | Original precision |
Recommendation: Q4_K_M provides the best balance of quality and efficiency. Use Q6_K or Q8_0 if you need maximum accuracy.
Quality Expectations by Quantization Level
- Q2_K - Q3_K: May occasionally make mistakes on complex file operations or long conversation histories
- Q4_K_M - Q5_K_M: Near-original quality - recommended for most users
- Q6_K - fp16: Virtually identical to original model performance
Installation & Usage
Method 1: llama.cpp (C++, fastest)
Download the model:
```bash
huggingface-cli download LaaLM/LaaLM-exp-v1-GGUF exp-v1-Q4_K_M.gguf --local-dir .
```
Run it:
```bash
./llama-cli -m exp-v1-Q4_K_M.gguf \
  --color \
  --temp 0 \
  -p "You are a Linux terminal emulator. Initial state:
Current directory: /home/user
Files: (empty)
Environment: USER=user, HOME=/home/user

User: pwd
Assistant:"
```
Interactive mode:
```bash
./llama-cli -m exp-v1-Q4_K_M.gguf \
  -i \
  --temp 0 \
  --reverse-prompt "User:" \
  -p "You are a Linux terminal emulator. Initial state:
Current directory: /home/user
Files: (empty)
Environment: USER=user, HOME=/home/user"
```
Method 2: Ollama (easiest)
1. Create a Modelfile:
```
FROM ./exp-v1-Q4_K_M.gguf

SYSTEM """You are a Linux terminal emulator. Initial state:
Current directory: /home/user
Files: (empty)
Environment: USER=user, HOME=/home/user"""

PARAMETER temperature 0
PARAMETER top_p 1
PARAMETER stop "User:"
```
2. Create and run the model:
```bash
# Create the model
ollama create laalm-exp-v1 -f Modelfile

# Run interactively
ollama run laalm-exp-v1
```
3. Use it:
```
>>> pwd
/home/user
>>> touch readme.txt
(empty)
>>> echo "Welcome to LaaLM" > readme.txt
(empty)
>>> cat readme.txt
Welcome to LaaLM
>>> ls
readme.txt
```
Method 3: Python (llama-cpp-python)
Install:
```bash
pip install llama-cpp-python
```
Basic usage:
```python
from llama_cpp import Llama

# Load model
llm = Llama(
    model_path="exp-v1-Q4_K_M.gguf",
    n_ctx=2048,     # Context window
    n_threads=8,    # CPU threads
    verbose=False
)

# System prompt - required for initialization
system_prompt = """You are a Linux terminal emulator. Initial state:
Current directory: /home/user
Files: (empty)
Environment: USER=user, HOME=/home/user"""

# Conversation history
conversation = system_prompt

def run_command(cmd):
    global conversation
    # Add command to conversation
    conversation += f"\n\nUser: {cmd}\nAssistant:"
    # Generate response
    output = llm(
        conversation,
        max_tokens=150,
        temperature=0.0,  # Deterministic output
        stop=["User:", "\n\n"]
    )
    response = output['choices'][0]['text'].strip()
    conversation += " " + response
    return response

# Example session
print(run_command("pwd"))            # /home/user
print(run_command("mkdir project"))  # (empty)
print(run_command("cd project"))     # (empty)
print(run_command("pwd"))            # /home/user/project
print(run_command("touch main.py"))  # (empty)
print(run_command("ls"))             # main.py
```
Advanced: Full terminal emulator:
```python
from llama_cpp import Llama

class LinuxTerminal:
    def __init__(self, model_path, initial_dir="/home/user"):
        self.llm = Llama(
            model_path=model_path,
            n_ctx=2048,
            n_threads=8,
            verbose=False
        )
        self.conversation = f"""You are a Linux terminal emulator. Initial state:
Current directory: {initial_dir}
Files: (empty)
Environment: USER=user, HOME=/home/user"""

    def execute(self, command):
        """Execute a command and return the output"""
        self.conversation += f"\n\nUser: {command}\nAssistant:"
        output = self.llm(
            self.conversation,
            max_tokens=150,
            temperature=0.0,
            stop=["User:", "\n\n"]
        )
        response = output['choices'][0]['text'].strip()
        self.conversation += " " + response
        return response

    def run(self):
        """Interactive terminal session"""
        print("LaaLM Terminal Emulator - Type 'exit' to quit")
        print("=" * 50)
        while True:
            try:
                cmd = input("$ ")
                if cmd.lower() in ['exit', 'quit']:
                    break
                if cmd.strip():
                    output = self.execute(cmd)
                    if output:  # Only print non-empty outputs
                        print(output)
            except KeyboardInterrupt:
                print("\nExiting...")
                break
            except Exception as e:
                print(f"Error: {e}")

# Usage
terminal = LinuxTerminal("exp-v1-Q4_K_M.gguf")
terminal.run()
```
Understanding the System Prompt
The system prompt is critical - it tells the model what the initial filesystem looks like. Without it, the model won't know where to start.
Required Format
```
You are a Linux terminal emulator. Initial state:
Current directory: /home/user
Files: (empty)
Environment: USER=user, HOME=/home/user
```
Key Components
- Identity declaration - "You are a Linux terminal emulator"
- Starting directory - usually `/home/user`
- Initial files - list existing files or write "(empty)"
- Environment variables - at minimum `USER` and `HOME`
Starting with Existing Files
If you want to start with files already present:
```
You are a Linux terminal emulator. Initial state:
Current directory: /home/user
Files: document.txt, script.sh, folder/data.csv
Environment: USER=user, HOME=/home/user
```
Important Rules
- Set the system prompt only once at the start
- Do not update it with current state - the model learns to track changes from command history
- Include full conversation history when generating responses
- Use temperature 0 for deterministic, consistent outputs
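These rules can be wrapped in a small helper. `build_system_prompt` below is a hypothetical convenience function, not part of any LaaLM tooling; it simply assembles the documented format from Python values.

```python
def build_system_prompt(cwd="/home/user", files=None, env=None):
    """Compose the initial-state system prompt in the documented format.

    files: list of paths, or None/empty for "(empty)"
    env:   dict of environment variables (USER and HOME at minimum)
    """
    env = env or {"USER": "user", "HOME": "/home/user"}
    files_line = ", ".join(files) if files else "(empty)"
    env_line = ", ".join(f"{k}={v}" for k, v in env.items())
    return (
        "You are a Linux terminal emulator. Initial state:\n"
        f"Current directory: {cwd}\n"
        f"Files: {files_line}\n"
        f"Environment: {env_line}"
    )

print(build_system_prompt(files=["document.txt", "script.sh"]))
```

The result is set once at session start and never updated, per the rules above.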
How It Works: The Technical Magic
Traditional Terminal Emulator (External State)
```python
class Terminal:
    def __init__(self):
        self.cwd = "/home/user"
        self.files = {}  # Dictionary storing all files

    def execute(self, cmd):
        if cmd == "pwd":
            return self.cwd
        elif cmd.startswith("touch"):
            filename = cmd.split()[1]
            self.files[filename] = ""  # Explicit state update
```
LaaLM Approach (Internal State via Learning)
The model learned patterns like:

```
"User: touch file.txt"  → creates file.txt in memory
"User: ls"              → must remember file.txt exists
"User: cat file.txt"    → must recall this file was created
"User: rm file.txt"     → must remember to remove it
"User: ls"              → file.txt should NOT appear anymore
```
The model doesn't have a files dictionary or any code. It learned these patterns from seeing 10,000 training conversations (800,000 individual messages) showing how files should behave.
Training Data
- Base Model: Qwen 2.5-3B-Instruct
- Training Examples: 10,000 synthetic terminal conversations
- Commands per conversation: 30-50
- Total messages: 800,000
- Training method: Full fine-tuning (all parameters trained)
- Precision: BFloat16 with Flash Attention 2
- Hardware: Single A100 80GB GPU
- Training time: 34 minutes
- Cost: $0.68
Data generation used simulated Linux environments with:
- Random realistic filenames
- Diverse command sequences
- Error cases (missing files, invalid commands)
- Multi-step operations requiring memory
- File content persistence across commands
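A generator in that spirit can be sketched as follows: a tiny coded filesystem simulator emits ground-truth (command, output) pairs that become training messages. This is an illustrative reconstruction covering only a few commands, not the project's actual generation code.

```python
import random

# Illustrative sketch of synthetic-conversation generation: a real (coded)
# filesystem simulator produces ground-truth outputs for random commands,
# which become the training messages. Not the project's actual generator.
def generate_conversation(n_commands, seed=0):
    rng = random.Random(seed)
    files, messages = set(), []
    names = ["notes.txt", "data.csv", "script.sh", "report.md"]
    for _ in range(n_commands):
        op = rng.choice(["touch", "ls", "rm"])
        if op == "touch":
            name = rng.choice(names)
            files.add(name)
            cmd, out = f"touch {name}", ""
        elif op == "ls":
            cmd, out = "ls", " ".join(sorted(files))
        else:  # rm, including the error case for missing files
            name = rng.choice(names)
            if name in files:
                files.discard(name)
                cmd, out = f"rm {name}", ""
            else:
                cmd = f"rm {name}"
                out = f"rm: cannot remove '{name}': No such file or directory"
        messages.append({"role": "user", "content": cmd})
        messages.append({"role": "assistant", "content": out})
    return messages

conv = generate_conversation(40)
print(len(conv))  # 80: one user + one assistant message per command
```

Because the simulator tracks state in code, every training label is guaranteed correct - the model then has to reproduce that behavior from context alone.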
Why This Matters for AI Research
This model demonstrates that language models can learn complex stateful systems without explicit programming:
- No code execution - Pure pattern matching
- No external state - Everything in conversation context
- Learned behavior - Emergent filesystem simulation
- Generalization - Works on command combinations not in training
This has implications for:
- Building AI agents that can control software systems
- Creating natural language interfaces for complex tools
- Understanding how LLMs can learn to simulate computational processes
- Research into emergent capabilities in transformers
Known Limitations
Command Support
- Only 12 commands - no `vim`, `nano`, `find`, `sed`, `awk`, etc.
- No advanced features:
  - No pipes (`|`) or command chaining (`&&`, `;`)
  - No complex redirects (`>>`, `2>`)
  - No variables, loops, or conditionals
  - No shell scripting
Specific Issues
- File copying - `cp` occasionally fails to copy file content (it creates an empty file instead)
- Error handling - `rm` on non-existent files sometimes returns empty output instead of an error message
- Long conversations - after 50+ commands, state tracking may degrade
- Long filenames - names over 30 characters can cause parsing issues
Scope Constraints
- No actual execution - This is simulation, not a real terminal
- Requires full history - The model needs the entire conversation to track state
- Context limits - Very long sessions may exceed the model's context window
- Training distribution - Performance may drop on unusual command patterns
Use Cases
1. Linux Education
Interactive learning environment for teaching Linux commands without needing a real system:
```python
# Teaching tool that explains each command
# (get_explanation is a user-supplied helper, not part of LaaLM)
def educational_terminal(cmd):
    output = terminal.execute(cmd)
    print(f"Command: {cmd}")
    print(f"Output: {output}")
    print(f"Explanation: {get_explanation(cmd)}")
```
2. Shell Script Validation
Test scripts in simulation before running them:
```
$ echo "cp important.txt backup/" > backup.sh
$ cat backup.sh
cp important.txt backup/
```

(Note: append redirects like `>>` are not supported - see Known Limitations - so scripts must be written with a single `>` redirect.)
3. AI Agent Foundation
Use as a component in larger AI systems that need filesystem interaction:
```python
class AIAgent:
    def __init__(self):
        self.terminal = LinuxTerminal("model.gguf")

    def organize_files(self, task):
        # AI generates commands to organize files
        commands = self.plan_organization(task)
        for cmd in commands:
            self.terminal.execute(cmd)
```
4. Research Platform
Study how language models learn stateful behavior:
- Test emergent capabilities
- Analyze error patterns
- Investigate context length effects
- Explore state tracking mechanisms
5. Accessibility Interface
Natural language terminal for users unfamiliar with command-line:
```python
def natural_language_command(intent):
    # "create a file called notes" -> "touch notes.txt"
    # "show me what's here"        -> "ls"
    cmd = intent_to_command(intent)  # intent_to_command: user-supplied mapper
    return terminal.execute(cmd)
```
Project Lineage: LaaLM Evolution
LaaLM-v1 (State-Based Approach)
- Architecture: T5-base (220M parameters)
- Training data: 80,000 examples
- Method: External filesystem state tracking
- Approach: Model generates state transitions explicitly
LaaLM-exp-v1 (Current - Conversation-Based)
- Architecture: Qwen 2.5-3B-Instruct
- Training data: 800,000 messages (10,000 conversations)
- Method: Internal state tracking through conversation
- Approach: Model infers state from command history
LaaLM-v2 (Planned)
- Features: Bash scripting, pipes, command chaining
- Commands: Expanded command set (50+ commands)
- Capabilities: Variables, loops, conditionals
Best Practices for Inference
- Always use the proper system prompt format - don't skip it or modify it mid-conversation
- Set `temperature=0` - ensures deterministic, consistent outputs
- Enable `fix_mistral_regex=True` when using the tokenizer (for the transformers library)
- Maintain full conversation history - the model needs all previous commands to track state
- Limit `max_tokens` to ~150 - commands rarely need longer outputs
- Use greedy decoding (`do_sample=False`) for predictable behavior
- Start fresh for new sessions - don't reuse conversation context across unrelated tasks
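Because the model needs the full history and the context window is 2048 tokens, long sessions eventually run out of room. A rough budget guard can be sketched as below; the 4-characters-per-token ratio is a crude heuristic of my own (for exact counts, use `Llama.tokenize()` from llama-cpp-python), and the 150-token reserve mirrors the `max_tokens` recommendation above.

```python
# Rough context-budget guard for long sessions. The chars-per-token ratio
# is an approximation, not the model's real tokenizer.
def approx_tokens(text, chars_per_token=4):
    return len(text) // chars_per_token

def should_reset(conversation, n_ctx=2048, reserve=150, warn_ratio=0.9):
    """True when the conversation likely nears the context window.

    reserve keeps room for the next response (max_tokens=150 above).
    """
    budget = (n_ctx - reserve) * warn_ratio
    return approx_tokens(conversation) >= budget

# A long simulated session trips the guard; a fresh one does not.
convo = "User: pwd\nAssistant: /home/user\n" * 250
print(should_reset(convo))        # True
print(should_reset("User: pwd"))  # False
```

When the guard fires, start a new session with a fresh system prompt rather than truncating history, since the model cannot track state across dropped commands.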
Performance Tips
CPU Inference Optimization
```python
llm = Llama(
    model_path="exp-v1-Q4_K_M.gguf",
    n_ctx=2048,
    n_threads=8,      # Match your CPU cores
    n_batch=512,      # Batch size for prompt processing
    use_mlock=True,   # Lock model in RAM (prevents swapping)
    use_mmap=True,    # Memory-map the model file
    verbose=False
)
```
GPU Acceleration
```python
# Requires llama-cpp-python built with GPU support
llm = Llama(
    model_path="exp-v1-Q4_K_M.gguf",
    n_gpu_layers=32,  # Offload layers to GPU
    n_ctx=2048,
    verbose=False
)
```
Reducing Memory Usage
- Use lower quantizations (Q3_K_M or Q4_K_S)
- Reduce `n_ctx` if you don't need long conversations
- Decrease `n_batch` (trades speed for memory)
Frequently Asked Questions
Q: Can this actually execute commands on my system?
A: No! This is pure simulation. The model learned patterns of how Linux commands work, but it doesn't execute anything. It's completely safe.
Q: Why does it sometimes make mistakes?
A: The model learned from examples, not from actual code. It's doing pattern matching, so occasionally it makes incorrect predictions, especially with complex multi-step operations.
Q: Can I use this instead of a real terminal?
A: No - this is for learning, prototyping, and research. For actual file management, use a real terminal.
Q: How long can conversations be?
A: The model was trained on 30-50 command sequences. It can handle more, but accuracy may degrade after 50-60 commands or when approaching the context limit.
Q: Can I train it on more commands?
A: Yes! The original model (non-GGUF) can be fine-tuned further. See the main model card for training details.
Q: Which quantization should I use?
A: Start with Q4_K_M. If you need better quality and have RAM, try Q6_K. If you're resource-constrained, try Q3_K_M.
Q: Does it work with other GGUF tools?
A: Yes! Any GGUF-compatible inference engine should work (llama.cpp, Ollama, text-generation-webui, LM Studio, etc.)
Technical Specifications
Model Details
- Architecture: Qwen 2 (qwen2)
- Parameters: 3.09 billion (3085.9M)
- Model Class: AutoModelForCausalLM
- Base Model: Qwen/Qwen2.5-3B-Instruct
- Context Length: 2048 tokens (expandable)
- Vocabulary Size: 151,936 tokens
Quantization Details
- Format: GGUF (GPT-Generated Unified Format)
- Quantization Tool: llama.cpp
- Compatible Engines: llama.cpp, Ollama, llama-cpp-python, text-generation-webui, LM Studio, Koboldcpp, and more
Benchmark Environment
- Test Suite: 130 automated tests
- Categories: 6 (Basic, Creation, Operations, Content, Errors, Persistence)
- Evaluation Method: Exact output matching
- Unquantized Model Score: 95.4% (124/130 passed)
License
Apache 2.0 - Free for commercial and non-commercial use
Inherited from the Qwen 2.5 base model.
Links & Resources
- Original Model: LaaLM/LaaLM-exp-v1 (BFloat16 version)
- Author: LaaLM
- Project: Linux as a Language Model (LaaLM)
- llama.cpp: GitHub
- Ollama: Website
Acknowledgments
Built on Qwen 2.5-3B-Instruct by Alibaba Cloud. Quantized using llama.cpp by Georgi Gerganov and contributors.
Last Updated: January 22, 2026
Model Version: exp-v1
GGUF Quantizations: 12 variants (Q2_K through fp16)