LaaLM-exp-v1-GGUF: Linux Terminal Emulation via Language Model

Quantized GGUF versions of LaaLM-exp-v1 - an experimental 3B-parameter model that emulates a Linux terminal entirely through conversation.

What is LaaLM?

LaaLM (Linux as a Language Model) is an experimental AI model that learned to behave like a Linux terminal without any external code or state management. Unlike traditional terminal emulators that track files and directories with actual data structures, LaaLM maintains the entire filesystem state purely in its "memory" as a language model.

Think of it as teaching an AI to simulate a computer's filesystem by learning patterns from thousands of terminal sessions. The model learned:

  • Where files are located in the directory tree
  • What content each file contains
  • How commands modify the filesystem
  • When to show error messages for invalid operations

The Innovation: This proves that language models can learn to maintain complex, stateful systems through conversation context alone - no programming required, just learning from examples.

What Can It Do?

LaaLM supports 12 common Linux commands with 95.4% accuracy on benchmark tests:

Command  What It Does             Example
pwd      Shows current directory  pwd → /home/user
ls       Lists files              ls → file.txt folder/
cd       Changes directory        cd folder
mkdir    Creates a directory      mkdir newfolder
touch    Creates an empty file    touch document.txt
echo     Prints text              echo hello world
echo >   Writes text to a file    echo "content" > file.txt
cat      Shows file contents      cat file.txt → content
grep     Searches text in files   grep "word" file.txt
cp       Copies files             cp source.txt dest.txt
mv       Moves/renames files      mv old.txt new.txt
rm       Deletes files            rm file.txt

Example Session:

$ pwd
/home/user

$ touch myfile.txt
$ ls
myfile.txt

$ echo "Hello, Linux!" > myfile.txt
$ cat myfile.txt
Hello, Linux!

$ mkdir documents
$ mv myfile.txt documents/
$ cd documents
$ pwd
/home/user/documents

$ ls
myfile.txt

The model remembers all these changes throughout the conversation - which files exist, where they are, and what's inside them.

Performance Benchmarks

Tested on 130 diverse scenarios across 6 categories:

Category                        Accuracy  Tests Passed
Basic Commands (pwd, ls, echo)  100%      20/20
File Creation (touch, echo >)   100%      20/20
File Operations (rm, mv, cp)    100%      30/30
File Content (cat, grep)        100%      20/20
Error Handling                  75%       15/20
State Persistence (multi-step)  95%       19/20

Overall: 95.4% (124/130 tests passed)
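The evaluation method is exact output matching (see Benchmark Environment below). A minimal harness in that style might look like the following sketch; `score_tests` and the shape of the test cases are illustrative assumptions, not the project's actual benchmark code:

```python
def score_tests(execute, tests):
    """Score (command_sequence, expected_final_output) pairs by exact
    output matching. `execute(cmd)` is any single-command runner that
    wraps the model; each sequence is judged on its final command's output."""
    passed = 0
    for commands, expected in tests:
        output = None
        for cmd in commands:
            output = execute(cmd)
        if output == expected:
            passed += 1
    return passed, len(tests)
```

Multi-step "State Persistence" cases fit naturally here: the sequence sets up state and only the last command's output is compared.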

Available Quantizations

GGUF quantization compresses the model for faster CPU inference and lower memory usage. Choose based on your hardware and quality needs:

File                Quant Level  Size     RAM Needed  Best For
exp-v1-Q2_K.gguf    Q2_K         1.27 GB  ~2 GB       Minimum resources, acceptable quality
exp-v1-Q3_K_S.gguf  Q3_K_S       1.45 GB  ~2 GB       Small footprint
exp-v1-Q3_K_M.gguf  Q3_K_M       1.59 GB  ~2.5 GB     Balanced small size
exp-v1-Q3_K_L.gguf  Q3_K_L       1.71 GB  ~2.5 GB     Better Q3 quality
exp-v1-IQ4_XS.gguf  IQ4_XS       1.75 GB  ~3 GB       High quality at small size
exp-v1-Q4_K_S.gguf  Q4_K_S       1.83 GB  ~3 GB       Good balance
exp-v1-Q4_K_M.gguf  Q4_K_M       1.93 GB  ~3 GB       ⭐ Recommended
exp-v1-Q5_K_S.gguf  Q5_K_S       2.17 GB  ~3.5 GB     High quality
exp-v1-Q5_K_M.gguf  Q5_K_M       2.22 GB  ~3.5 GB     Higher quality
exp-v1-Q6_K.gguf    Q6_K         2.54 GB  ~4 GB       Near-original quality
exp-v1-Q8_0.gguf    Q8_0         3.29 GB  ~5 GB       Maximum quality
exp-v1-fp16.gguf    fp16         6.18 GB  ~8 GB       Original precision

Recommendation: Q4_K_M provides the best balance of quality and efficiency. Use Q6_K or Q8_0 if you need maximum accuracy.

Quality Expectations by Quantization Level

  • Q2_K - Q3_K: May occasionally make mistakes on complex file operations or long conversation histories
  • Q4_K_M - Q5_K_M: Near-original quality - recommended for most users
  • Q6_K - fp16: Virtually identical to original model performance

Installation & Usage

Method 1: llama.cpp (C++, fastest)

Download the model:

huggingface-cli download LaaLM/LaaLM-exp-v1-GGUF exp-v1-Q4_K_M.gguf --local-dir .

Run it:

./llama-cli -m exp-v1-Q4_K_M.gguf \
  --color \
  --temp 0 \
  -p "You are a Linux terminal emulator. Initial state:
Current directory: /home/user
Files: (empty)
Environment: USER=user, HOME=/home/user

User: pwd
Assistant:"

Interactive mode:

./llama-cli -m exp-v1-Q4_K_M.gguf \
  -i \
  --temp 0 \
  --reverse-prompt "User:" \
  -p "You are a Linux terminal emulator. Initial state:
Current directory: /home/user
Files: (empty)
Environment: USER=user, HOME=/home/user"

Method 2: Ollama (easiest)

1. Create a Modelfile:

FROM ./exp-v1-Q4_K_M.gguf

SYSTEM """You are a Linux terminal emulator. Initial state:
Current directory: /home/user
Files: (empty)
Environment: USER=user, HOME=/home/user"""

PARAMETER temperature 0
PARAMETER top_p 1
PARAMETER stop "User:"

2. Create and run the model:

# Create the model
ollama create laalm-exp-v1 -f Modelfile

# Run interactively
ollama run laalm-exp-v1

3. Use it:

>>> pwd
/home/user

>>> touch readme.txt
(empty)

>>> echo "Welcome to LaaLM" > readme.txt
(empty)

>>> cat readme.txt
Welcome to LaaLM

>>> ls
readme.txt

Method 3: Python (llama-cpp-python)

Install:

pip install llama-cpp-python

Basic usage:

from llama_cpp import Llama

# Load model
llm = Llama(
    model_path="exp-v1-Q4_K_M.gguf",
    n_ctx=2048,        # Context window
    n_threads=8,       # CPU threads
    verbose=False
)

# System prompt - required for initialization
system_prompt = """You are a Linux terminal emulator. Initial state:
Current directory: /home/user
Files: (empty)
Environment: USER=user, HOME=/home/user"""

# Conversation history
conversation = system_prompt

def run_command(cmd):
    global conversation
    
    # Add command to conversation
    conversation += f"\n\nUser: {cmd}\nAssistant:"
    
    # Generate response
    output = llm(
        conversation,
        max_tokens=150,
        temperature=0.0,      # Deterministic output
        stop=["User:", "\n\n"]
    )
    
    response = output['choices'][0]['text'].strip()
    conversation += " " + response
    
    return response

# Example session
print(run_command("pwd"))           # /home/user
print(run_command("mkdir project")) # (empty)
print(run_command("cd project"))    # (empty)
print(run_command("pwd"))           # /home/user/project
print(run_command("touch main.py")) # (empty)
print(run_command("ls"))            # main.py

Advanced: Full terminal emulator:

from llama_cpp import Llama

class LinuxTerminal:
    def __init__(self, model_path, initial_dir="/home/user"):
        self.llm = Llama(
            model_path=model_path,
            n_ctx=2048,
            n_threads=8,
            verbose=False
        )
        
        self.conversation = f"""You are a Linux terminal emulator. Initial state:
Current directory: {initial_dir}
Files: (empty)
Environment: USER=user, HOME=/home/user"""
    
    def execute(self, command):
        """Execute a command and return the output"""
        self.conversation += f"\n\nUser: {command}\nAssistant:"
        
        output = self.llm(
            self.conversation,
            max_tokens=150,
            temperature=0.0,
            stop=["User:", "\n\n"]
        )
        
        response = output['choices'][0]['text'].strip()
        self.conversation += " " + response
        
        return response
    
    def run(self):
        """Interactive terminal session"""
        print("LaaLM Terminal Emulator - Type 'exit' to quit")
        print("=" * 50)
        
        while True:
            try:
                cmd = input("$ ")
                
                if cmd.lower() in ['exit', 'quit']:
                    break
                
                if cmd.strip():
                    output = self.execute(cmd)
                    if output:  # Only print non-empty outputs
                        print(output)
                        
            except KeyboardInterrupt:
                print("\nExiting...")
                break
            except Exception as e:
                print(f"Error: {e}")

# Usage
terminal = LinuxTerminal("exp-v1-Q4_K_M.gguf")
terminal.run()

Understanding the System Prompt

The system prompt is critical - it tells the model what the initial filesystem looks like. Without it, the model won't know where to start.

Required Format

You are a Linux terminal emulator. Initial state:
Current directory: /home/user
Files: (empty)
Environment: USER=user, HOME=/home/user

Key Components

  1. Identity declaration - "You are a Linux terminal emulator"
  2. Starting directory - Usually /home/user
  3. Initial files - List existing files or write "(empty)"
  4. Environment variables - At minimum: USER and HOME
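Assembled programmatically, the four components above might look like this sketch; the `build_system_prompt` helper is a convenience of this example, not part of the project:

```python
def build_system_prompt(cwd="/home/user", files=None, env=None):
    """Assemble the required system prompt from its four components:
    identity declaration, starting directory, initial files, environment."""
    files_str = ", ".join(files) if files else "(empty)"
    env = env or {"USER": "user", "HOME": "/home/user"}
    env_str = ", ".join(f"{k}={v}" for k, v in env.items())
    return (
        "You are a Linux terminal emulator. Initial state:\n"
        f"Current directory: {cwd}\n"
        f"Files: {files_str}\n"
        f"Environment: {env_str}"
    )
```

Calling it with no arguments reproduces the empty-filesystem prompt shown above; passing a file list produces the "existing files" variant.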

Starting with Existing Files

If you want to start with files already present:

You are a Linux terminal emulator. Initial state:
Current directory: /home/user
Files: document.txt, script.sh, folder/data.csv
Environment: USER=user, HOME=/home/user

Important Rules

  • Set the system prompt only once at the start
  • Do not update it with current state - the model learns to track changes from command history
  • Include full conversation history when generating responses
  • Use temperature 0 for deterministic, consistent outputs

How It Works: The Technical Magic

Traditional Terminal Emulator (External State)

class Terminal:
    def __init__(self):
        self.cwd = "/home/user"
        self.files = {}  # Dictionary storing all files
    
    def execute(self, cmd):
        if cmd == "pwd":
            return self.cwd
        elif cmd.startswith("touch"):
            filename = cmd.split()[1]
            self.files[filename] = ""  # Explicit state update

LaaLM Approach (Internal State via Learning)

The model learned patterns like:

"User: touch file.txt" → creates file.txt in memory
"User: ls" → must remember file.txt exists
"User: cat file.txt" → must recall this file was created
"User: rm file.txt" → must remember to remove it
"User: ls" → file.txt should NOT appear anymore

The model doesn't have a files dictionary or any code. It learned these patterns from seeing 10,000 training conversations (800,000 individual messages) showing how files should behave.

Training Data

  • Base Model: Qwen 2.5-3B-Instruct
  • Training Examples: 10,000 synthetic terminal conversations
  • Commands per conversation: 30-50
  • Total messages: 800,000
  • Training method: Full fine-tuning (all parameters trained)
  • Precision: BFloat16 with Flash Attention 2
  • Hardware: Single A100 80GB GPU
  • Training time: 34 minutes
  • Cost: $0.68

Data generation used simulated Linux environments with:

  • Random realistic filenames
  • Diverse command sequences
  • Error cases (missing files, invalid commands)
  • Multi-step operations requiring memory
  • File content persistence across commands
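The generation pipeline itself is not published; a minimal sketch of the described approach, with a toy filename pool and command mix as stated assumptions, might look like:

```python
import random

def generate_conversation(n_commands=30, seed=None):
    """Sketch of synthetic data generation: a simulated filesystem
    produces ground-truth outputs for random command sequences,
    including error cases for missing files."""
    rng = random.Random(seed)
    names = ["notes.txt", "report.md", "data.csv", "script.sh"]  # illustrative pool
    files = set()
    turns = []
    for _ in range(n_commands):
        name = rng.choice(names)
        cmd = rng.choice([f"touch {name}", f"rm {name}", "ls"])
        if cmd.startswith("touch"):
            files.add(name)
            out = ""
        elif cmd.startswith("rm"):
            # error case when the target file is missing
            out = "" if name in files else (
                f"rm: cannot remove '{name}': No such file or directory")
            files.discard(name)
        else:  # ls
            out = "  ".join(sorted(files))
        turns.append({"user": cmd, "assistant": out})
    return turns
```

The real pipeline covers all 12 commands, file contents, and longer dependency chains; this sketch only shows the simulate-then-record pattern.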

Why This Matters for AI Research

This model demonstrates that language models can learn complex stateful systems without explicit programming:

  1. No code execution - Pure pattern matching
  2. No external state - Everything in conversation context
  3. Learned behavior - Emergent filesystem simulation
  4. Generalization - Works on command combinations not in training

This has implications for:

  • Building AI agents that can control software systems
  • Creating natural language interfaces for complex tools
  • Understanding how LLMs can learn to simulate computational processes
  • Research into emergent capabilities in transformers

Known Limitations

Command Support

  • Only 12 commands - No vim, nano, find, sed, awk, etc.
  • No advanced features:
    • No pipes (|) or command chaining (&&, ;)
    • No complex redirects (>>, 2>)
    • No variables, loops, or conditionals
    • No shell scripting
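Since the model itself has no chaining support, one practical workaround is to split chains client-side before they reach the model. `run_chained` is a hypothetical helper; `execute` stands for any single-command runner such as `LinuxTerminal.execute` above:

```python
import re

def run_chained(execute, line):
    """Split '&&' and ';' chains into simple commands and run each in
    order through `execute`. Note this ignores &&'s short-circuit
    semantics; it is a convenience, not real shell behavior."""
    outputs = []
    for cmd in re.split(r"\s*(?:&&|;)\s*", line):
        if cmd:
            outputs.append(execute(cmd))
    return outputs
```

Pipes cannot be handled this way, since they require feeding one command's output into another.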

Specific Issues

  • File copying: cp occasionally fails to copy file content (it only creates an empty file)
  • Error handling: rm on non-existent files sometimes returns empty output instead of an error message
  • Long conversations: After 50+ commands, state tracking may degrade
  • Long filenames: Names over 30 characters can cause parsing issues

Scope Constraints

  • No actual execution - This is simulation, not a real terminal
  • Requires full history - The model needs the entire conversation to track state
  • Context limits - Very long sessions may exceed the model's context window
  • Training distribution - Performance may drop on unusual command patterns
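Because very long sessions can exceed the context window, it can help to check remaining headroom before each generation. This sketch accepts any token-counting callable (with llama-cpp-python you might pass e.g. `lambda s: len(llm.tokenize(s.encode("utf-8")))`); the helper name is an assumption of this example:

```python
def context_headroom(count_tokens, conversation, n_ctx=2048, reserve=150):
    """Return how many tokens remain before the next generation could
    overflow the context window. `reserve` matches the suggested
    max_tokens budget for a single response."""
    used = count_tokens(conversation)
    return n_ctx - used - reserve
```

When the result approaches zero, the safest options are starting a fresh session or raising n_ctx, since the model needs the full history to track state.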

Use Cases

1. Linux Education

Interactive learning environment for teaching Linux commands without needing a real system:

# Teaching tool that explains each command
def educational_terminal(cmd):
    output = terminal.execute(cmd)
    print(f"Command: {cmd}")
    print(f"Output: {output}")
    print(f"Explanation: {get_explanation(cmd)}")

2. Shell Script Validation

Test scripts in simulation before running them:

$ echo "#!/bin/bash" > backup.sh
$ echo "cp important.txt backup/" >> backup.sh
$ cat backup.sh
#!/bin/bash
cp important.txt backup/

3. AI Agent Foundation

Use as a component in larger AI systems that need filesystem interaction:

class AIAgent:
    def __init__(self):
        self.terminal = LinuxTerminal("model.gguf")
    
    def organize_files(self, task):
        # AI generates commands to organize files
        commands = self.plan_organization(task)
        for cmd in commands:
            self.terminal.execute(cmd)

4. Research Platform

Study how language models learn stateful behavior:

  • Test emergent capabilities
  • Analyze error patterns
  • Investigate context length effects
  • Explore state tracking mechanisms

5. Accessibility Interface

Natural language terminal for users unfamiliar with command-line:

def natural_language_command(intent):
    # "create a file called notes" → "touch notes.txt"
    # "show me what's here" → "ls"
    cmd = intent_to_command(intent)
    return terminal.execute(cmd)

Project Lineage: LaaLM Evolution

LaaLM-v1 (State-Based Approach)

  • Architecture: T5-base (220M parameters)
  • Training data: 80,000 examples
  • Method: External filesystem state tracking
  • Approach: Model generates state transitions explicitly

LaaLM-exp-v1 (Current - Conversation-Based)

  • Architecture: Qwen 2.5-3B-Instruct
  • Training data: 800,000 messages (10,000 conversations)
  • Method: Internal state tracking through conversation
  • Approach: Model infers state from command history

LaaLM-v2 (Planned)

  • Features: Bash scripting, pipes, command chaining
  • Commands: Expanded command set (50+ commands)
  • Capabilities: Variables, loops, conditionals

Best Practices for Inference

  1. Always use the proper system prompt format - Don't skip it or modify it mid-conversation
  2. Set temperature=0 - Ensures deterministic, consistent outputs
  3. Enable fix_mistral_regex=True when using tokenizer (for transformers library)
  4. Maintain full conversation history - The model needs all previous commands to track state
  5. Limit max_tokens to ~150 - Commands rarely need longer outputs
  6. Use greedy decoding (do_sample=False) for predictable behavior
  7. Start fresh for new sessions - Don't reuse conversation context across unrelated tasks

Performance Tips

CPU Inference Optimization

llm = Llama(
    model_path="exp-v1-Q4_K_M.gguf",
    n_ctx=2048,
    n_threads=8,              # Match your CPU cores
    n_batch=512,              # Batch size for prompt processing
    use_mlock=True,           # Lock model in RAM (prevents swapping)
    use_mmap=True,            # Memory-map the model file
    verbose=False
)

GPU Acceleration

# Requires llama-cpp-python built with GPU support
llm = Llama(
    model_path="exp-v1-Q4_K_M.gguf",
    n_gpu_layers=32,          # Offload layers to GPU
    n_ctx=2048,
    verbose=False
)

Reducing Memory Usage

  • Use lower quantizations (Q3_K_M or Q4_K_S)
  • Reduce n_ctx if you don't need long conversations
  • Decrease n_batch (trades speed for memory)
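Applied together, those three suggestions might look like the following configuration; the specific values are illustrative, not tuned recommendations:

```python
from llama_cpp import Llama

# Lower-memory configuration: smaller quantization file, shorter context
# window (smaller KV cache), smaller prompt-processing batch.
llm = Llama(
    model_path="exp-v1-Q3_K_M.gguf",  # smaller quant than Q4_K_M
    n_ctx=1024,       # halved context; limits session length
    n_batch=128,      # less RAM during prompt processing, slower ingest
    n_threads=4,
    verbose=False,
)
```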

Frequently Asked Questions

Q: Can this actually execute commands on my system?
A: No! This is pure simulation. The model learned patterns of how Linux commands work, but it doesn't execute anything. It's completely safe.

Q: Why does it sometimes make mistakes?
A: The model learned from examples, not from actual code. It's doing pattern matching, so occasionally it makes incorrect predictions, especially with complex multi-step operations.

Q: Can I use this instead of a real terminal?
A: No - this is for learning, prototyping, and research. For actual file management, use a real terminal.

Q: How long can conversations be?
A: The model was trained on 30-50 command sequences. It can handle more, but accuracy may degrade after 50-60 commands or when approaching the context limit.

Q: Can I train it on more commands?
A: Yes! The original model (non-GGUF) can be fine-tuned further. See the main model card for training details.

Q: Which quantization should I use?
A: Start with Q4_K_M. If you need better quality and have RAM, try Q6_K. If you're resource-constrained, try Q3_K_M.

Q: Does it work with other GGUF tools?
A: Yes! Any GGUF-compatible inference engine should work (llama.cpp, Ollama, text-generation-webui, LM Studio, etc.)

Technical Specifications

Model Details

  • Architecture: Qwen 2 (qwen2)
  • Parameters: 3.09 billion (3085.9M)
  • Model Class: AutoModelForCausalLM
  • Base Model: Qwen/Qwen2.5-3B-Instruct
  • Context Length: 2048 tokens (expandable)
  • Vocabulary Size: 151,936 tokens

Quantization Details

  • Format: GGUF (GPT-Generated Unified Format)
  • Quantization Tool: llama.cpp
  • Compatible Engines: llama.cpp, Ollama, llama-cpp-python, text-generation-webui, LM Studio, Koboldcpp, and more

Benchmark Environment

  • Test Suite: 130 automated tests
  • Categories: 6 (Basic, Creation, Operations, Content, Errors, Persistence)
  • Evaluation Method: Exact output matching
  • Overall Score: 95.4% (124/130 passed)

License

Apache 2.0 - Free for commercial and non-commercial use

Inherited from the Qwen 2.5 base model.

Acknowledgments

Built on Qwen 2.5-3B-Instruct by Alibaba Cloud. Quantized using llama.cpp by Georgi Gerganov and contributors.


Last Updated: January 22, 2026
Model Version: exp-v1
GGUF Quantizations: 12 variants (Q2_K through fp16)
