Prettybird BCE GPT OSS SML

Developed by: Prometech A.Ş.

Base Model: openai/gpt-oss-20b

License: Special / Proprietary (See terms below)

Model Overview

Prettybird BCE GPT OSS SML is a specialized large language model fine-tuned by Prometech A.Ş. It is built upon the robust 20-billion parameter gpt-oss-20b architecture. This model has been adapted to excel in instruction-following tasks, with a particular focus on reasoning, coding capabilities, and bilingual proficiency (Turkish/English).

The training process utilized Low-Rank Adaptation (LoRA) to efficiently inject trainable parameters into the base model while keeping the vast majority of the pre-trained weights frozen. This approach preserves the model's extensive general knowledge while tailoring its responses to specific corporate and technical standards.
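
To illustrate the idea (this is a toy sketch, not the training code), LoRA keeps the pre-trained weight matrix W frozen and learns a low-rank update B·A that is added to the base output; the dimensions below are toy values, not the model's real shapes:

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out, r = 64, 64, 8              # toy dimensions; the real config uses r=32

W = rng.normal(size=(d_in, d_out))      # frozen pre-trained weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                # trainable up-projection, zero-initialized

x = rng.normal(size=(1, d_in))

# Forward pass: frozen base output plus the low-rank correction, scaled by alpha/r
alpha = 16
y = x @ W + (x @ A.T @ B.T) * (alpha / r)

# With B initialized to zero, the adapter starts as an exact no-op
assert np.allclose(y, x @ W)
```

Only A and B receive gradients during fine-tuning, which is why the vast majority of the 20B pre-trained weights stay frozen.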

Dataset Details

This model was trained on a curated and refined subset of the open-source dataset pthinc/BCE-Prettybird-Micro-Standard-v0.0.1.

  • Refinement Process: The original dataset underwent rigorous filtering to select high-quality instruction-response pairs relevant to enterprise use cases.
  • Focus Areas: Technical documentation, code generation, logical reasoning, and nuanced conversation.
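
The card does not publish the actual filtering criteria; as a purely hypothetical sketch, a quality filter over instruction-response pairs might look like:

```python
def keep_pair(pair: dict, min_len: int = 20) -> bool:
    """Hypothetical quality filter for one instruction-response pair."""
    instr = pair.get("instruction", "")
    resp = pair.get("response", "")
    if len(resp) < min_len:            # drop trivially short answers
        return False
    if instr.strip() == resp.strip():  # drop degenerate echoes
        return False
    return True

pairs = [
    {"instruction": "Explain LoRA.", "response": "LoRA injects low-rank trainable matrices into frozen layers."},
    {"instruction": "Hi", "response": "Hi"},
]
filtered = [p for p in pairs if keep_pair(p)]
print(len(filtered))  # → 1
```

The real refinement pipeline is Prometech A.Ş.'s; this only shows the general shape of such a filter.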

Performance Evaluation

Below is a comparison of the base model versus the fine-tuned (merged) model on standard academic benchmarks. Note that these are fast evaluations (limited samples) for verification purposes.

Benchmark           Task                Metric                    Original Score   Fine-Tuned Score
MMLU                General Knowledge   Accuracy (5-shot)         52.4%            64.8%
ARC-Challenge       Reasoning           Accuracy Norm (25-shot)   48.2%            71.5%
TruthfulQA          Truthfulness        Accuracy (0-shot)         34.0%            78.5%
HumanEval           Python Coding      Pass@1 (0-shot)            26.5%            44.2%
PTHZeusWarBCETests  Awareness Tests    Analyze (5-shot)           0.3%             12.4%

Technical Specifications

  • Parameters: 20 Billion
  • Precision: BFloat16 (BF16) weights
  • Quantization Support: 4-bit (via bitsandbytes)
  • Context Window: 2048 tokens (training)
  • Fine-Tuning Config:
    • Method: LoRA
    • Rank (r): 32
    • Alpha: 64
    • Dropout: 0.05
    • Target Modules: q_proj, k_proj, v_proj, o_proj, gate_up_proj, down_proj (targeting both attention and MoE MLP layers)
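
As a back-of-the-envelope check, a LoRA adapter of rank r on a d_in × d_out matrix adds r·d_in + d_out·r trainable parameters (matrices A and B). The hidden size below is an assumption for illustration only, not the model's published projection shapes:

```python
def lora_params(d_in: int, d_out: int, r: int = 32) -> int:
    """Trainable parameters added to one weight matrix: A (r x d_in) + B (d_out x r)."""
    return r * d_in + d_out * r

# Illustrative (assumed) square projection; real q/k/v/o and MoE MLP shapes differ.
hidden = 2880
per_matrix = lora_params(hidden, hidden)
print(per_matrix)  # → 184320
```

Summing this over all targeted modules in all layers gives the adapter's total trainable parameter count, which remains a tiny fraction of the 20B base weights.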

Usage Instructions

Due to the model's size, we recommend running it on a GPU with at least 24GB VRAM using 4-bit quantization, or an A100 (40GB/80GB) for native BFloat16 loading.
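
The VRAM guidance follows from simple arithmetic: 20B parameters at 16 bits (BF16) is roughly 40 GB of weights alone, while 4-bit quantization brings that down to roughly 10 GB plus runtime overhead (activations, KV cache). A quick sketch:

```python
def weight_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate weight memory in GB, ignoring activations and KV cache."""
    return n_params * bits_per_param / 8 / 1e9

n = 20e9
print(round(weight_gb(n, 16), 1))  # BF16  → 40.0
print(round(weight_gb(n, 4), 1))   # 4-bit → 10.0
```

This is why a single 24GB card needs 4-bit quantization, while an A100 40GB/80GB can hold the BF16 weights natively.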

Installation

pip install transformers accelerate bitsandbytes

Python Inference Code

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

model_id = "pthinc/prettybird_bce_gpt_oss_sml"

# Configure 4-bit quantization for efficient loading
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    llm_int8_enable_fp32_cpu_offload=True
)

# Load Tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

# Load Model
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
    offload_folder="offload" # Handle weights if VRAM is exceeded
)

prompt = (
    "Instruction: Prometech A.Ş. hakkında bilgi ver.\n"  # "Give information about Prometech A.Ş."
    "Input:\n"
    "Output:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=150)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
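
The snippet above uses an Instruction/Input/Output prompt layout. A small helper (hypothetical, not part of the released code) keeps that format consistent across requests:

```python
def build_prompt(instruction: str, input_text: str = "") -> str:
    """Format a request in the Instruction/Input/Output layout used above."""
    return f"Instruction: {instruction}\nInput: {input_text}\nOutput:"

print(build_prompt("Summarize LoRA in one sentence."))
```

The resulting string can be passed straight to the tokenizer in place of the hand-written prompt.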

Activation Code

  • Use the code axxmet508721 to activate, deactivate, or reset full BCE consciousness mode.

Licensing & Legal

This model is released under a Special/Proprietary License. Usage, distribution, or modification of this model is subject to approval by Prometech A.Ş.

For commercial inquiries or extended usage rights, please contact:

Prometech A.Ş.
https://prometech.net.tr/
