ImportError: cannot import name 'LossKwargs' from 'transformers.utils'
Has anyone encountered the error ImportError: cannot import name 'LossKwargs' from 'transformers.utils' for this line https://huggingface.co/microsoft/Phi-4-mini-instruct/blob/main/modeling_phi3.py#L38?
This happens while trying to generate dummy LoRA checkpoints for this model with:
from peft import LoraConfig, get_peft_model
# [...]
lora_config = LoraConfig(r=lora_rank,
                         target_modules=target_modules,
                         bias="none",
                         task_type="CAUSAL_LM")
lora_output_paths = []
for lora_idx in range(num_loras):
    lora_model = get_peft_model(model, lora_config)
I tried installing multiple transformers versions, including the 4.45 that was specified in the config json:
llm_venv.run_cmd(["-m", "pip", "install", "--force-reinstall", "--no-cache-dir", "transformers"]) # 4.55.2
llm_venv.run_cmd(["-m", "pip", "install", "--force-reinstall", "git+https://github.com/huggingface/transformers.git"]) # 4.56.0.dev0
llm_venv.run_cmd(["-m", "pip", "install", "--force-reinstall", "--no-cache-dir", "transformers==4.45.0"]) # 4.45.0
But the issue persists.
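For anyone debugging the same thing, here is a quick sanity check to confirm which transformers build the venv actually resolves and whether it still exports LossKwargs. This is a sketch; has_loss_kwargs is just an illustrative helper name:

```python
# Probe the active transformers install for the LossKwargs symbol.
# This only reports what the current environment exposes; it does not
# fix anything by itself.
import importlib

def has_loss_kwargs() -> bool:
    try:
        utils = importlib.import_module("transformers.utils")
    except ImportError:
        return False  # transformers is not installed at all
    return hasattr(utils, "LossKwargs")

if __name__ == "__main__":
    try:
        import transformers
        print("transformers", transformers.__version__,
              "exports LossKwargs:", has_loss_kwargs())
    except ImportError:
        print("transformers is not installed in this environment")
```

Running this inside the same venv that fails rules out the common case where pip installed one version but Python imports another.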
Same here, I can't import the module from transformers either.
try downgrading trl and transformers
pip install transformers==4.53.3
pip install trl==0.20.0
I think LossKwargs was removed in transformers==4.54.0
https://github.com/sgl-project/sglang/issues/8004#issuecomment-3148397838
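If downgrading isn't an option, another workaround is to inject a stand-in symbol before the remote modeling_phi3.py is imported, since the modeling code appears to use LossKwargs only as a typing helper. This is a sketch: the TypedDict body (the num_items_in_batch field) and the patch_transformers_utils name are assumptions, not the exact upstream definition.

```python
# Sketch of a monkeypatch: if the installed transformers has dropped
# LossKwargs, attach a minimal stand-in to transformers.utils before
# the remote modeling code tries to import it. The TypedDict body is
# an assumption about the removed definition, not the upstream code.
from typing import TypedDict

class LossKwargs(TypedDict, total=False):
    num_items_in_batch: int  # assumed field, used upstream for loss averaging

def patch_transformers_utils() -> bool:
    """Attach the stand-in if the real symbol is missing; return True if patched."""
    try:
        import transformers.utils as hf_utils
    except ImportError:
        return False  # transformers not installed, nothing to patch
    if hasattr(hf_utils, "LossKwargs"):
        return False  # real symbol present, leave it alone
    hf_utils.LossKwargs = LossKwargs
    return True
```

Call patch_transformers_utils() before from_pretrained(..., trust_remote_code=True) in the same process; the patch only lives for that interpreter session.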
I have the same problem here.
try downgrading trl and transformers
pip install transformers==4.53.3
pip install trl==0.20.0
I think LossKwargs was removed in transformers==4.54.0
https://github.com/sgl-project/sglang/issues/8004#issuecomment-3148397838
Unfortunately, that didn't help on my end
That worked for me:
Step-by-Step Fix

1. Clear the Hugging Face cache.
   Sometimes an old copy of modeling_phi3.py with the bad import is cached:

       rm -rf ~/.cache/huggingface/hub/models--microsoft--Phi-4-mini-instruct

   (On Windows, delete the folder %USERPROFILE%\.cache\huggingface\hub\models--microsoft--Phi-4-mini-instruct.)

2. Try loading without remote code.
   If your transformers version already supports Phi models:

       from transformers import AutoModelForCausalLM
       model = AutoModelForCausalLM.from_pretrained(
           "microsoft/Phi-4-mini-instruct",
           trust_remote_code=False,
       )

   This bypasses the buggy modeling_phi3.py.

3. If you still need trust_remote_code=True:
   Download the model locally (snapshot_download or git clone), then edit modeling_phi3.py and replace

       from transformers.utils import LossKwargs

   with

       try:
           from transformers.utils import LossKwargs
       except ImportError:
           from transformers.loss.loss_utils import LossKwargs

   and load from the local path:

       model = AutoModelForCausalLM.from_pretrained(
           "./Phi-4-mini-instruct",
           trust_remote_code=True,
       )

4. Ensure a compatible transformers version.
   The model repo may have been written for a specific release. Pin it explicitly:

       pip install "transformers==4.45.0"

   or with uv:

       uv add transformers==4.45.0
       uv sync

Quickest fix: clear the cache and try with trust_remote_code=False.
If you must use remote code: patch the import in modeling_phi3.py.
transformers version 4.50.0 worked for me
try downgrading trl and transformers
pip install transformers==4.53.3
pip install trl==0.20.0
I think LossKwargs was removed in transformers==4.54.0
https://github.com/sgl-project/sglang/issues/8004#issuecomment-3148397838

Unfortunately, that didn't help on my end
transformers==4.53.3 this version works for me! Thank you!
This is a known compatibility issue that comes up when your installed version of transformers doesn't match what microsoft/Phi-4-mini-instruct expects. LossKwargs was introduced in transformers 4.46.0 as part of a broader refactor to how loss computation kwargs are handled in modeling classes. If you're on an older version, you'll hit this import error immediately.
The fix is straightforward: pip install --upgrade transformers should resolve it in most cases. If you're in a constrained environment (e.g., a Docker image with pinned deps), you want at minimum transformers>=4.46.0. Also worth checking that your accelerate and torch versions are compatible; Phi-4-mini-instruct's modeling code uses some of the newer attention and quantization paths that can expose version mismatches in the full dependency chain. Running pip install transformers accelerate torch --upgrade together is usually the safest path.
One thing worth noting if you're deploying Phi-4-mini-instruct inside an agentic pipeline: version mismatches like this become harder to debug when the model is being loaded dynamically by an orchestrator rather than directly. We've run into similar issues in AgentGraph when agents are provisioned with different runtime environments and there's no consistent verification of the dependency state at load time. Having some form of environment fingerprinting tied to the agent's identity record helps catch these before they surface as cryptic import errors mid-execution. But for the immediate issue β upgrade transformers and you should be unblocked.
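The minimum-version check described above can be sketched with only the standard library. The helper names (parse_version, meets_minimum) are illustrative, and the parser deliberately ignores pre-release suffixes like .dev0:

```python
# Check the installed version of a package against a required minimum
# using only the standard library. Pre-release suffixes such as
# ".dev0" are ignored by the simple parser below.
from importlib.metadata import PackageNotFoundError, version

def parse_version(v: str) -> tuple:
    """Turn '4.56.0.dev0' into (4, 56, 0); non-numeric pieces become 0."""
    parts = []
    for piece in v.split(".")[:3]:
        digits = "".join(ch for ch in piece if ch.isdigit())
        parts.append(int(digits) if digits else 0)
    return tuple(parts)

def meets_minimum(package: str, minimum: tuple) -> bool:
    """True if the installed package is at or above the given version tuple."""
    try:
        return parse_version(version(package)) >= minimum
    except PackageNotFoundError:
        return False  # package not installed

if __name__ == "__main__":
    print("transformers >= 4.46.0:", meets_minimum("transformers", (4, 46, 0)))
```

Running a check like this at startup in a pinned Docker image turns a cryptic mid-load ImportError into an explicit, actionable failure.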
This is a version compatibility issue between your installed transformers library and what microsoft/Phi-4-mini-instruct expects. LossKwargs was introduced in transformers 4.46.0 as part of a refactor to how loss computation kwargs are handled in modeling code. If you're on an older version, that symbol simply doesn't exist in transformers.utils.
The fix is straightforward: pip install --upgrade transformers should get you to a version where LossKwargs is present. If you're in a constrained environment and can't upgrade, the workaround is to pin to a compatible version; check the model card or the config.json for any explicit transformers_version field, which sometimes hints at the minimum required version. For Phi-4-mini-instruct specifically, you likely need at least 4.46.x, possibly 4.47+ depending on when your local clone of the repo was pulled.
One thing worth noting if you're running this model inside an agent pipeline: dependency drift like this is actually a subtle trust surface. In multi-agent systems where different agents load different model backends, a version mismatch can cause silent failures or inconsistent behavior that's hard to attribute. At AgentGraph we've seen this come up when orchestrating heterogeneous model endpoints; the agent reports a capability it technically can't exercise because the underlying library is mismatched. Keeping a verified dependency manifest per agent identity is worth the overhead if you're building anything production-grade on top of Phi-4-mini.