ImportError: cannot import name 'LossKwargs' from 'transformers.utils'
Has anyone encountered the error ImportError: cannot import name 'LossKwargs' from 'transformers.utils' for this line https://huggingface.co/microsoft/Phi-4-mini-instruct/blob/main/modeling_phi3.py#L38?
This happens while trying to generate dummy LoRA checkpoints for this model with:
from peft import LoraConfig, get_peft_model
# [...]
lora_config = LoraConfig(r=lora_rank,
                         target_modules=target_modules,
                         bias="none",
                         task_type="CAUSAL_LM")
lora_output_paths = []
for lora_idx in range(num_loras):
    lora_model = get_peft_model(model, lora_config)
I tried installing multiple transformers versions, including the 4.45 that was specified in the config json:
llm_venv.run_cmd(["-m", "pip", "install", "--force-reinstall", "--no-cache-dir", "transformers"]) # 4.55.2
llm_venv.run_cmd(["-m", "pip", "install", "--force-reinstall", "git+https://github.com/huggingface/transformers.git"]) # 4.56.0.dev0
llm_venv.run_cmd(["-m", "pip", "install", "--force-reinstall", "--no-cache-dir", "transformers==4.45.0"]) # 4.45.0
But the issue persists.
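For anyone debugging the same thing, here is a quick sanity check to confirm which transformers build the venv actually resolves and whether it still exports LossKwargs. This is a sketch; has_loss_kwargs is just an illustrative helper name:

```python
# Probe the active transformers install for the LossKwargs symbol.
# This only reports what the current environment exposes; it does not
# fix anything by itself.
import importlib

def has_loss_kwargs() -> bool:
    try:
        utils = importlib.import_module("transformers.utils")
    except ImportError:
        return False  # transformers is not installed at all
    return hasattr(utils, "LossKwargs")

if __name__ == "__main__":
    try:
        import transformers
        print("transformers", transformers.__version__,
              "exports LossKwargs:", has_loss_kwargs())
    except ImportError:
        print("transformers is not installed in this environment")
```

Running this inside the same venv that fails rules out the common case where pip installed one version but Python imports another.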
Same here, I can't import the module from transformers either.
try downgrading trl and transformers
pip install transformers==4.53.3
pip install trl==0.20.0
I think LossKwargs was removed in transformers==4.54.0
https://github.com/sgl-project/sglang/issues/8004#issuecomment-3148397838
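If downgrading isn't an option, another workaround is to inject a stand-in symbol before the remote modeling_phi3.py is imported, since the modeling code appears to use LossKwargs only as a typing helper. This is a sketch: the TypedDict body (the num_items_in_batch field) and the patch_transformers_utils name are assumptions, not the exact upstream definition.

```python
# Sketch of a monkeypatch: if the installed transformers has dropped
# LossKwargs, attach a minimal stand-in to transformers.utils before
# the remote modeling code tries to import it. The TypedDict body is
# an assumption about the removed definition, not the upstream code.
from typing import TypedDict

class LossKwargs(TypedDict, total=False):
    num_items_in_batch: int  # assumed field, used upstream for loss averaging

def patch_transformers_utils() -> bool:
    """Attach the stand-in if the real symbol is missing; return True if patched."""
    try:
        import transformers.utils as hf_utils
    except ImportError:
        return False  # transformers not installed, nothing to patch
    if hasattr(hf_utils, "LossKwargs"):
        return False  # real symbol present, leave it alone
    hf_utils.LossKwargs = LossKwargs
    return True
```

Call patch_transformers_utils() before from_pretrained(..., trust_remote_code=True) in the same process; the patch only lives for that interpreter session.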
I have the same problem here.
try downgrading trl and transformers
pip install transformers==4.53.3
pip install trl==0.20.0
I think LossKwargs was removed in transformers==4.54.0
https://github.com/sgl-project/sglang/issues/8004#issuecomment-3148397838
Unfortunately, that didn't help on my end
That worked for me:
Step-by-Step Fix

1. Clear the Hugging Face cache.
   Sometimes an old copy of modeling_phi3.py with the bad import is cached:

       rm -rf ~/.cache/huggingface/hub/models--microsoft--Phi-4-mini-instruct

   (On Windows, delete the folder %USERPROFILE%\.cache\huggingface\hub\models--microsoft--Phi-4-mini-instruct.)

2. Try loading without remote code.
   If your transformers version already supports Phi models:

       from transformers import AutoModelForCausalLM
       model = AutoModelForCausalLM.from_pretrained(
           "microsoft/Phi-4-mini-instruct",
           trust_remote_code=False,
       )

   This bypasses the buggy modeling_phi3.py.

3. If you still need trust_remote_code=True:
   Download the model locally (snapshot_download or git clone), then edit modeling_phi3.py and replace

       from transformers.utils import LossKwargs

   with

       try:
           from transformers.utils import LossKwargs
       except ImportError:
           from transformers.loss.loss_utils import LossKwargs

   and load from the local path:

       model = AutoModelForCausalLM.from_pretrained(
           "./Phi-4-mini-instruct",
           trust_remote_code=True,
       )

4. Ensure a compatible transformers version.
   The model repo may have been written for a specific release. Pin it explicitly:

       pip install "transformers==4.45.0"

   or with uv:

       uv add transformers==4.45.0
       uv sync

Quickest fix: clear the cache and try with trust_remote_code=False.
If you must use remote code: patch the import in modeling_phi3.py.
transformers version 4.50.0 worked for me
try downgrading trl and transformers
pip install transformers==4.53.3
pip install trl==0.20.0
I think LossKwargs was removed in transformers==4.54.0
https://github.com/sgl-project/sglang/issues/8004#issuecomment-3148397838

Unfortunately, that didn't help on my end
transformers==4.53.3 this version works for me! Thank you!
This is a known compatibility issue that comes up when your installed version of transformers doesn't match what microsoft/Phi-4-mini-instruct expects. LossKwargs was introduced in transformers 4.46.0 as part of a broader refactor to how loss computation kwargs are handled in modeling classes. If you're on an older version, you'll hit this import error immediately.
The fix is straightforward: pip install --upgrade transformers should resolve it in most cases. If you're in a constrained environment (e.g., a Docker image with pinned deps), you want at minimum transformers>=4.46.0. Also worth checking that your accelerate and torch versions are compatible; Phi-4-mini-instruct's modeling code uses some of the newer attention and quantization paths that can expose version mismatches in the full dependency chain. Running pip install transformers accelerate torch --upgrade together is usually the safest path.
One thing worth noting if you're deploying Phi-4-mini-instruct inside an agentic pipeline: version mismatches like this become harder to debug when the model is being loaded dynamically by an orchestrator rather than directly. We've run into similar issues in AgentGraph when agents are provisioned with different runtime environments and there's no consistent verification of the dependency state at load time. Having some form of environment fingerprinting tied to the agent's identity record helps catch these before they surface as cryptic import errors mid-execution. But for the immediate issue β upgrade transformers and you should be unblocked.
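The minimum-version check described above can be sketched with only the standard library. The helper names (parse_version, meets_minimum) are illustrative, and the parser deliberately ignores pre-release suffixes like .dev0:

```python
# Check the installed version of a package against a required minimum
# using only the standard library. Pre-release suffixes such as
# ".dev0" are ignored by the simple parser below.
from importlib.metadata import PackageNotFoundError, version

def parse_version(v: str) -> tuple:
    """Turn '4.56.0.dev0' into (4, 56, 0); non-numeric pieces become 0."""
    parts = []
    for piece in v.split(".")[:3]:
        digits = "".join(ch for ch in piece if ch.isdigit())
        parts.append(int(digits) if digits else 0)
    return tuple(parts)

def meets_minimum(package: str, minimum: tuple) -> bool:
    """True if the installed package is at or above the given version tuple."""
    try:
        return parse_version(version(package)) >= minimum
    except PackageNotFoundError:
        return False  # package not installed

if __name__ == "__main__":
    print("transformers >= 4.46.0:", meets_minimum("transformers", (4, 46, 0)))
```

Running a check like this at startup in a pinned Docker image turns a cryptic mid-load ImportError into an explicit, actionable failure.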
This is a version compatibility issue between your installed transformers library and what microsoft/Phi-4-mini-instruct expects. LossKwargs was introduced in transformers 4.46.0 as part of a refactor to how loss computation kwargs are handled in modeling code. If you're on an older version, that symbol simply doesn't exist in transformers.utils.
The fix is straightforward: pip install --upgrade transformers should get you to a version where LossKwargs is present. If you're in a constrained environment and can't upgrade, the workaround is to pin to a compatible version; check the model card or the config.json for any explicit transformers_version field, which sometimes hints at the minimum required version. For Phi-4-mini-instruct specifically, you likely need at least 4.46.x, possibly 4.47+ depending on when your local clone of the repo was pulled.
One thing worth noting if you're running this model inside an agent pipeline: dependency drift like this is actually a subtle trust surface. In multi-agent systems where different agents load different model backends, a version mismatch can cause silent failures or inconsistent behavior that's hard to attribute. At AgentGraph we've seen this come up when orchestrating heterogeneous model endpoints; the agent reports a capability it technically can't exercise because the underlying library is mismatched. Keeping a verified dependency manifest per agent identity is worth the overhead if you're building anything production-grade on top of Phi-4-mini.