JEE NUJAN MIX V2 - Base Merged Model
Model Description
This is the base merged model for JEE mathematics problem solving, created by combining three specialized models using linear interpolation. This model serves as the foundation for further fine-tuning on mathematical datasets.
Model Architecture
Merged Models:
- deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B (40% weight) - Advanced reasoning capabilities
- Qwen/Qwen2.5-Math-1.5B (35% weight) - Mathematical problem solving
- microsoft/phi-2 (25% weight) - General reasoning and language understanding
Merge Method: Linear interpolation with weight normalization
Output Format: Float16 for efficiency
Tokenizer: Based on DeepSeek-R1-Distill-Qwen-1.5B
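To make the merge method concrete, here is a minimal sketch of linear interpolation with weight normalization. It is not the script used to build this model. It assumes every source model exposes parameters under the same names and with the same shapes, and the names `normalize` and `linear_merge` are hypothetical. The values only need to support `*` and `+`, so the same function works on floats, NumPy arrays, or PyTorch tensors.

```python
from typing import Dict, List, Sequence, TypeVar

# Any value supporting `*` by a float and `+` (float, NumPy array, torch tensor)
T = TypeVar("T")


def normalize(weights: Sequence[float]) -> List[float]:
    """Rescale merge weights so they sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("merge weights must have a positive sum")
    return [w / total for w in weights]


def linear_merge(state_dicts: Sequence[Dict[str, T]],
                 weights: Sequence[float]) -> Dict[str, T]:
    """Weighted average of the parameters shared by all models (hypothetical helper)."""
    if len(state_dicts) != len(weights):
        raise ValueError("need exactly one weight per model")
    norm = normalize(weights)
    # Only parameters present in every model can be averaged
    shared = set(state_dicts[0]).intersection(*state_dicts[1:])
    merged = {}
    for name in sorted(shared):
        acc = state_dicts[0][name] * norm[0]
        for sd, w in zip(state_dicts[1:], norm[1:]):
            acc = acc + sd[name] * w
        merged[name] = acc
    return merged


# Toy example: scalar "parameters" and the 40/35/25 split listed above
toy = [{"w": 1.0, "b": 0.0}, {"w": 2.0, "b": 1.0}, {"w": 4.0, "b": 2.0}]
merged = linear_merge(toy, [0.40, 0.35, 0.25])
print(merged)
```

With real checkpoints, the dictionaries would be the models' `state_dict()` outputs, and the merged tensors could be cast to float16 before saving.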
Usage
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
# Load the merged base model
tokenizer = AutoTokenizer.from_pretrained("shivs28/jee_nujan_mix_v2_base")
model = AutoModelForCausalLM.from_pretrained(
    "shivs28/jee_nujan_mix_v2_base",
    torch_dtype=torch.float16,
    device_map="auto"
)
# Use for mathematical reasoning
prompt = "Solve: What is the derivative of x^2 + 3x + 1?"
# Move the inputs to the device the model was placed on
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# do_sample=True is required for temperature to take effect
outputs = model.generate(**inputs, max_length=200, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
Intended Use
This base model is intended to be:
- Fine-tuned on specific mathematical datasets for enhanced performance
- Used as a starting point for educational AI applications
- Evaluated for mathematical reasoning capabilities
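For the evaluation use case, one simple option is exact match on the final numeric answer. The sketch below is an assumption about how such scoring could be done, not a benchmark protocol used for this model. The helper names `last_number` and `exact_match_accuracy` are hypothetical, and the sample predictions are made up for illustration.

```python
import re
from typing import Iterable, Optional

# Integers, decimals, and simple fractions such as 3/4
_NUMBER = re.compile(r"-?\d+(?:\.\d+)?(?:/\d+)?")


def last_number(text: str) -> Optional[str]:
    """Return the last number in the text, or None if there is none."""
    # Drop thousands separators so "1,250" is read as "1250"
    matches = _NUMBER.findall(text.replace(",", ""))
    return matches[-1] if matches else None


def exact_match_accuracy(predictions: Iterable[str],
                         references: Iterable[str]) -> float:
    """Fraction of predictions whose last number equals the reference string."""
    pairs = list(zip(predictions, references))
    if not pairs:
        return 0.0
    hits = sum(1 for pred, ref in pairs if last_number(pred) == ref.strip())
    return hits / len(pairs)


# Made-up model outputs and reference answers
preds = [
    "The derivative at x = 1 is 2(1) + 3 = 5",
    "So the total is 1,250 rupees.",
    "I am not sure.",
]
refs = ["5", "1250", "7"]
print(exact_match_accuracy(preds, refs))
```

String comparison is deliberately strict here: "0.5" and "1/2" would not match, so a real evaluation would likely normalize answers to a common numeric form first.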
Next Steps
This base model will be fine-tuned on comprehensive mathematical datasets including:
- Competition Mathematics (MATH dataset)
- GSM8K word problems
- MathQA reasoning problems
- AQuA-RAT algebraic problems
- Custom JEE Advanced problems
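As an illustration of preparing such data, the sketch below turns a GSM8K-style record into a single training string. GSM8K records carry `question` and `answer` fields, with the final answer following a `####` marker. The prompt template, the helper names `split_gsm8k_answer` and `to_training_text`, and the sample record are all hypothetical and are not the format the team has committed to.

```python
from typing import Dict, Tuple


def split_gsm8k_answer(answer: str) -> Tuple[str, str]:
    """Split a GSM8K answer into (worked solution, final answer)."""
    solution, sep, final = answer.rpartition("####")
    if not sep:
        # No marker found: treat the whole text as the solution
        return answer.strip(), ""
    return solution.strip(), final.strip()


def to_training_text(record: Dict[str, str]) -> str:
    """Render one record with a hypothetical Problem/Solution template."""
    solution, final = split_gsm8k_answer(record["answer"])
    return (
        f"Problem: {record['question'].strip()}\n"
        f"Solution: {solution}\n"
        f"Final answer: {final}"
    )


# Made-up record in the GSM8K layout
record = {
    "question": "Ravi has 3 pens and buys 4 more. How many pens does he have?",
    "answer": "3 + 4 = 7\n#### 7",
}
print(to_training_text(record))
```

The other datasets in the list use different field layouts, so each would need its own small adapter that maps into the same template.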
Model Card Authors
Created by the JEE NUJAN MIX team for educational purposes.
Citation
Please cite the original base models:
- DeepSeek-R1-Distill-Qwen-1.5B
- Qwen2.5-Math-1.5B
- Phi-2
tags:
- open-llm-leaderboard
- math
- gsm8k
- causal-lm
- fine-tuned
This model is part of the NUJAN educational initiative.