Instructions for using Xennon-BD/Doctor-Chad with libraries and local apps.

Libraries

How to use Xennon-BD/Doctor-Chad with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Xennon-BD/Doctor-Chad")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Print the result so the output is visible when this runs as a script
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Xennon-BD/Doctor-Chad")
model = AutoModelForCausalLM.from_pretrained("Xennon-BD/Doctor-Chad")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use Xennon-BD/Doctor-Chad with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Xennon-BD/Doctor-Chad"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Xennon-BD/Doctor-Chad",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
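
The curl call above can also be issued from Python. The following is a minimal sketch using only the standard library. It assumes the vLLM server started above is listening on localhost:8000. The helper names `build_chat_request` and `send_chat_request` are illustrative and are not part of vLLM.

```python
import json
import urllib.request


def build_chat_request(user_message, model="Xennon-BD/Doctor-Chad",
                       base_url="http://localhost:8000"):
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the decoded JSON body (needs a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, once the server is running:
# print(send_chat_request(build_chat_request("What is the capital of France?")))
```

Building the request separately from sending it makes it possible to inspect the payload without a running server. Passing `base_url="http://localhost:30000"` should target a server on another port in the same way.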

Use Docker

docker model run hf.co/Xennon-BD/Doctor-Chad

SGLang

How to use Xennon-BD/Doctor-Chad with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Xennon-BD/Doctor-Chad" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Xennon-BD/Doctor-Chad",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Xennon-BD/Doctor-Chad" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Xennon-BD/Doctor-Chad",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
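
The server answers with a JSON body in the OpenAI chat completions format, where the reply text sits under `choices[0].message.content`. A minimal sketch of extracting it is shown here. The `sample_response` dictionary is a hypothetical, trimmed body with placeholder text (not real model output), and `extract_reply` is an illustrative helper name.

```python
def extract_reply(response_body):
    """Return the assistant text from the first choice of a chat completion."""
    choices = response_body.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


# Hypothetical response body, trimmed to the fields used here.
# The content string is a placeholder, not actual model output.
sample_response = {
    "object": "chat.completion",
    "model": "Xennon-BD/Doctor-Chad",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "example reply text"},
            "finish_reason": "stop",
        }
    ],
}

print(extract_reply(sample_response))
```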

Docker Model Runner
How to use Xennon-BD/Doctor-Chad with Docker Model Runner:
```
docker model run hf.co/Xennon-BD/Doctor-Chad
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

To generate text with AutoTokenizer and AutoModelForCausalLM from the Hugging Face Transformers library, first install the required packages:

pip install transformers torch

Then, use the following Python code to load the model and generate text:

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("Xennon-BD/Doctor-Chad")
model = AutoModelForCausalLM.from_pretrained("Xennon-BD/Doctor-Chad")

# Define the input prompt
input_text = "Hello, how are you doing today?"

# Encode the input text
input_ids = tokenizer.encode(input_text, return_tensors="pt")

# Generate text
output_ids = model.generate(input_ids, max_length=50, num_return_sequences=1, do_sample=True)

# Decode the generated text
generated_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)

print(generated_text)

Explanation:

Load the Tokenizer and Model:

tokenizer = AutoTokenizer.from_pretrained("Xennon-BD/Doctor-Chad")
model = AutoModelForCausalLM.from_pretrained("Xennon-BD/Doctor-Chad")

This code loads the tokenizer and model from the specified Hugging Face model repository.

Define the Input Prompt:
```
input_text = "Hello, how are you doing today?"
```
This is the text prompt that you want the model to complete or generate text from.
Encode the Input Text:
```
input_ids = tokenizer.encode(input_text, return_tensors="pt")
```
The tokenizer.encode method converts the input text into token IDs that the model can process. The return_tensors="pt" argument specifies that the output should be in the form of PyTorch tensors.
Generate Text:
```
output_ids = model.generate(input_ids, max_length=50, num_return_sequences=1, do_sample=True)
```
The model.generate method generates new tokens that continue the input token IDs.
- max_length=50 caps the total sequence length at 50 tokens, counting the prompt tokens as well as the generated ones.
- num_return_sequences=1 returns a single generated sequence.
- do_sample=True samples each next token from the model's predicted distribution instead of always picking the most likely one, so repeated runs can produce different text.
Decode the Generated Text:
```
generated_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
```
The tokenizer.decode method converts the generated token IDs back into human-readable text. The skip_special_tokens=True argument ensures that special tokens (like <|endoftext|>) are not included in the output.
Print the Generated Text:
```
print(generated_text)
```
This prints the generated text to the console.
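
Because max_length counts the prompt as well as the output, a longer prompt leaves less room for generated text. The arithmetic can be sketched as follows. `new_token_budget` is an illustrative helper, not a Transformers function, and the prompt lengths are made-up values.

```python
def new_token_budget(prompt_token_count, max_length):
    """Tokens left for generation when max_length caps prompt plus output."""
    return max(0, max_length - prompt_token_count)


# A prompt of 8 tokens with max_length=50 leaves room for 42 new tokens.
print(new_token_budget(8, 50))   # 42
# A prompt that already exceeds max_length leaves no room at all.
print(new_token_budget(60, 50))  # 0
```

To bound only the generated part regardless of prompt length, model.generate also accepts max_new_tokens, as used in the chat-template example earlier on this page.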

You can modify the input prompt and the parameters of the model.generate method to suit your needs, such as adjusting max_length for longer or shorter text generation, or changing num_return_sequences to generate multiple variations.
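
As one possible way to request several variations, the sketch below gathers the generation settings in a small helper and passes them to model.generate. The function names `build_generate_kwargs` and `generate_variations` are illustrative. The sketch uses max_new_tokens in place of max_length so that the prompt length does not eat into the output budget, and it assumes the model can be downloaded from the Hub.

```python
def build_generate_kwargs(num_variations=1, max_new_tokens=50):
    """Collect sampling settings to pass to model.generate."""
    return {
        "max_new_tokens": max_new_tokens,        # cap on generated tokens only
        "num_return_sequences": num_variations,  # how many samples to return
        "do_sample": True,                       # sampling, so the samples can differ
    }


def generate_variations(prompt, num_variations=3, max_new_tokens=50,
                        model_id="Xennon-BD/Doctor-Chad"):
    """Return a list of sampled continuations of the prompt."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # Calling the tokenizer returns input_ids together with an attention mask.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs, **build_generate_kwargs(num_variations, max_new_tokens)
    )
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)


# Usage (downloads the model on first run):
# for text in generate_variations("Hello, how are you doing today?"):
#     print(text)
```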

Downloads last month: 15

Safetensors
Model size: 0.5B params
Tensor type: F16

Model tree for Xennon-BD/Doctor-Chad
Quantizations: 1 model
