| from dataclasses import dataclass | |
| from typing import List, Dict, Any, Optional | |
| import json | |
| import requests | |
| from bs4 import BeautifulSoup | |
| from openai import OpenAI | |
| """ | |
| EXAMPLE OUTPUT: | |
| What is the current population for the city where Einstein was born? | |
| Step 1 | |
| ---------------------------------------- | |
| Executing: fetch_wiki_content | |
| Arguments: {'title': 'Albert Einstein'} | |
| Step 2 | |
| ---------------------------------------- | |
| Executing: deliver_answer | |
| Arguments: {'fields': ['Ulm, German Empire']} | |
| ANSWER FROM THE ASSISTANT: ['Ulm, German Empire'] | |
| Step 3 | |
| ---------------------------------------- | |
| Executing: fetch_wiki_content | |
| Arguments: {'title': 'Ulm'} | |
| Step 4 | |
| ---------------------------------------- | |
| Executing: deliver_answer | |
| Arguments: {'fields': ['128,928']} | |
| ANSWER FROM THE ASSISTANT: ['128,928'] | |
| Step 5 | |
| ---------------------------------------- | |
| Extraction Complete | |
| Why was Einstein famous? | |
| Step 1 | |
| ---------------------------------------- | |
| Executing: fetch_wiki_content | |
| Arguments: {'title': 'Albert Einstein'} | |
| Step 2 | |
| ---------------------------------------- | |
| Executing: deliver_answer | |
| Arguments: {'fields': ['Best known for developing the theory of relativity, Einstein also made important contributions to quantum mechanics.', 'His mass–energy equivalence formula E = mc2, which arises from special relativity, has been called "the world\'s most famous equation."', 'He received the 1921 Nobel Prize in Physics.']} | |
| ANSWER FROM THE ASSISTANT: ['Best known for developing the theory of relativity, Einstein also made important contributions to quantum mechanics.', 'His mass–energy equivalence formula E = mc2, which arises from special relativity, has been called "the world\'s most famous equation."', 'He received the 1921 Nobel Prize in Physics.'] | |
| Step 3 | |
| ---------------------------------------- | |
| Extraction Complete | |
| """ | |
| @dataclass | |
| class WikiConfig: | |
| """Configuration for OpenAI and Wikipedia settings""" | |
| api_key: str = "sk-123" | |
| api_base: str = "{info}/v1" | |
| model: Optional[str] = None | |
| max_steps: int = 5 | |
| wikipedia_base_url: str = "https://en.wikipedia.org/wiki/" | |
| class WikiTools: | |
| """Collection of Wikipedia and extraction tools""" | |
| def __init__(self, base_url: str): | |
| self.base_url = base_url | |
| def fetch_wiki_content(self, title: str, section: Optional[str] = None) -> str: | |
| """Fetch and clean Wikipedia article content, optionally from a specific section""" | |
| url = f"{self.base_url}{title.replace(' ', '_')}" | |
| response = requests.get(url) | |
| soup = BeautifulSoup(response.content, 'html.parser') | |
| # Remove unwanted sections | |
| for unwanted in soup.find_all(['script', 'style', 'footer', 'header']): | |
| unwanted.decompose() | |
| if section: | |
| # Find specific section if requested | |
| section_tag = soup.find('span', {'id': section}) | |
| if section_tag: | |
| content = section_tag.parent.find_next_siblings() | |
| text = ' '.join(tag.get_text() for tag in content) | |
| else: | |
| return "Section not found" | |
| else: | |
| # Get main content | |
| content = soup.find(id='mw-content-text') | |
| if content: | |
| text = content.get_text() | |
| else: | |
| return "Content not found" | |
| # Clean and normalize text | |
| text = ' '.join(text.split()) | |
| return text[:8000] # Truncate to avoid token limits | |
| @staticmethod | |
| def deliver_answer(fields: List[str]) -> Dict[str, Any]: | |
| """Deliver the extracted text spans to the user""" | |
| print(f"ANSWER FROM THE ASSISTANT: {fields}") | |
| return { | |
| "extracted_fields": "Provided fields were delivered to the user successfully." | |
| } | |
| class ToolRegistry: | |
| """Registry of available tools and their schemas""" | |
| def __init__(self, wiki_tools: WikiTools): | |
| self.wiki_tools = wiki_tools | |
| @property | |
| def available_functions(self) -> Dict[str, callable]: | |
| return { | |
| "fetch_wiki_content": self.wiki_tools.fetch_wiki_content, | |
| "deliver_answer": self.wiki_tools.deliver_answer | |
| } | |
| @property | |
| def tool_schemas(self) -> List[Dict[str, Any]]: | |
| return [ | |
| { | |
| "type": "function", | |
| "function": { | |
| "name": "fetch_wiki_content", | |
| "description": "Fetch content from a Wikipedia article", | |
| "parameters": { | |
| "type": "object", | |
| "properties": { | |
| "title": { | |
| "type": "string", | |
| "description": "The title of the Wikipedia article" | |
| }, | |
| "section": { | |
| "type": "string", | |
| "description": "Optional: Specific section ID to fetch", | |
| "optional": True | |
| } | |
| }, | |
| "required": ["title"] | |
| } | |
| } | |
| }, | |
| { | |
| "type": "function", | |
| "function": { | |
| "name": "deliver_answer", | |
| "description": "Extract specific information from the fetched text", | |
| "parameters": { | |
| "type": "object", | |
| "properties": { | |
| "fields": { | |
| "type": "array", | |
| "items": {"type": "string"}, | |
| "description": "List of text spans from the article that are relevant to the query" | |
| } | |
| }, | |
| "required": ["fields"] | |
| } | |
| } | |
| } | |
| ] | |
| class WikiExtractionAgent: | |
| """Main agent class that handles the extraction process""" | |
| def __init__(self, config: WikiConfig): | |
| self.config = config | |
| self.client = OpenAI(api_key=config.api_key, base_url=config.api_base) | |
| self.wiki_tools = WikiTools(config.wikipedia_base_url) | |
| self.tools = ToolRegistry(self.wiki_tools) | |
| self.messages = [{"role": "system", "content": "1. First fetch any wikipedia pages you might need to answer the user query. Do not answer from parametric knowledge.\n\n2. Then, provide the answer to the user using the deliver_answer from the retrieved wikipedia page.\n\n3. You may need to issue multiple calls to wikipedia after extracting answers if there are nested dependencies for information."}] | |
| if not config.model: | |
| models = self.client.models.list() | |
| self.config.model = models.data[0].id | |
| def _serialize_tool_call(self, tool_call) -> Dict[str, Any]: | |
| """Convert tool call to serializable format""" | |
| return { | |
| "id": tool_call.id, | |
| "type": tool_call.type, | |
| "function": { | |
| "name": tool_call.function.name, | |
| "arguments": tool_call.function.arguments | |
| } | |
| } | |
| def process_tool_calls(self, message) -> List[Dict[str, Any]]: | |
| """Process and execute tool calls from assistant""" | |
| results = [] | |
| for tool_call in message.tool_calls: | |
| function_name = tool_call.function.name | |
| function_args = json.loads(tool_call.function.arguments) | |
| print(f"\nExecuting: {function_name}") | |
| print(f"Arguments: {function_args}") | |
| function_response = self.tools.available_functions[function_name](**function_args) | |
| results.append({ | |
| "tool": function_name, | |
| "args": function_args, | |
| "response": function_response | |
| }) | |
| self.messages.append({ | |
| "role": "tool", | |
| "content": json.dumps(function_response), | |
| "tool_call_id": tool_call.id, | |
| "name": function_name | |
| }) | |
| return results | |
| def extract_information(self, query: str) -> List[Dict[str, Any]]: | |
| """Main method to handle the extraction process""" | |
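| # Start a fresh conversation for this query, keeping only a well-formed system prompt if one is set | |
| self.messages = [m for m in self.messages if m.get("role") == "system"] + [{ | |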
| "role": "user", | |
| "content": f"""Extract information from Wikipedia to answer this query: {query} | |
| You can use these tools: | |
| 1. fetch_wiki_content: Get article content | |
| 2. deliver_answer: deliver relevant information | |
| Please fetch content first, and iterate as needed to get to the webpage with the correct answer and then deliver the relevant information.""" | |
| }] | |
| all_results = [] | |
| for step in range(self.config.max_steps): | |
| print(f"\nStep {step + 1}") | |
| print("-" * 40) | |
| response = self.client.chat.completions.create( | |
| messages=self.messages, | |
| model=self.config.model, | |
| tools=self.tools.tool_schemas, | |
| temperature=0.0, | |
| ) | |
| message = response.choices[0].message | |
| if not message.tool_calls: | |
| print("Extraction Complete") | |
| break | |
| self.messages.append({ | |
| "role": "assistant", | |
| "content": json.dumps(message.content), | |
| "tool_calls": [self._serialize_tool_call(tc) for tc in message.tool_calls] | |
| }) | |
| results = self.process_tool_calls(message) | |
| all_results.extend(results) | |
| return all_results | |
| def main(): | |
| # Example usage | |
| config = WikiConfig() | |
| agent = WikiExtractionAgent(config) | |
| # Multi-step query example | |
| # The model should first fetch the Wikipedia article on Einstein and extract where he was born, | |
| # then use that value (which should contain the city name) to fetch the article for that city | |
| # and pull the population from it. | |
| # See the EXAMPLE OUTPUT docstring at the top of this file for the full trace of the calls Athene-V2-Agent issues for this query. | |
| results = agent.extract_information( | |
| query="""What is the current population for the city where Einstein was born?""" | |
| ) | |
| # Single query example | |
| # Here, the model should just fetch Einstein's Wikipedia page and extract the parts regarding his | |
| # accomplishments. | |
| results = agent.extract_information( | |
| query="Why was Einstein famous?" | |
| ) | |
| if __name__ == "__main__": | |
| main() | |
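The tool-dispatch step in `process_tool_calls` can be tried offline, without a model server or network access, with a minimal sketch like the one below. It assumes the same message shape the file uses for tool results (`role`, `content`, `tool_call_id`, `name`). `StubToolCall`, `stub_fetch_wiki_content`, and `dispatch` are hypothetical names that do not appear in the original file, and the unknown-tool branch is an addition for illustration, not something the original code does.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class StubToolCall:
    """Hypothetical stand-in for the tool-call object the OpenAI client returns."""
    id: str
    name: str
    arguments: str  # JSON-encoded arguments, as a string


def stub_fetch_wiki_content(title: str) -> str:
    # Hypothetical offline replacement for WikiTools.fetch_wiki_content.
    return f"(stub article text for {title})"


def dispatch(tool_calls: List[StubToolCall],
             functions: Dict[str, Callable[..., Any]]) -> List[Dict[str, Any]]:
    """Run each requested tool and build the matching 'tool' role messages."""
    messages = []
    for call in tool_calls:
        # Arguments arrive as a JSON string and must be decoded before the call.
        args = json.loads(call.arguments)
        if call.name not in functions:
            # Assumption: report the problem back to the model instead of raising.
            result = f"Unknown tool: {call.name}"
        else:
            result = functions[call.name](**args)
        messages.append({
            "role": "tool",
            "content": json.dumps(result),
            "tool_call_id": call.id,
            "name": call.name,
        })
    return messages


if __name__ == "__main__":
    calls = [StubToolCall(id="call_0", name="fetch_wiki_content",
                          arguments='{"title": "Ulm"}')]
    for msg in dispatch(calls, {"fetch_wiki_content": stub_fetch_wiki_content}):
        print(msg)
```

Each returned dictionary can be appended to the conversation after the assistant message that requested the call, which is how the agent above feeds tool results back to the model on the next step.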