Tingdan committed on
Commit 5026053 · 1 Parent(s): 821317c

FP8 Quantized support (#21)


- FP8 Quantized support (e1e295a9b18384283964f7b465315d597d97bc53)

Files changed (1)
1. README.md +2 -0
README.md CHANGED
@@ -22,6 +22,7 @@ pipeline_tag: image-text-to-text
  ## 📢 News & Updates
 
  - 🚀 **Online Demo**: Explore Step3-VL-10B on [Hugging Face Spaces](https://huggingface.co/spaces/stepfun-ai/Step3-VL-10B)!
+ - 📢 **[Notice] FP8 Quantization Support:** FP8 quantized weights are now available. ([Download link](https://huggingface.co/stepfun-ai/Step3-VL-10B-FP8))
  - 📢 **[Notice] vLLM Support:** vLLM integration is now officially supported! (PR [#32329](https://github.com/vllm-project/vllm/pull/32329))
  - ✅ **[Fixed] HF Inference:** Resolved the `eos_token_id` misconfiguration in `config.json` that caused infinite generation loops. (PR [#abdf3](https://huggingface.co/stepfun-ai/Step3-VL-10B/commit/abdf3618e914a9e3de0ad74efacc8b7a10f06c10))
  - ✅ **[Fixing] Metric Correction:** We sincerely apologize for inaccuracies in the Qwen3VL-8B benchmarks (e.g., AIME, HMMT, LCB). The errors were caused by an incorrect max_tokens setting (mistakenly set to 32k) during our large-scale evaluation process. We are re-running the tests and will provide corrected numbers in the next version of the technical report.
@@ -46,6 +47,7 @@ The success of STEP3-VL-10B is driven by two key strategic designs:
  | :-------------------- | :--- | :----------------------------------------------------------------: | :----------------------------------------------------------------------: |
  | **STEP3-VL-10B-Base** | Base | [🤗 Download](https://huggingface.co/stepfun-ai/Step3-VL-10B-Base) | [🤖 Download](https://modelscope.cn/models/stepfun-ai/Step3-VL-10B-Base) |
  | **STEP3-VL-10B** | Chat | [🤗 Download](https://huggingface.co/stepfun-ai/Step3-VL-10B) | [🤖 Download](https://modelscope.cn/models/stepfun-ai/Step3-VL-10B) |
+ | **STEP3-VL-10B-FP8** | Quantized | [🤗 Download](https://huggingface.co/stepfun-ai/Step3-VL-10B-FP8) | [🤖 Download](https://modelscope.cn/models/stepfun-ai/Step3-VL-10B-FP8) |
 
  ## 📊 Performance
 
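
For context on using the newly added checkpoint, here is a minimal, hypothetical sketch of loading `stepfun-ai/Step3-VL-10B-FP8` through vLLM's offline Python API, which the README's vLLM support notice suggests should work. The model ID comes from the download link added in this commit; the sampling settings and `trust_remote_code` flag are illustrative assumptions, not part of the commit.

```python
# Hypothetical usage sketch (not part of this commit): load the FP8-quantized
# checkpoint with vLLM's offline LLM API. Assumes the weights are compatible
# with the vLLM integration referenced in the README (PR #32329).
from vllm import LLM, SamplingParams

llm = LLM(
    model="stepfun-ai/Step3-VL-10B-FP8",  # FP8 repo linked in this commit
    trust_remote_code=True,               # assumption: the model may ship custom Hub code
)

params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(
    ["Briefly explain what FP8 quantization changes for inference."],
    params,
)
print(outputs[0].outputs[0].text)
```

A text-only prompt is used to keep the sketch minimal; image inputs would go through vLLM's multimodal prompt format, which this commit does not touch.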