FP8 Quantized support (#21)
(commit e1e295a9b18384283964f7b465315d597d97bc53)
README.md
CHANGED
@@ -22,6 +22,7 @@ pipeline_tag: image-text-to-text

 ## 📢 News & Updates

 - 🚀 **Online Demo**: Explore Step3-VL-10B on [Hugging Face Spaces](https://huggingface.co/spaces/stepfun-ai/Step3-VL-10B)!
+- 📢 **[Notice] FP8 Quantization Support:** FP8 quantized weights are now available. ([Download link](https://huggingface.co/stepfun-ai/Step3-VL-10B-FP8))
 - 📢 **[Notice] vLLM Support:** vLLM integration is now officially supported! (PR [#32329](https://github.com/vllm-project/vllm/pull/32329))
 - ✅ **[Fixed] HF Inference:** Resolved the `eos_token_id` misconfiguration in `config.json` that caused infinite generation loops. (PR [#abdf3](https://huggingface.co/stepfun-ai/Step3-VL-10B/commit/abdf3618e914a9e3de0ad74efacc8b7a10f06c10))
 - ✅ **[Fixing] Metric Correction:** We sincerely apologize for inaccuracies in the Qwen3VL-8B benchmarks (e.g., AIME, HMMT, LCB). The errors were caused by an incorrect `max_tokens` setting (mistakenly set to 32k) during our large-scale evaluation process. We are re-running the tests and will provide corrected numbers in the next version of the technical report.

@@ -46,6 +47,7 @@ The success of STEP3-VL-10B is driven by two key strategic designs:

 | :-------------------- | :-------- | :----------------------------------------------------------------: | :----------------------------------------------------------------------: |
 | **STEP3-VL-10B-Base** | Base      | [🤗 Download](https://huggingface.co/stepfun-ai/Step3-VL-10B-Base) | [🤗 Download](https://modelscope.cn/models/stepfun-ai/Step3-VL-10B-Base) |
 | **STEP3-VL-10B**      | Chat      | [🤗 Download](https://huggingface.co/stepfun-ai/Step3-VL-10B)      | [🤗 Download](https://modelscope.cn/models/stepfun-ai/Step3-VL-10B)      |
+| **STEP3-VL-10B-FP8**  | Quantized | [🤗 Download](https://huggingface.co/stepfun-ai/Step3-VL-10B-FP8)  | [🤗 Download](https://modelscope.cn/models/stepfun-ai/Step3-VL-10B-FP8)  |

 ## 📊 Performance