Update README.md
README.md
CHANGED
|
@@ -14,6 +14,7 @@ tags:
|
|
| 14 |
- chat
|
| 15 |
- conversational
|
| 16 |
- reasoning
|
| 17 |
inference:
|
| 18 |
parameters:
|
| 19 |
temperature: 0
|
|
@@ -24,6 +25,18 @@ widget:
|
|
| 24 |
library_name: transformers
|
| 25 |
---
|
| 26 |
|
| 27 |
# Phi-4-reasoning-plus Model Card
|
| 28 |
|
| 29 |
[Phi-4-reasoning Technical Report](https://huggingface.co/papers/2504.21318)
|
|
|
|
| 14 |
- chat
|
| 15 |
- conversational
|
| 16 |
- reasoning
|
| 17 |
+
- vllm
|
| 18 |
inference:
|
| 19 |
parameters:
|
| 20 |
temperature: 0
|
|
|
|
| 25 |
library_name: transformers
|
| 26 |
---
|
| 27 |
|
| 28 |
+
# Sharded weight checkpoints
|
| 29 |
+
|
| 30 |
+
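These checkpoints were saved with vLLM's [`save_sharded_state.py`](https://github.com/vllm-project/vllm/blob/main/examples/offline_inference/save_sharded_state.py) example script, so each tensor-parallel worker reads only its own shard at load time. Serve them with vLLM using tensor parallelism of 2 (`-tp=2`) and the `sharded_state` load format: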
|
| 31 |
+
|
| 32 |
+
```bash
|
| 33 |
+
vllm serve aarnphm/phi-4-reasoning-plus-sharded-tp2 \
|
| 34 |
+
-tp=2 \
|
| 35 |
+
--load-format sharded_state
|
| 36 |
+
```
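
The same settings can probably be used for offline inference through vLLM's Python `LLM` class. The following is a minimal sketch, not part of the original README: it assumes the `vllm serve` flags above map onto the `load_format` and `tensor_parallel_size` keyword arguments, that two GPUs are available, and that the repository id can be passed directly as `model`. The helper names `sharded_engine_kwargs` and `load_engine` are hypothetical.

```python
def sharded_engine_kwargs(model: str, tensor_parallel_size: int = 2) -> dict:
    """Build keyword arguments that mirror the `vllm serve` flags above."""
    return {
        "model": model,
        # Assumed Python-side equivalent of `--load-format sharded_state`.
        "load_format": "sharded_state",
        # Assumed Python-side equivalent of `-tp=2`.
        "tensor_parallel_size": tensor_parallel_size,
    }


def load_engine(model: str = "aarnphm/phi-4-reasoning-plus-sharded-tp2"):
    """Create a vLLM engine from the sharded checkpoint (needs vLLM and GPUs)."""
    # Imported lazily so the helper above can be used without vLLM installed.
    from vllm import LLM

    return LLM(**sharded_engine_kwargs(model))


if __name__ == "__main__":
    llm = load_engine()
    # `generate` returns one result per prompt; print the first completion.
    result = llm.generate("What is 2 + 2?")[0]
    print(result.outputs[0].text)
```

The tensor-parallel size is kept at 2 here because the checkpoint was sharded for `-tp=2`; loading it with a different size is not expected to work.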
|
| 37 |
+
|
| 38 |
+
---
|
| 39 |
+
|
| 40 |
# Phi-4-reasoning-plus Model Card
|
| 41 |
|
| 42 |
[Phi-4-reasoning Technical Report](https://huggingface.co/papers/2504.21318)
|