Feature Support#
Feature |
Supported |
Note |
|---|---|---|
Chunked Prefill |
✗ |
Plan in 2025 Q1 |
Automatic Prefix Caching |
✅ |
Improve performance in 2025 Q1 |
LoRA |
✗ |
Plan in 2025 Q1 |
Prompt adapter |
✅ |
|
Speculative decoding |
✅ |
Improve accuracy in 2025 Q1 |
Pooling |
✗ |
Plan in 2025 Q1 |
Enc-dec |
✗ |
Plan in 2025 Q1 |
Multi Modality |
✅ (LLaVA/Qwen2-vl/Qwen2-audio/internVL) |
Add more model support in 2025 Q1 |
LogProbs |
✅ |
|
Prompt logProbs |
✅ |
|
Async output |
✅ |
|
Multi step scheduler |
✅ |
|
Best of |
✅ |
|
Beam search |
✅ |
|
Guided Decoding |
✗ |
Plan in 2025 Q1 |