Feature Support

Feature Support#

Feature

Supported

Note

Chunked Prefill

Plan in 2025 Q1

Automatic Prefix Caching

Improve performance in 2025 Q1

LoRA

Plan in 2025 Q1

Prompt adapter

Speculative decoding

Improve accuracy in 2025 Q1

Pooling

Plan in 2025 Q1

Enc-dec

Plan in 2025 Q1

Multi Modality

✅ (LLaVA/Qwen2-vl/Qwen2-audio/internVL)

Add more model support in 2025 Q1

LogProbs

Prompt logProbs

Async output

Multi step scheduler

Best of

Beam search

Guided Decoding

Plan in 2025 Q1