Kubernetes Gateway API Inference Extension
kubernetes-sigs/gateway-api-inference-extension
404 - Not found