# ai-proxy
## Description
The ai-proxy plugin simplifies access to LLM providers and models by defining a standard request format
that allows key fields in plugin configuration to be embedded into the request.
Proxying requests to OpenAI and OpenAI-compatible APIs is currently supported. Other LLM services will be supported soon.
## Request Format
### OpenAI
- Chat API
| Name | Type | Required | Description |
|---|---|---|---|
| messages | Array | Yes | An array of message objects. |
| messages.role | String | Yes | Role of the message (system, user, or assistant). |
| messages.content | String | Yes | Content of the message. |
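For example, a request body that conforms to this format (the same body used in the examples below) looks like:

```json
{
  "messages": [
    { "role": "system", "content": "You are a mathematician" },
    { "role": "user", "content": "What is 1+1?" }
  ]
}
```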
## Plugin Attributes
| Field | Required | Type | Description |
|---|---|---|---|
| auth | Yes | Object | Authentication configuration |
| auth.header | No | Object | Authentication headers. Key must match pattern ^[a-zA-Z0-9._-]+$. |
| auth.query | No | Object | Authentication query parameters. Key must match pattern ^[a-zA-Z0-9._-]+$. |
| model.provider | Yes | String | Name of the AI service provider (openai or openai-compatible). |
| model.name | Yes | String | Model name to execute. |
| model.options | No | Object | Key/value settings for the model |
| override.endpoint | No | String | Override the endpoint of the AI provider |
| timeout | No | Integer | Timeout in milliseconds for requests to LLM. Range: 1 - 60000. Default: 30000 |
| keepalive | No | Boolean | Enable keepalive for requests to LLM. Default: true |
| keepalive_timeout | No | Integer | Keepalive timeout in milliseconds for requests to LLM. Minimum: 1000. Default: 60000 |
| keepalive_pool | No | Integer | Keepalive pool size for requests to LLM. Minimum: 1. Default: 30 |
| ssl_verify | No | Boolean | SSL verification for requests to LLM. Default: true |
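To illustrate the attributes above, here is a minimal sketch of a plugin configuration that authenticates through a query parameter instead of a header and tunes the connection settings. The `api_key` parameter name is an assumption for illustration; use whatever parameter your provider actually expects:

```json
{
  "ai-proxy": {
    "auth": {
      "query": {
        "api_key": "<some-token>"
      }
    },
    "model": {
      "provider": "openai",
      "name": "gpt-4"
    },
    "timeout": 10000,
    "keepalive": true,
    "keepalive_timeout": 60000,
    "keepalive_pool": 30,
    "ssl_verify": true
  }
}
```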
## Example usage
Create a route with the ai-proxy plugin like so:
curl "http://127.0.0.1:9180/apisix/admin/routes/1" -X PUT \
-H "X-API-KEY: ${ADMIN_API_KEY}" \
-d '{
"uri": "/anything",
"plugins": {
"ai-proxy": {
"auth": {
"header": {
"Authorization": "Bearer <some-token>"
}
},
"model": {
"provider": "openai",
"name": "gpt-4",
"options": {
"max_tokens": 512,
"temperature": 1.0
}
}
}
},
"upstream": {
"type": "roundrobin",
"nodes": {
"somerandom.com:443": 1
},
"scheme": "https",
"pass_host": "node"
}
}'
The upstream node can be any arbitrary value because it will not be contacted; the plugin proxies requests to the LLM provider instead.
Now send a request:
```shell
curl http://127.0.0.1:9080/anything -i -XPOST -H 'Content-Type: application/json' -d '{
  "messages": [
    { "role": "system", "content": "You are a mathematician" },
    { "role": "user", "content": "What is 1+1?" }
  ]
}'
```
You will receive a response like this:
```json
{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "The sum of \\(1 + 1\\) is \\(2\\).",
        "role": "assistant"
      }
    }
  ],
  "created": 1723777034,
  "id": "chatcmpl-9whRKFodKl5sGhOgHIjWltdeB8sr7",
  "model": "gpt-4o-2024-05-13",
  "object": "chat.completion",
  "system_fingerprint": "fp_abc28019ad",
  "usage": { "completion_tokens": 15, "prompt_tokens": 23, "total_tokens": 38 }
}
```
## Send request to an OpenAI-compatible LLM
Create a route with the ai-proxy plugin, setting the provider to openai-compatible and configuring the model's endpoint in override.endpoint, like so:
curl "http://127.0.0.1:9180/apisix/admin/routes/1" -X PUT \
-H "X-API-KEY: ${ADMIN_API_KEY}" \
-d '{
"uri": "/anything",
"plugins": {
"ai-proxy": {
"auth": {
"header": {
"Authorization": "Bearer <some-token>"
}
},
"model": {
"provider": "openai-compatible",
"name": "qwen-plus"
},
"override": {
"endpoint": "https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions"
}
}
},
"upstream": {
"type": "roundrobin",
"nodes": {
"somerandom.com:443": 1
},
"scheme": "https",
"pass_host": "node"
}
}'
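Then send a request in the same standard format as before; ai-proxy forwards it to the endpoint configured in override.endpoint:

```shell
curl http://127.0.0.1:9080/anything -i -XPOST -H 'Content-Type: application/json' -d '{
  "messages": [
    { "role": "system", "content": "You are a mathematician" },
    { "role": "user", "content": "What is 1+1?" }
  ]
}'
```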