
Set the API Key

export LITELLM_API_KEY=sk-1234
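
This export only lasts for the current shell session. A minimal sketch for making it persistent, assuming Bash with ~/.bashrc as the startup file:

echo 'export LITELLM_API_KEY=sk-1234' >> ~/.bashrc
source ~/.bashrc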

Service Ports

  • Ollama: 11434
  • LiteLLM: 4000
  • XInference: 9997
  • MindIE: 1025
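
A quick reachability check for all four services (a sketch, assuming they run locally on the ports above; MindIE may not expose /v1/models, so treat the status code only as a "port is answering" signal):

for port in 11434 4000 9997 1025; do
    # Print the HTTP status code returned by each service's /v1/models endpoint.
    printf '%-5s -> ' "$port"
    curl -s -o /dev/null -w '%{http_code}\n' \
        -H "Authorization: Bearer $LITELLM_API_KEY" \
        "http://localhost:$port/v1/models"
done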

Models

Ollama

curl -s http://localhost:11434/v1/models \
    | jq -r '.data[].id'
  • curl -s: the -s flag enables silent mode, suppressing curl's progress output.
  • jq -r: the -r flag prints raw output, stripping the surrounding JSON quotes (compare the two forms shown after this list).
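
For comparison, the same pipeline with and without -r (the model name in the comments is just a placeholder):

curl -s http://localhost:11434/v1/models | jq '.data[].id'     # e.g. "bge-m3" (quoted JSON string)
curl -s http://localhost:11434/v1/models | jq -r '.data[].id'  # e.g. bge-m3   (raw string)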

LiteLLM

curl -s http://localhost:4000/v1/models \
    -H "Authorization: Bearer $LITELLM_API_KEY" \
    | jq -r '.data[].id'

In Bash, single quotes and double quotes behave differently in some important ways:

  • Single quotes (')
    • Fully literal: everything inside single quotes is treated as a literal value; no characters are expanded or interpreted.
    • No variable expansion: variables are not resolved inside single quotes. For example, '$LITELLM_API_KEY' is treated as the literal string $LITELLM_API_KEY, not the variable's value.
        echo '$LITELLM_API_KEY'  # prints: $LITELLM_API_KEY

  • Double quotes (")
    • Expansion allowed: the contents of double quotes are interpreted and expanded.
    • Variable expansion: variables are resolved inside double quotes. For example, "$LITELLM_API_KEY" is replaced by the variable's value.
    • Special characters: characters that stay special inside double quotes (such as $, `, and \) must be escaped with a backslash to appear literally.
        echo "$LITELLM_API_KEY"  # prints: the actual value of the variable (e.g. sk-1234)

This is why the Authorization header above is double-quoted (see the sketch after this list).
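
A minimal sketch of the two quoting behaviours against the LiteLLM /v1/models endpoint; the single-quoted form is included only to show the failure mode:

# Double quotes: the shell expands the variable, so the real key is sent.
curl -s http://localhost:4000/v1/models \
    -H "Authorization: Bearer $LITELLM_API_KEY"

# Single quotes: curl sends the literal text $LITELLM_API_KEY and authentication fails.
curl -s http://localhost:4000/v1/models \
    -H 'Authorization: Bearer $LITELLM_API_KEY'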

XInference

curl -s http://localhost:9997/v1/models \
    | jq -r '.data[].id'

Completions

Ollama

curl 'http://localhost:11434/v1/completions' \
    -H "Content-Type: application/json" \
    -d '{
        "model": "gpt-3.5-turbo",
        "prompt": "你是谁?"
    }'

LiteLLM

curl 'http://localhost:4000/v1/completions' \
    -H "Authorization: Bearer $LITELLM_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
        "model": "gpt-3.5-turbo",
        "prompt": "你是谁?"
    }'

XInference

curl 'http://localhost:9997/v1/completions' \
    -H "Content-Type: application/json" \
    -d '{
        "model": "gpt-3.5-turbo",
        "prompt": "你是谁?"
    }'
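
All three services return an OpenAI-style completions response, so the generated text can be extracted the same way with jq. A sketch against the Ollama endpoint, assuming the standard choices[0].text field:

curl -s 'http://localhost:11434/v1/completions' \
    -H "Content-Type: application/json" \
    -d '{
        "model": "gpt-3.5-turbo",
        "prompt": "你是谁?"
    }' \
    | jq -r '.choices[0].text'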

Chat Completions

Ollama

curl 'http://localhost:11434/v1/chat/completions' \
    -H "Content-Type: application/json" \
    -d '{
        "model": "gpt-3.5-turbo",
        "messages": [ 
            { "role": "system", "content": "你是位人工智能专家。" }, 
            { "role": "user", "content": "解释人工智能" } 
        ]
    }'

LiteLLM

curl 'http://localhost:4000/v1/chat/completions' \
    -H "Authorization: Bearer $LITELLM_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
        "model": "gpt-3.5-turbo",
        "messages": [ 
            { "role": "system", "content": "你是位人工智能专家。" }, 
            { "role": "user", "content": "解释人工智能" } 
        ]
    }'

XInference

curl 'http://localhost:9997/v1/chat/completions' \
    -H "Content-Type: application/json" \
    -d '{
        "model": "gpt-3.5-turbo",
        "messages": [ 
            { "role": "system", "content": "你是位人工智能专家。" }, 
            { "role": "user", "content": "解释人工智能" } 
        ]
    }'

MindIE

curl 'http://localhost:1025/v1/chat/completions' \
    -H "Content-Type: application/json" \
    -d '{
        "model": "qwen",
        "messages": [ 
            { "role": "system", "content": "你是位人工智能专家。" }, 
            { "role": "user", "content": "解释人工智能" } 
        ]
    }'
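
For chat completions the reply lives under choices[0].message.content rather than choices[0].text. A sketch against the Ollama endpoint:

curl -s 'http://localhost:11434/v1/chat/completions' \
    -H "Content-Type: application/json" \
    -d '{
        "model": "gpt-3.5-turbo",
        "messages": [
            { "role": "user", "content": "解释人工智能" }
        ]
    }' \
    | jq -r '.choices[0].message.content'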

Embeddings

Ollama

curl http://localhost:11434/v1/embeddings \
    -H "Content-Type: application/json" \
    -d '{
        "model": "bge-m3",
        "input": "Embedding text..."
    }'

LiteLLM

curl http://localhost:4000/v1/embeddings \
    -H "Authorization: Bearer $LITELLM_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
        "model": "bge-m3",
        "input": "Embedding text..."
    }'

XInference

curl http://localhost:9997/v1/embeddings \
    -H "Content-Type: application/json" \
    -d '{
        "model": "bge-m3",
        "input": "Embedding text..."
    }'
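
The embeddings response carries the vector under data[0].embedding; a quick sanity check is to print its dimensionality. A sketch against the Ollama endpoint, assuming bge-m3 is loaded:

curl -s http://localhost:11434/v1/embeddings \
    -H "Content-Type: application/json" \
    -d '{
        "model": "bge-m3",
        "input": "Embedding text..."
    }' \
    | jq '.data[0].embedding | length'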
