Enterprise-grade AI model API load-balancing service, providing high availability, intelligent routing, and real-time monitoring.
Chat with AI models, with support for multi-turn conversations, streaming responses, and rich parameter configuration.
import httpx
import json
# Configuration
API_BASE = "http://localhost:3811"
API_KEY = "your-api-key-here"
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {API_KEY}"
}
# Non-streaming request
payload = {
    "model": "deepseek-ai/DeepSeek-R1",
    "messages": [
        {"role": "system", "content": "You are a helpful AI assistant."},
        {"role": "user", "content": "Hello, please introduce yourself."}
    ],
    "temperature": 0.7,
    "max_tokens": 1000,
    "stream": False
}
response = httpx.post(
    f"{API_BASE}/v1/chat/completions",
    headers=headers,
    json=payload
)
result = response.json()
print(result['choices'][0]['message']['content'])
# Streaming request
payload["stream"] = True
with httpx.stream(
    "POST",
    f"{API_BASE}/v1/chat/completions",
    headers=headers,
    json=payload,
    timeout=None  # disable the default read timeout for long-lived streams
) as response:
    # The stream is delivered as server-sent events, one "data: ..." line per chunk
    for chunk in response.iter_lines():
        if chunk.strip():
            if chunk.startswith("data: "):
                data = chunk[6:]  # strip the "data: " prefix
                if data.strip() != "[DONE]":
                    try:
                        json_data = json.loads(data)
                        content = json_data['choices'][0]['delta'].get('content', '')
                        if content:
                            print(content, end='', flush=True)
                    except json.JSONDecodeError:
                        continue
# Non-streaming request
curl -X POST http://localhost:3811/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-api-key-here" \
  -d '{
    "model": "deepseek-ai/DeepSeek-R1",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ],
    "temperature": 0.7,
    "max_tokens": 1000
  }'
# Streaming request
curl -X POST http://localhost:3811/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-api-key-here" \
  -d '{
    "model": "deepseek-ai/DeepSeek-R1",
    "messages": [
      {"role": "user", "content": "Please give a detailed overview of the history of artificial intelligence."}
    ],
    "stream": true
  }' \
  --no-buffer
Retrieve the list of currently available AI models, including each model's ID, status, and basic information.
import httpx
headers = {"Authorization": f"Bearer {API_KEY}"}
response = httpx.get(f"{API_BASE}/v1/models", headers=headers)
models = response.json()
print("可用模型列表:")
print("-" * 50)
for model in models["data"]:
print(f"模型ID: {model['id']}")
print(f"创建时间: {model.get('created', 'N/A')}")
print(f"拥有者: {model.get('owned_by', 'N/A')}")
print("-" * 50)
curl -H "Authorization: Bearer your-api-key-here" \
http://localhost:3811/v1/models | jq .
The traditional text completion API, suitable for simple text generation and completion tasks.
payload = {
    "model": "deepseek-ai/DeepSeek-R1",
    "prompt": "The future trends in artificial intelligence are",
    "max_tokens": 200,
    "temperature": 0.7,
    "top_p": 0.9,
    "stream": False
}
response = httpx.post(
    f"{API_BASE}/v1/completions",
    headers=headers,
    json=payload
)
result = response.json()
print(result['choices'][0]['text'])
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Required | Identifier of the AI model to use, e.g. "deepseek-ai/DeepSeek-R1" |
| messages | array | Required | Array of conversation messages; each message contains a role and content |
| temperature | number | Optional | Controls response randomness, range 0-2, default 0.7. Higher values produce more random output |
| max_tokens | integer | Optional | Maximum number of tokens in the generated response; controls output length |
| stream | boolean | Optional | Whether to enable streaming responses, default false. When enabled, generated content is received in real time |
| top_p | number | Optional | Nucleus sampling parameter, range 0-1. Works together with temperature to control output randomness |
| frequency_penalty | number | Optional | Frequency penalty, range -2 to 2. Positive values reduce repeated tokens |
| presence_penalty | number | Optional | Presence penalty, range -2 to 2. Positive values encourage the model to discuss new topics |
Use the GET /v1/models endpoint to retrieve the complete, real-time list of available models and their details.
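To tie the parameter table and the note above together, here is a minimal sketch that first asks /v1/models for an available model id and then sends a chat request exercising the optional sampling parameters. It assumes the response shapes shown earlier in this document; the prompt text and parameter values are arbitrary examples, not recommendations.

import httpx

API_BASE = "http://localhost:3811"
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer your-api-key-here",
}

# Look up an available model id instead of hard-coding one,
# as suggested by the GET /v1/models note above.
models = httpx.get(f"{API_BASE}/v1/models", headers=headers).json()
model_id = models["data"][0]["id"]

# Arbitrary example values for the optional parameters in the table above.
payload = {
    "model": model_id,
    "messages": [
        {"role": "user", "content": "Suggest three names for a robotics startup."}
    ],
    "temperature": 0.9,        # more random than the 0.7 default
    "top_p": 0.8,              # nucleus sampling: keep the top 80% of probability mass
    "max_tokens": 256,         # cap the output length
    "frequency_penalty": 0.5,  # positive value discourages repeated tokens
    "presence_penalty": 0.3,   # positive value nudges the model toward new topics
    "stream": False,
}

response = httpx.post(f"{API_BASE}/v1/chat/completions", headers=headers, json=payload)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])

Lowering temperature and top_p makes the output more deterministic; the two penalties mainly matter for longer generations.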
| HTTP Status | Error Type | Description | Resolution |
|---|---|---|---|
| 401 | Unauthorized | API key is invalid, expired, or malformed | Check that the Authorization header uses the format "Bearer your-api-key" |
| 429 | Rate Limited | Request rate exceeds the limit | Reduce request frequency and retry with an exponential backoff strategy |
| 400 | Bad Request | Malformed request body or missing required parameters | Check the request body format and parameter validity |
| 502 | Bad Gateway | The upstream AI service is temporarily unavailable | Retry later or try a different model |
| 503 | Service Unavailable | The service is under maintenance or overloaded | Wait for the service to recover or contact an administrator |
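The 429, 502, and 503 rows above all recommend retrying. The sketch below shows one way to implement exponential backoff around the chat endpoint using httpx; the helper name post_with_backoff, the retry count, base delay, and jitter are arbitrary choices for illustration, not values prescribed by the service.

import random
import time

import httpx

API_BASE = "http://localhost:3811"
headers = {"Authorization": "Bearer your-api-key-here"}
RETRYABLE = {429, 502, 503}  # status codes the table marks as retryable

def post_with_backoff(payload, max_retries=5, base_delay=1.0):
    """POST to /v1/chat/completions, retrying retryable errors with exponential backoff."""
    for attempt in range(max_retries + 1):
        response = httpx.post(f"{API_BASE}/v1/chat/completions", headers=headers, json=payload)
        if response.status_code not in RETRYABLE:
            response.raise_for_status()  # surface 400/401 and other non-retryable errors
            return response.json()
        if attempt == max_retries:
            response.raise_for_status()  # out of retries: raise the last retryable error
        # Exponential backoff with jitter: 1s, 2s, 4s, ... plus up to 1s of noise.
        delay = base_delay * (2 ** attempt) + random.uniform(0, 1)
        time.sleep(delay)

result = post_with_backoff({
    "model": "deepseek-ai/DeepSeek-R1",
    "messages": [{"role": "user", "content": "Hello!"}],
})
print(result["choices"][0]["message"]["content"])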