OpenAI API Documentation: Chat Completion
Chat Completion
Models
- gpt-3.5-turbo
- gpt-4
Chat models can be used for many things:
- Drafting emails or other pieces of writing
- Writing Python code
- Answering questions about a set of documents
- Creating conversational agents
- Giving your software a natural language interface
- Tutoring in a range of subjects
- Translating languages
- Simulating characters for video games, and much more
API Calls
Example
import os
import openai
openai.api_key = os.getenv("OPENAI_API_KEY")
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
        {"role": "user", "content": "Where was it played?"}
    ]
)
message = response["choices"][0]["message"]["content"]
print(message)
The 2020 World Series was played at Globe Life Field in Arlington, Texas.
The main input is the messages parameter. messages must be an array of message objects, where each object has a role ("system", "user", or "assistant") and content.
Typically, a conversation is formatted with a "system" message first, followed by alternating "user" and "assistant" messages.
The "system" message helps set the behavior of the "assistant".
Response Format
{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "The 2020 World Series was played at Globe Life Field in Arlington, Texas due to the COVID-19 pandemic.",
        "role": "assistant"
      }
    }
  ],
  "created": 1682650205,
  "id": "chatcmpl-7A8TNttrmsx6mCt6CVDfymOhwVaEJ",
  "model": "gpt-3.5-turbo-0301",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 24,
    "prompt_tokens": 57,
    "total_tokens": 81
  }
}
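The fields shown above can be read straight off the returned object. A minimal sketch, continuing the example call from earlier:

choice = response["choices"][0]
print(choice["finish_reason"])            # "stop" means the model finished naturally
print(choice["message"]["role"])          # "assistant"
print(choice["message"]["content"])       # the generated reply
print(response["usage"]["total_tokens"])  # prompt_tokens + completion_tokens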
Tokens
Introduction
Language models read text in chunks called tokens. In English, a token can be as short as a single character or as long as a single word (e.g., a or apple); in some languages, tokens can even be shorter than one character or longer than one word.
For example, the string "ChatGPT is great!" is encoded as six tokens: ["Chat", "G", "PT", " is", " great", "!"].
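You can reproduce this split yourself with the tiktoken library; a minimal sketch, assuming the cl100k_base encoding used by gpt-3.5-turbo and gpt-4:

# pip install tiktoken
import tiktoken

encoding = tiktoken.get_encoding("cl100k_base")  # encoding used by gpt-3.5-turbo / gpt-4
tokens = encoding.encode("ChatGPT is great!")
print(len(tokens))                             # 6
print([encoding.decode([t]) for t in tokens])  # ['Chat', 'G', 'PT', ' is', ' great', '!']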
The total number of tokens in an API call affects:
- How much your API call costs, since you pay per token
- How long your API call takes, since writing more tokens takes more time
- Whether your API call works at all, since the total token count must be below the model's maximum limit (4096 tokens for gpt-3.5-turbo-0301)

Both input and output tokens count toward these quantities.
To see how many tokens an API call used, check the usage field in the API response (e.g., response['usage']['total_tokens']).
Counting the Tokens in an API Call's Messages
Below is the counting method provided officially by OpenAI:
# pip install tiktoken
import tiktoken

def num_tokens_from_messages(messages, model="gpt-3.5-turbo-0301"):
    """Returns the number of tokens used by a list of messages."""
    try:
        encoding = tiktoken.encoding_for_model(model)
    except KeyError:
        print("Warning: model not found. Using cl100k_base encoding.")
        encoding = tiktoken.get_encoding("cl100k_base")
    if model == "gpt-3.5-turbo":
        print("Warning: gpt-3.5-turbo may change over time. Returning num tokens assuming gpt-3.5-turbo-0301.")
        return num_tokens_from_messages(messages, model="gpt-3.5-turbo-0301")
    elif model == "gpt-4":
        print("Warning: gpt-4 may change over time. Returning num tokens assuming gpt-4-0314.")
        return num_tokens_from_messages(messages, model="gpt-4-0314")
    elif model == "gpt-3.5-turbo-0301":
        tokens_per_message = 4  # every message follows <|start|>{role/name}\n{content}<|end|>\n
        tokens_per_name = -1  # if there's a name, the role is omitted
    elif model == "gpt-4-0314":
        tokens_per_message = 3
        tokens_per_name = 1
    else:
        raise NotImplementedError(f"""num_tokens_from_messages() is not implemented for model {model}. See https://github.com/openai/openai-python/blob/main/chatml.md for information on how messages are converted to tokens.""")
    num_tokens = 0
    for message in messages:
        num_tokens += tokens_per_message
        for key, value in message.items():
            num_tokens += len(encoding.encode(value))
            if key == "name":
                num_tokens += tokens_per_name
    num_tokens += 3  # every reply is primed with <|start|>assistant<|message|>
    return num_tokens
Example
messages = [
    {"role": "system", "content": "You are a helpful, pattern-following assistant that translates corporate jargon into plain English."},
    {"role": "system", "name": "example_user", "content": "New synergies will help drive top-line growth."},
    {"role": "system", "name": "example_assistant", "content": "Things working well together will increase revenue."},
    {"role": "system", "name": "example_user", "content": "Let's circle back when we have more bandwidth to touch base on opportunities for increased leverage."},
    {"role": "system", "name": "example_assistant", "content": "Let's talk later when we're less busy about how to do better."},
    {"role": "user", "content": "This late pivot means we don't have time to boil the ocean for the client deliverable."}
]
model = "gpt-3.5-turbo"
print(f"{num_tokens_from_messages(messages, model)} prompt tokens counted.")
# 127
The number of prompt tokens reported by an actual API call:
openai.ChatCompletion.create(model=model, messages=messages, temperature=0, max_tokens=1)['usage']['prompt_tokens']
# 127
A Chinese-to-English Translation Bot
class Conversation:
    def __init__(self, prompt):
        self.prompt = prompt
        self.messages = [{"role": "system", "content": self.prompt}]

    def ask(self, question):
        self.messages.append({"role": "user", "content": question})
        try:
            response = openai.ChatCompletion.create(
                model="gpt-3.5-turbo",
                messages=self.messages
            )
        except Exception as e:
            return e
        assistant_message = response["choices"][0]["message"]["content"]
        self.messages.append({"role": "assistant", "content": assistant_message})
        return assistant_message

def translate(conv, input):
    output = conv.ask(input)
    print(f"User: {input}")
    print(f"Assistant: {output}")
    print()

conv = Conversation("您是个翻译员,帮助用户把中文翻译成英文。")
translate(conv, "你好!")
translate(conv, "你在哪里?")
translate(conv, "你在干什么?")
User: 你好!
Assistant: Hello!
User: 你在哪里?
Assistant: Where are you located?
User: 你在干什么?
Assistant: What are you doing?
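Note that self.messages grows by two entries per turn, so a long conversation will eventually exceed the model's token limit (4096 tokens for gpt-3.5-turbo-0301). A minimal sketch of one way to handle this, reusing the num_tokens_from_messages helper above with a hypothetical token budget, is to drop the oldest turns while keeping the system prompt:

def trim_messages(messages, model="gpt-3.5-turbo-0301", budget=3500):
    """Drop the oldest non-system turns until the conversation fits the budget."""
    trimmed = list(messages)
    while len(trimmed) > 1 and num_tokens_from_messages(trimmed, model) > budget:
        del trimmed[1]  # index 0 is the system prompt; remove the oldest turn after it
    return trimmed

The budget of 3500 is an arbitrary choice that leaves room within the 4096-token limit for the reply.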
Instructing Chat Models
Many conversations begin with a system message that gently instructs the assistant.
Here is the system message used by the translation bot above:
{"role": "system", "content": "您是个翻译员,帮助用户把中文翻译成英文。"}
(In English: "You are a translator who helps users translate Chinese into English.")
If the model doesn't generate the output you want, feel free to iterate and experiment with potential improvements. You can try approaches like:
- Making your instructions more explicit
- Specifying the format you want the answer in (see the sketch after this list)
- Asking the model to think step by step or debate the pros and cons before settling on an answer
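For example, a more explicit variant of the translator's system message might pin down the output format (a hypothetical rewrite, not the original prompt):

messages = [
    # Explicit instructions plus a required answer format.
    {"role": "system", "content": (
        "You are a translator. Translate the user's Chinese into English. "
        "Reply with only the translation, with no explanations or extra text."
    )},
    {"role": "user", "content": "你好!"}
]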
Besides the system message, temperature and max_tokens are two of the many options developers have for influencing the chat model's output. For temperature, higher values like 0.8 make the output more random, while lower values like 0.2 make it more focused and deterministic. With max_tokens, if you want to limit a response to a certain length, you can set max_tokens to an arbitrary number. This can cause problems, however: if you set max_tokens to 5, for example, the output will be cut off mid-sentence and the result will make no sense to the user.
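A minimal sketch of passing both options (the values are illustrative, not recommendations):

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Suggest a name for a coffee shop."}],
    temperature=0.2,  # lower = more focused and deterministic
    max_tokens=50     # hard cap on the reply length, in tokens
)
print(response["choices"][0]["message"]["content"])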
Request Rate Limits
If you are not a Plus user, calls to the gpt-3.5-turbo model are limited to 3 per minute. When the limit is exceeded, the translation bot above surfaces the API's error message as its reply:
Assistant: Rate limit reached for default-gpt-3.5-turbo in organization org-cv3W9eqOn6jmTih3dYtjFUEg on requests per min. Limit: 3 / min. Please try again in 20s. Contact support@openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method.
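When the limit is hit, the openai library raises openai.error.RateLimitError, so one common workaround is to retry with exponential backoff. A minimal sketch (the delays are arbitrary):

import time

def create_with_retry(max_retries=5, **kwargs):
    """Retry a ChatCompletion call with exponential backoff on rate limits."""
    for attempt in range(max_retries):
        try:
            return openai.ChatCompletion.create(**kwargs)
        except openai.error.RateLimitError:
            if attempt == max_retries - 1:
                raise
            time.sleep(2 ** attempt)  # wait 1s, 2s, 4s, ... between attempts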