OpenAI API Documentation Chat Completion

2023-04-28

Chat Completion

Models

gpt-3.5-turbo
gpt-4

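These models can do many things:

Draft an email or other piece of writing
Write Python code
Answer questions about a set of documents
Create conversational agents
Give your software a natural language interface
Tutor in a range of subjects
Translate languages
Simulate characters for video games, and much more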

API call

Example

import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")

response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
        {"role": "user", "content": "Where was it played?"}
    ]
)

message = response["choices"][0]["message"]["content"]
print(message)

The 2020 World Series was played at Globe Life Field in Arlington, Texas.

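The main input is the messages parameter. It must be an array of message objects, where each object has a role ("system", "user", or "assistant") and content ("content").

Typically, a conversation is formatted with a "system" message first, followed by alternating "user" and "assistant" messages.

The "system" message helps set the behavior of the assistant.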
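The message layout described above can be sketched as a small helper. This is only an illustration: `build_messages` and its parameters are hypothetical names introduced here, not part of the openai library.

```python
def build_messages(system_prompt, turns, next_user_message):
    """Build a messages list: one system message, then alternating
    user/assistant turns, then the new user message.

    `turns` is a list of (user_text, assistant_text) pairs.
    Hypothetical helper, for illustration only.
    """
    messages = [{"role": "system", "content": system_prompt}]
    for user_text, assistant_text in turns:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": next_user_message})
    return messages


messages = build_messages(
    "You are a helpful assistant.",
    [("Who won the world series in 2020?",
      "The Los Angeles Dodgers won the World Series in 2020.")],
    "Where was it played?",
)
for m in messages:
    print(m["role"], "->", m["content"])
```

The resulting list has the same shape as the `messages` argument in the example call above.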

Response format

{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "The 2020 World Series was played at Globe Life Field in Arlington, Texas due to the COVID-19 pandemic.",
        "role": "assistant"
      }
    }
  ],
  "created": 1682650205,
  "id": "chatcmpl-7A8TNttrmsx6mCt6CVDfymOhwVaEJ",
  "model": "gpt-3.5-turbo-0301",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 24,
    "prompt_tokens": 57,
    "total_tokens": 81
  }
}
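As a rough sketch of how the fields above might be read in practice, the snippet below parses a trimmed copy of that response and pulls out the reply text, the finish reason, and the token usage. `summarize_response` is a hypothetical helper name, and the sample content is shortened for brevity.

```python
import json

# Trimmed copy of the response shown above, used as sample data.
sample = json.loads("""
{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {"content": "The 2020 World Series was played at Globe Life Field.",
                  "role": "assistant"}
    }
  ],
  "model": "gpt-3.5-turbo-0301",
  "usage": {"completion_tokens": 24, "prompt_tokens": 57, "total_tokens": 81}
}
""")


def summarize_response(response):
    """Extract the fields most callers need from a chat completion response."""
    choice = response["choices"][0]  # only the first choice is inspected
    return {
        "content": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "total_tokens": response["usage"]["total_tokens"],
    }


summary = summarize_response(sample)
print(summary["finish_reason"], summary["total_tokens"])
```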

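Tokens

Introduction

Language models read text in chunks called tokens. In English, a token can be as short as one character or as long as one word (for example, a or apple). In some languages, a token can be even shorter than one character or longer than one word.

For example, the string "ChatGPT is great!" is encoded into six tokens: ["Chat", "G", "PT", " is", " great", "!"].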

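The total number of tokens in an API call affects:

How much your API call costs, since you pay per token
How long your API call takes, since writing more tokens takes more time
Whether your API call works at all, since the total number of tokens must be below the model's maximum limit (4096 tokens for gpt-3.5-turbo-0301)

Both input and output tokens count toward these quantities.

To see how many tokens an API call used, check the usage field in the API response (for example, response['usage']['total_tokens']).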
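The arithmetic behind these points can be sketched as follows. The helper names are hypothetical, and the per-token price is a placeholder supplied by the caller, not a quoted rate; only the 4096-token limit comes from the text above.

```python
MAX_CONTEXT_TOKENS = 4096  # limit quoted above for gpt-3.5-turbo-0301


def remaining_tokens(prompt_tokens, limit=MAX_CONTEXT_TOKENS):
    """Tokens left for the completion once the prompt has been counted."""
    return max(limit - prompt_tokens, 0)


def estimate_cost(total_tokens, price_per_1k_tokens):
    """Cost of one call, given a per-1K-token price supplied by the caller."""
    return total_tokens / 1000 * price_per_1k_tokens


# Usage numbers taken from the sample response earlier in this post.
usage = {"completion_tokens": 24, "prompt_tokens": 57, "total_tokens": 81}

print(remaining_tokens(usage["prompt_tokens"]))
# ASSUMPTION: 0.002 is a placeholder price per 1K tokens; check current pricing.
print(estimate_cost(usage["total_tokens"], price_per_1k_tokens=0.002))
```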

Counting tokens for chat API calls

The following is the counting method provided by OpenAI:

# pip install tiktoken
import tiktoken

def num_tokens_from_messages(messages, model="gpt-3.5-turbo-0301"):
    """Returns the number of tokens used by a list of messages."""
    try:
        encoding = tiktoken.encoding_for_model(model)
    except KeyError:
        print("Warning: model not found. Using cl100k_base encoding.")
        encoding = tiktoken.get_encoding("cl100k_base")
    if model == "gpt-3.5-turbo":
        print("Warning: gpt-3.5-turbo may change over time. Returning num tokens assuming gpt-3.5-turbo-0301.")
        return num_tokens_from_messages(messages, model="gpt-3.5-turbo-0301")
    elif model == "gpt-4":
        print("Warning: gpt-4 may change over time. Returning num tokens assuming gpt-4-0314.")
        return num_tokens_from_messages(messages, model="gpt-4-0314")
    elif model == "gpt-3.5-turbo-0301":
        tokens_per_message = 4  # every message follows <|start|>{role/name}\n{content}<|end|>\n
        tokens_per_name = -1  # if there's a name, the role is omitted
    elif model == "gpt-4-0314":
        tokens_per_message = 3
        tokens_per_name = 1
    else:
        raise NotImplementedError(f"""num_tokens_from_messages() is not implemented for model {model}. See https://github.com/openai/openai-python/blob/main/chatml.md for information on how messages are converted to tokens.""")
    num_tokens = 0
    for message in messages:
        num_tokens += tokens_per_message
        for key, value in message.items():
            num_tokens += len(encoding.encode(value))
            if key == "name":
                num_tokens += tokens_per_name
    num_tokens += 3  # every reply is primed with <|start|>assistant<|message|>
    return num_tokens

Example

messages = [
  {"role": "system", "content": "You are a helpful, pattern-following assistant that translates corporate jargon into plain English."},
  {"role": "system", "name":"example_user", "content": "New synergies will help drive top-line growth."},
  {"role": "system", "name": "example_assistant", "content": "Things working well together will increase revenue."},
  {"role": "system", "name":"example_user", "content": "Let's circle back when we have more bandwidth to touch base on opportunities for increased leverage."},
  {"role": "system", "name": "example_assistant", "content": "Let's talk later when we're less busy about how to do better."},
  {"role": "user", "content": "This late pivot means we don't have time to boil the ocean for the client deliverable."}
]

model = "gpt-3.5-turbo"
print(f"{num_tokens_from_messages(messages, model)} prompt tokens counted.")
# 127

The prompt token count returned by the API:

openai.ChatCompletion.create(model=model, messages=messages, temperature=0, max_tokens=1)['usage']['prompt_tokens']
# 127

A Chinese-to-English translation bot

class Conversation:
    def __init__(self, prompt):
        self.prompt = prompt
        self.messages = [{"role": "system", "content": self.prompt}]
    
    def ask(self, question):
        self.messages.append({"role": "user", "content": question})

        try:
            response = openai.ChatCompletion.create(
                model="gpt-3.5-turbo",
                messages=self.messages
            )
        except Exception as e:
            # Return the error text so the caller can display it (e.g. a rate-limit message).
            return str(e)
        
        assistant_message = response["choices"][0]["message"]["content"]
        self.messages.append({"role": "assistant", "content": assistant_message})

        return assistant_message

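def translate(conv, text):
    # Send one sentence through the conversation and print both sides.
    output = conv.ask(text)
    print(f"User: {text}")
    print(f"Assistant: {output}")
    print()

# System prompt (in Chinese): "You are a translator who helps the user translate Chinese into English."
conv = Conversation("您是个翻译员，帮助用户把中文翻译成英文。")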

translate(conv, "你好！")
translate(conv, "你在哪里？")
translate(conv, "你在干什么？")

User: 你好！
Assistant: Hello!

User: 你在哪里？
Assistant: Where are you located?

User: 你在干什么？
Assistant: What are you doing?
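Because the conversation keeps appending messages, a long session will eventually approach the model's token limit. One possible way to handle this, sketched below, is to drop the oldest non-system messages before each call. `trim_history` is a hypothetical helper, it assumes the first message is the system message, and `rough_count` is a stand-in counter for the demo; in real use a function such as num_tokens_from_messages could be passed instead.

```python
def trim_history(messages, max_tokens, count_tokens):
    """Drop the oldest non-system messages until the count fits.

    `count_tokens` is any callable that takes a messages list and returns
    an int. The first message (assumed to be the system message) is kept.
    Returns a new list; the input list is not modified.
    """
    trimmed = list(messages)
    while len(trimmed) > 1 and count_tokens(trimmed) > max_tokens:
        del trimmed[1]  # index 0 is the system message
    return trimmed


def rough_count(messages):
    # Stand-in counter for this demo only: one "token" per character.
    return sum(len(m["content"]) for m in messages)


history = [
    {"role": "system", "content": "sys"},
    {"role": "user", "content": "aaaa"},
    {"role": "assistant", "content": "bbbb"},
    {"role": "user", "content": "cccc"},
]
short_history = trim_history(history, max_tokens=8, count_tokens=rough_count)
print(short_history)
```

Removing one message at a time is the simplest policy; dropping user/assistant pairs together would keep the turns aligned, at the cost of a little more code.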

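Instructing chat models

Many conversations begin with a system message that gently instructs the assistant.

For example, this is the system message used by the translation bot above (in English: "You are a translator who helps the user translate Chinese into English."):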

{"role": "system", "content": "您是个翻译员，帮助用户把中文翻译成英文。"}

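If the model isn't generating the output you want, feel free to iterate and experiment with potential improvements. You can try approaches such as:

Make your instruction more explicit
Specify the format you want the answer in
Ask the model to think step by step or debate pros and cons before settling on an answer

Besides the system message, temperature and max_tokens are two of the many options developers have for influencing the output of chat models. For temperature, higher values (such as 0.8) make the output more random, while lower values (such as 0.2) make it more focused and deterministic. With max_tokens, you can cap the response at a specific length by setting it to any number. This can cause problems: if you set max_tokens to 5, for example, the output will be cut off and the result will not make sense to the user.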
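A minimal sketch of how these two options might be set, and how a cut-off reply could be detected, is shown below. `build_request` and `was_truncated` are hypothetical helper names; the check relies on the finish_reason field from the response format shown earlier, which is "length" when generation stops at the max_tokens cap. The response here is a hand-written stand-in, so no API call is made.

```python
def build_request(messages, temperature=0.2, max_tokens=256, model="gpt-3.5-turbo"):
    """Collect keyword arguments for a chat completion call."""
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    return {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens,
    }


def was_truncated(response):
    """True if generation stopped because the max_tokens cap was reached."""
    return response["choices"][0]["finish_reason"] == "length"


params = build_request(
    [{"role": "user", "content": "Where was it played?"}],
    temperature=0.2,
    max_tokens=5,
)
# With the library used in this post, the call would look like:
#   response = openai.ChatCompletion.create(**params)

# Hand-written stand-in for a reply that was cut off at 5 tokens.
fake_response = {
    "choices": [{
        "finish_reason": "length",
        "index": 0,
        "message": {"role": "assistant", "content": "The 2020 World Series"},
    }]
}
print(was_truncated(fake_response))
```

Checking finish_reason lets the caller retry with a larger max_tokens instead of showing a half-finished sentence.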

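Request rate limits

At the time of writing, an account without a payment method is limited to 3 requests per minute on the gpt-3.5-turbo model. When the limit is exceeded, the translation bot returns an error like the following: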

Assistant: Rate limit reached for default-gpt-3.5-turbo in organization org-cv3W9eqOn6jmTih3dYtjFUEg on requests per min. Limit: 3 / min. Please try again in 20s. Contact support@openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method.
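One common way to cope with such errors is to wait and retry. The sketch below is a generic retry wrapper under stated assumptions: `call_with_retry` and all of its parameters are hypothetical names, and the 20-second base delay simply mirrors the "try again in 20s" hint in the message above. With the library version used in this post, `retry_on` would likely be `openai.error.RateLimitError`; note that the wrapper would need to wrap the raw API call, since Conversation.ask catches exceptions itself.

```python
import time


def call_with_retry(fn, retries=3, base_delay=20.0, retry_on=Exception, sleep=time.sleep):
    """Call fn(); on a retryable error, wait and try again.

    The delay doubles after each failed attempt (20s, 40s, 80s, ...).
    After `retries` failed retries the last error is re-raised.
    """
    for attempt in range(retries + 1):
        try:
            return fn()
        except retry_on:
            if attempt == retries:
                raise
            sleep(base_delay * 2 ** attempt)


# Demo with a fake call that fails twice and then succeeds.
attempts = {"count": 0}
delays = []


def flaky_call():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("Rate limit reached")
    return "ok"


# Record the delays instead of actually sleeping, so the demo runs instantly.
result = call_with_retry(flaky_call, retry_on=RuntimeError, sleep=delays.append)
print(result, delays)
```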