POST /v1/chat/completions

The chat completions endpoint is the primary way to interact with AI models through the gateway. You send a conversation (a list of messages, each with a role) and the model returns its reply. The endpoint is fully compatible with the OpenAI chat completions API, so code written for OpenAI works here by changing only the base URL.

Endpoint

POST https://napi.origintask.cn/v1/chat/completions

Request parameters

model

string

required

The ID of the model to use for this request. Use the GET /v1/models endpoint to retrieve the list of available model IDs. Example: "gpt-4o-mini"

messages

object[]

required

An array of message objects representing the conversation history. Messages are processed in order.

Message properties

role

string

required

The role of the message author. One of "system", "user", or "assistant".

content

string

required

The text content of the message.
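As a rough illustration of the message shape described above, a conversation history can be assembled with a small helper. This is only a sketch: `make_message` and `VALID_ROLES` are hypothetical names introduced here, and the client-side validation is a convenience, not something the gateway requires.

```python
# Roles documented for this endpoint.
VALID_ROLES = {"system", "user", "assistant"}


def make_message(role: str, content: str) -> dict:
    """Return a message object, rejecting roles the endpoint does not document."""
    if role not in VALID_ROLES:
        raise ValueError(f"unsupported role: {role!r}")
    if not isinstance(content, str):
        raise TypeError("content must be a string")
    return {"role": role, "content": content}


# Messages are processed in order, so the system prompt goes first.
messages = [
    make_message("system", "You are a helpful assistant."),
    make_message("user", "What is the capital of France?"),
]
```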

stream

boolean

Default: false

When true, the response is sent as a stream of server-sent events (SSE) rather than a single JSON object. Each event contains a partial completion delta. The stream ends with a data: [DONE] event.

temperature

number

Default: 1

Sampling temperature between 0 and 2. Lower values produce more deterministic output; higher values produce more varied output. Values above 1.5 can lead to incoherent responses.

max_tokens

integer

The maximum number of tokens to generate in the completion. When omitted, the model uses its default context limit. Setting this lower helps control costs.

top_p

number

Default: 1

Nucleus sampling parameter between 0 and 1. The model considers only the tokens comprising the top top_p probability mass. Use either temperature or top_p, not both.
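The parameter rules above can be checked on the client before a request is sent. The following is a minimal sketch, assuming you want to fail early rather than rely on the gateway's own validation; `build_payload` is a hypothetical helper, and treating "temperature and top_p together" as an error is a design choice made here, not documented gateway behavior.

```python
def build_payload(model, messages, *, temperature=None, top_p=None,
                  max_tokens=None, stream=False) -> dict:
    """Assemble a request body, enforcing the documented parameter ranges."""
    if temperature is not None and top_p is not None:
        raise ValueError("set either temperature or top_p, not both")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if top_p is not None and not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1")

    payload = {"model": model, "messages": messages}
    # Omit optional fields that were not set so the documented defaults apply.
    optional = {"temperature": temperature, "top_p": top_p, "max_tokens": max_tokens}
    payload.update({k: v for k, v in optional.items() if v is not None})
    if stream:
        payload["stream"] = True
    return payload


payload = build_payload(
    "gpt-4o-mini",
    [{"role": "user", "content": "Tell me a short joke."}],
    temperature=0.2,
    max_tokens=50,
)
```

Leaving unset fields out of the body keeps the request small and lets the gateway apply its own defaults.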

Response fields

id

string

A unique identifier for this completion, prefixed with chatcmpl-.

object

string

Always "chat.completion" for non-streaming responses.

created

integer

Unix timestamp (seconds) of when the completion was created.

model

string

The model ID that was used to generate this completion.

choices

object[]

An array of completion choices. Most requests return a single choice at index 0.

Choice properties

index

integer

Zero-based index of this choice in the array.

message

object

The assistant’s reply message.

Message properties

role

string

Always "assistant".

content

string

The text of the completion.

finish_reason

string

Why the model stopped generating. One of:

"stop" — the model reached a natural stopping point.
"length" — the max_tokens limit was reached.
"content_filter" — the output was blocked by a content filter.
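A caller will usually want to branch on finish_reason, for example to detect a reply that was cut off by max_tokens. The sketch below maps the three documented values to short status labels; `describe_finish` and the label strings are hypothetical, and the "unknown" fallback is an assumption for values not listed here.

```python
def describe_finish(choice: dict) -> str:
    """Map a choice's finish_reason to a short status label."""
    reason = choice.get("finish_reason")
    if reason == "stop":
        return "complete"    # natural stopping point
    if reason == "length":
        return "truncated"   # max_tokens limit was reached
    if reason == "content_filter":
        return "filtered"    # output blocked by a content filter
    return "unknown"         # value not covered by this reference


status = describe_finish({"index": 0, "finish_reason": "length"})
```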

usage

object

Token usage for this request.

Usage properties

prompt_tokens

integer

Number of tokens in the input messages.

completion_tokens

integer

Number of tokens in the generated completion.

total_tokens

integer

Sum of prompt_tokens and completion_tokens.

Examples

curl https://napi.origintask.cn/v1/chat/completions \
  --request POST \
  --header "Authorization: Bearer YOUR_API_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "model": "gpt-4o-mini",
    "messages": [
      { "role": "system", "content": "You are a helpful assistant." },
      { "role": "user", "content": "What is the capital of France?" }
    ]
  }'
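For callers who prefer Python to curl, the same request can be sketched with the standard library alone. `build_request` and `send` are hypothetical helper names, and the 30-second timeout is an arbitrary assumption; the URL, headers, and body mirror the curl example above.

```python
import json
import urllib.request

API_URL = "https://napi.origintask.cn/v1/chat/completions"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request from the curl example."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


payload = {
    "model": "gpt-4o-mini",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"},
    ],
}
request = build_request("YOUR_API_KEY", payload)
# With a real key in place of YOUR_API_KEY:
# reply = send(request)["choices"][0]["message"]["content"]
```

Separating request construction from sending makes the request easy to inspect or log before any network call is made.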

Streaming

Set stream: true to receive the response as a series of server-sent events. Each event carries a delta with the next chunk of text. This is useful for displaying output to users in real time.

curl https://napi.origintask.cn/v1/chat/completions \
  --request POST \
  --header "Authorization: Bearer YOUR_API_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "model": "gpt-4o-mini",
    "stream": true,
    "messages": [
      { "role": "user", "content": "Tell me a short joke." }
    ]
  }'
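On the client side, a streamed response has to be read line by line and the deltas joined. This reference does not show the chunk layout, so the sketch below assumes the OpenAI-style shape, with text under choices[].delta.content; `iter_deltas` is a hypothetical helper and the sample lines are made up for illustration.

```python
import json
from typing import Iterable, Iterator


def iter_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from SSE lines until the [DONE] sentinel."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data lines
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end of stream
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            # Assumed OpenAI-style layout: text lives in delta.content.
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Made-up sample lines standing in for a real response body.
sample = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "lo"}}]}',
    "",
    "data: [DONE]",
]
text = "".join(iter_deltas(sample))
```

In real use the lines would come from the HTTP response body, decoded to text, instead of the `sample` list.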

Example response

{
  "id": "chatcmpl-abc123xyz",
  "object": "chat.completion",
  "created": 1744416000,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 24,
    "completion_tokens": 9,
    "total_tokens": 33
  }
}
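Reading the fields out of a response like the one above might look as follows. This is a sketch under the assumption that the body has already been decoded from JSON; `extract_reply` is a hypothetical helper, and the dictionary is a trimmed copy of the example response.

```python
def extract_reply(response: dict) -> tuple:
    """Return (content, finish_reason) from the first choice."""
    choice = response["choices"][0]
    return choice["message"]["content"], choice["finish_reason"]


# Trimmed copy of the example response.
response = {
    "id": "chatcmpl-abc123xyz",
    "object": "chat.completion",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "The capital of France is Paris."},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 24, "completion_tokens": 9, "total_tokens": 33},
}

reply, reason = extract_reply(response)
usage = response["usage"]
# total_tokens is documented as the sum of the other two counts.
tokens_consistent = usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]
```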
