Every AI model request you make through 元任务 AI 网关 consumes tokens, and your account balance decreases accordingly. Understanding how token counting works helps you predict costs, read your usage data accurately, and make informed decisions when choosing between models.Documentation Index
Fetch the complete documentation index at: https://docs.metask.ai/llms.txt
Use this file to discover all available pages before exploring further.
What are tokens?
Tokens are the units AI models use to process text. A token is roughly 3–4 characters in English — about 75% of a word. As a practical approximation:- 1,000 tokens ≈ 750 words
- A short paragraph ≈ 100–200 tokens
- A detailed technical prompt ≈ 500–2,000 tokens
How the gateway meters usage
Every request has two token components:- Input tokens (also called prompt tokens): the text you send — your system prompt, conversation history, and user message.
- Output tokens (also called completion tokens): the text the model generates in response.
Reading usage from API responses
Every chat completions response includes ausage object that reports the token counts for that request:
| Field | Description |
|---|---|
usage.prompt_tokens | Tokens consumed by your input (system prompt + messages) |
usage.completion_tokens | Tokens generated in the model’s response |
usage.total_tokens | Sum of prompt and completion tokens |
total_tokens to understand the cost of a single request. Over time, tracking prompt_tokens separately helps you identify if a growing conversation history or large system prompt is driving up costs.
Checking your balance and usage
Log in to the 元任务 dashboard at napi.origintask.cn to view:- Your current account balance
- Historical token usage by model and time period
- Individual request logs with per-request token counts
Managing your balance
Top up your balance
In the dashboard, navigate to Billing and select Add Balance. Choose an amount and complete the payment. Your balance is available immediately after the transaction is confirmed.
Apply a promo code
If you have a promo code, enter it in the Promo Code field in the Billing section. Promo codes add credit directly to your balance.
Tips for managing costs
Choose the right model for the task. Smaller, faster models likegpt-4o-mini or claude-3-haiku-20240307 cost significantly less per token than frontier models. Use them for classification, extraction, summarization, and other tasks that don’t require the highest capability.
Keep system prompts concise. Your system prompt is included in prompt_tokens for every request in a session. A 2,000-token system prompt adds 2,000 tokens of cost to every single call. Trim it to what’s necessary.
Limit conversation history. In multi-turn conversations, you send the full message history with each request. Truncate or summarize older messages to prevent unbounded token growth.
Use streaming for long responses. Streaming ("stream": true) doesn’t reduce token usage, but it lets you stop generation early if the model begins producing irrelevant output, avoiding wasted completion tokens.
Set max_tokens limits. Cap the maximum response length with the max_tokens parameter to prevent unexpectedly long completions from consuming more tokens than you need.
Token counts in the
usage field reflect the actual tokens processed by the model. They may differ slightly from estimates produced by local tokenizer libraries, which are useful for budgeting but not authoritative.