1. Introduction

Purpose of the document

Overview of the ChattyAI system

2. Token consumption cost in a GPT system

In the context of the GPT (Generative Pretrained Transformer) model, a "token" refers to the smallest unit of text that the system can process.

When processing language, GPT breaks down text into chunks called tokens. These tokens can be as small as one character or as large as one word. For example, in English, the sentence "ChatGPT is great!" might be broken down into ["ChatGPT", "is", "great", "!"]. Each of these is a token.

The system then uses these tokens to understand the context and generate responses. Tokens are transformed and processed through the model's layers, with each layer learning different levels of language abstraction (e.g., syntax, semantics).

It's essential to note that for languages like Chinese, Korean, and Japanese, tokenization can be quite complex, since words are not separated by spaces as they are in English. For this reason, more sophisticated subword tokenization techniques, such as SentencePiece or Byte-Pair Encoding (BPE), are used.
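To make the BPE idea concrete, here is a minimal sketch of one core step of the algorithm: repeatedly find the most frequent adjacent pair of tokens and merge it into a single token. This is an illustrative toy (the helper names and the sample string are our own, not part of any production tokenizer), but it shows how frequent substrings like "low" become single tokens.

```python
from collections import Counter

def most_frequent_pair(tokens):
    """Count adjacent token pairs and return the most frequent one."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return max(pairs, key=pairs.get)

def merge_pair(tokens, pair):
    """Replace every occurrence of `pair` with a single merged token."""
    merged, i = [], 0
    while i < len(tokens):
        if i < len(tokens) - 1 and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

# Start from individual characters and apply a few BPE merge steps.
tokens = list("low lower lowest")
for _ in range(3):
    tokens = merge_pair(tokens, most_frequent_pair(tokens))
print(tokens)
```

After a few merges, the frequent substring "low" has been fused into a single token, which is exactly how real BPE vocabularies end up with whole words (or word pieces) as tokens.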
model                     input price          output price
gpt-3.5-turbo-16k-0613    $0.003 / 1K tokens   $0.004 / 1K tokens
gpt-4                     $0.03 / 1K tokens    $0.06 / 1K tokens
Whisper-1                 $0.006 / minute      n/a (billed per minute)
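A small sketch of how a chat request's cost follows from the table above: price the prompt (input) and completion (output) tokens separately, each per 1K tokens. The helper name `chat_cost` is our own; the prices are the ones listed in the table.

```python
# USD per 1K tokens, taken from the pricing table above.
# (Whisper-1 is billed per audio minute, so it is not modeled here.)
PRICES = {
    "gpt-3.5-turbo-16k-0613": {"input": 0.003, "output": 0.004},
    "gpt-4": {"input": 0.03, "output": 0.06},
}

def chat_cost(model, input_tokens, output_tokens):
    """Cost in USD of one chat request for the given token counts."""
    p = PRICES[model]
    return (input_tokens / 1000) * p["input"] + (output_tokens / 1000) * p["output"]

# e.g. a gpt-4 call with a 500-token prompt and a 200-token completion
print(f"${chat_cost('gpt-4', 500, 200):.4f}")  # $0.0270
```

Note that gpt-4 is ten times more expensive per token than gpt-3.5-turbo-16k, which is why the model split in the message counts below matters for the bill.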
// messages per visitor
10875 / 2.3k = 4.7283

// cost per visitor
$150 / 2.3k = $0.06522

// real chat requests received: 5408
gpt-3.5 assistant messages = 1926
gpt-3.5 user messages = 1920

gpt-4 assistant messages = 3507
gpt-4 user messages = 3485

// cost per message
$150 / 5408 = $0.02774
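The per-visitor and per-message figures above can be reproduced directly; this snippet just restates the arithmetic with the numbers from this section (variable names are ours).

```python
# Figures taken from this section.
total_messages = 10875
visitors = 2300            # "2.3k"
monthly_cost_usd = 150
chat_requests = 5408       # real chat requests received

messages_per_visitor = total_messages / visitors
cost_per_visitor = monthly_cost_usd / visitors
cost_per_message = monthly_cost_usd / chat_requests

print(round(messages_per_visitor, 4))   # 4.7283
print(round(cost_per_visitor, 5))       # 0.06522
print(round(cost_per_message, 5))       # 0.02774
```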
