Overview
The NanoGPT Batch API is an OpenAI-compatible way to run large numbers of chat completion requests asynchronously. The first version supports/v1/chat/completions requests for supported GPT, Claude, and Gemini models, including text prompts and image inputs.
Batch jobs are best for offline workloads where latency is not important, such as classification, summarization, evals, synthetic data generation, document processing, and image analysis.
Batch jobs upload a JSONL file, create a batch, poll the batch, and download the output file after completion.
Base URL
Use the dedicated NanoGPT OpenAI-compatible API host for all Batch API requests:api.nano-gpt.com for file upload, batch creation, polling, cancellation, and output downloads. Do not use nano-gpt.com for batch uploads, because larger multipart uploads can be rejected by the website host before they reach the Batch API.
All requests require an API key:
Supported Endpoints
Files
POST /filesGET /files/{file_id}GET /files/{file_id}/content
Batches
POST /batchesGET /batches/{batch_id}GET /batchesPOST /batches/{batch_id}/cancel
Supported Batch Request Endpoint
Only this endpoint is supported inside batch JSONL rows:/v1/responses/v1/completions/v1/embeddings- Image generation, audio, video, transcription, TTS, moderation, and other non-chat endpoints
Supported Models
Batch jobs support the OpenAI/GPT chat model IDs below, plus the listed Claude and Gemini model IDs.GPT Models
Supported direct GPT/OpenAI chat model IDs include:gpt-5.4gpt-5.4-minigpt-5.4-nanogpt-5.4-progpt-5.3-chatgpt-5.3-chat-latestgpt-5.3-codexgpt-5.2gpt-5.2-chat-latestgpt-5.2-progpt-5.1gpt-5.1-chatgpt-5.1-chat-latestgpt-5.1-codexgpt-5.1-codex-maxgpt-5gpt-5-chat-latestgpt-5-minigpt-5-nanogpt-4.1gpt-4.1-minigpt-4.1-nanogpt-4ogpt-4o-minichatgpt-4o-latesto1o1-minio1-previewo3o3-minio4-mini
openai/ prefix is also accepted for supported OpenAI models. For example, openai/gpt-4.1 is normalized to gpt-4.1.
Claude Models
Supported Claude batch models:anthropic/claude-sonnet-4.5anthropic/claude-sonnet-4anthropic/claude-haiku-4.5anthropic/claude-opus-4.5anthropic/claude-opus-4.1anthropic/claude-opus-4
anthropic/claude-sonnet-4.5:thinking:lowanthropic/claude-sonnet-4.5:thinking:mediumanthropic/claude-sonnet-4.5:thinking:highanthropic/claude-sonnet-4.5:thinking:16000
max_tokens.
Gemini Models
Supported Gemini batch models:google/gemini-3.5-flashgoogle/gemini-3.1-pro-previewgoogle/gemini-3-flash-previewgoogle/gemini-3.1-flash-litegoogle/gemini-3.1-flash-lite-previewgoogle/gemini-2.5-progoogle/gemini-2.5-flashgoogle/gemini-2.5-flash-lite
Request Rules
Each batch file must follow these rules:- File format must be JSONL: one JSON object per line.
- File upload must use
purpose=batch. - Every row must include a unique non-empty
custom_id. - Every row must use
method: "POST". - Every row must use
url: "/v1/chat/completions". - Every row must include a
bodyobject. - Every row must include
body.messages. - Every row must include
body.model. - Every row must include
body.max_tokensorbody.max_completion_tokens. - All rows in one file must use the same model.
- All rows in one file must use the same model family.
- Do not mix GPT, Claude, and Gemini rows in one batch.
- Streaming is not supported.
stream: trueis rejected. - GPT, Claude, and Gemini batch jobs support text and image-input chat messages.
- Image inputs must use OpenAI-compatible
image_urlcontent parts. - Image inputs must be placed in
usermessages. - Image URLs may be
http://,https://, or base64data:image/...;base64,...URLs. - Supported image media types are PNG, JPEG, GIF, and WebP.
- Tools, functions,
response_format, audio, and non-image multimodal request bodies are not supported in this first version.
Billing
Batch jobs use NanoGPT account balance. Subscription included-token logic is not applied to batch jobs. At batch creation time, NanoGPT checks the account balance against a conservative maximum liability estimate. This is based on the uploaded requests and each row’smax_tokens or max_completion_tokens.
The final charge is created after the batch reaches a terminal status and actual usage is available. Completed usage is billed once, idempotently. If a job produces no billable usage, no usage charge is created.
Batch jobs are discounted versus normal synchronous API usage. Use the model pricing page or API pricing endpoint as the source of truth for current prices.
Example Input File
Createbatch.jsonl:
Upload The File
Create The Batch
Poll The Batch
validatingin_progressfinalizingcompletedfailedexpiredcancellingcancelled
status is completed, output_file_id should be present. If individual rows failed, error_file_id may also be present.
Download Output
custom_id.
Example output line:
List Batches
limitafter
Cancel A Batch
Common Errors
Unsupported endpoint
The first version only supports/v1/chat/completions.
Missing max output cap
Every row must includemax_tokens or max_completion_tokens. This is required so NanoGPT can estimate the maximum possible liability before submitting the job.
Mixed model
All rows in one batch file must use the same model.Mixed model family
Do not mix GPT, Claude, and Gemini requests in one batch file.Streaming unsupported
Batch jobs are asynchronous and do not stream. Removestream: true.
Invalid image input
Image inputs must useimage_url parts with an http://, https://, or supported base64 data:image/...;base64,... URL. Local file paths, file:// URLs, unsupported media types, and malformed data URLs are rejected.