Kamil Józwik
LLM model logo

GPT

GPT (Generative Pre-trained Transformer) is a family of AI models developed by OpenAI. The family prioritizes iterative deployment with extensive safety measures, releasing systems through careful real-world testing and alignment research. The GPT-5 generation launched in August 2025, representing a unified approach spanning ultra-lightweight models to extended reasoning systems with context windows up to 400,000 tokens.

The family currently includes foundation reasoning models, lightweight variants, specialized coding models, open-weight models, and speech-to-speech systems. Key differentiators include state-of-the-art performance across reasoning and multimodal tasks, comprehensive model range addressing cost-performance trade-offs, native multimodal support, extended context capabilities, and production-ready infrastructure with extensive tooling.

OpenAI's first open-weight models since GPT-2 (the gpt-oss series under Apache 2.0 license) provide on-premises deployment options alongside API-only flagship models. The family excels at complex reasoning, code generation, long-context understanding, and real-time audio interactions, backed by enterprise-grade APIs and strong built-in safety systems.

Platform & Access

Platform Name: OpenAI Platform

Platform URL: https://platform.openai.com

What It Offers: API keys for programmatic access, interactive Playground for testing, model management with version control, fine-tuning capabilities, embeddings API, free Moderation API, real-time usage dashboards, Assistants API for agent-like experiences, Batch API (50% discount for large volumes), and tools including web search, code interpreter, function calling, and structured outputs.

Access Model:

Pricing Model: Token-based pay-as-you-go pricing charged per million tokens (separate input/output rates). Audio models use per-token or per-minute pricing. No free tier after initial credits expire. Usage limits scale through tiered system based on spending.

Foundation Reasoning Models

GPT-5

Primary Use Cases:

Agentic Capabilities:


GPT-5 Pro

Primary Use Cases:

Agentic Capabilities:


Lightweight Models

GPT-5 mini

Primary Use Cases:

Agentic Capabilities:


GPT-5 nano

Primary Use Cases:

Agentic Capabilities:


Specialized Coding Models

GPT-5-Codex

Primary Use Cases:

Agentic Capabilities:


Open-Weight Reasoning Models

gpt-oss-120b

Primary Use Cases:

Agentic Capabilities:


gpt-oss-20b

Primary Use Cases:

Agentic Capabilities:

gpt-oss-safeguard-20b + gpt-oss-safeguard-120b

gpt-oss-safeguard is a new set of open source models built for flexible security classification. The models come in two sizes, 120b and 20b, and are available under the Apache 2.0 license for anyone to use and modify. Unlike traditional classifiers that need to be retrained whenever safety rules change, these models can interpret policies in real time, according to OpenAI. This lets organizations update their rules instantly, without retraining the model.

The models are designed to be more transparent as well. Developers can see exactly how the models make decisions, making it easier to understand and audit how security is enforced. gpt-oss-safeguard is based on OpenAI's gpt-oss open source model and is part of a larger collaboration with ROOST, an open source platform focused on building tools and infrastructure for AI safety, security, and governance.


Real-Time Audio Models

gpt-realtime

Primary Use Cases:

Agentic Capabilities:


gpt-realtime-mini

Primary Use Cases:

Agentic Capabilities:


Speech Recognition

Whisper

Primary Use Cases:

Agentic Capabilities:


Model Comparison Table

These are just the latest and most relevant models. See all available models in the OpenAI Model Index.

ModelContextParametersKnowledge CutoffAgentic UsageBest For
GPT-5400KUndisclosedSept 2024⭐⭐⭐⭐⭐Complex reasoning, long context
GPT-5 Pro400K+UndisclosedSept 2024⭐⭐⭐⭐⭐Mission-critical, extended reasoning
GPT-5 mini400KUndisclosedMay 2024⭐⭐⭐Cost-sensitive, moderate complexity
GPT-5 nano400KUndisclosedMay 2024⭐⭐High-volume, latency-sensitive
GPT-5-Codex400KUndisclosedBased on GPT-5⭐⭐⭐⭐⭐Autonomous coding, large refactoring
gpt-oss-120b128K117B (5.1B act.)June 2024⭐⭐⭐⭐On-premises, production deployment
gpt-oss-20b128K21B (3.6B act.)June 2024⭐⭐⭐Edge/local, consumer hardware
gpt-realtime32KUndisclosedOct 2023⭐⭐⭐⭐Real-time voice, complex tool use
gpt-realtime-mi32KUndisclosedOct 2023⭐⭐Cost-effective voice interactions
Whisper large30s1,550MN/AN/AHigh-accuracy transcription, 99 langs

Key Considerations

GPT-5 Pro Not API-Accessible: Available only through ChatGPT Pro subscription ($200/month), not via standard API. Use GPT-5 with high reasoning_effort for API-based extended reasoning.

Tool Definition Size Limitations: GPT-5 family has known issue where large tool/function definitions exceeding 300K tokens cause failures even within 1M context limit. Keep tool definitions concise.

gpt-oss Harmony Format Required: Both open-weight models require custom Harmony response format with special tokens for correct operation. Not compatible with standard chat templates.

Rate Limits Multi-Dimensional: Limits measured across RPM, RPD, TPM, TPD, IPM, and batch queue limits. Can hit any dimension first. Limits automatically increase through usage tiers based on spending.

Whisper Limitations: 30-second processing window, potential hallucinations due to weak supervision training, turbo model does NOT support translation tasks (only multilingual variants support translation).

Regional Availability: Available in 161 countries. Not available in 20+ countries due to legal/regulatory compliance and U.S. sanctions.


Resources