What the GPT Image 2 API returns
A single call returns one or more image objects, each with a base64-encoded payload (or a signed URL when using the streamed variant), the resolved resolution, the seed, and a usage block with tokens_in, tokens_out, and the priced tier.
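As a minimal sketch of handling that payload, the helper below decodes a base64 string and writes it to disk. `save_b64_image` is a hypothetical name, not part of the SDK, and it assumes the payload is the `b64_json` string shown later in this guide.

```python
import base64
from pathlib import Path


def save_b64_image(b64_payload: str, path: str) -> int:
    """Decode a base64 image payload and write it to `path`.

    Hypothetical helper (not an SDK call). Returns the number of bytes written.
    """
    # validate=True rejects characters outside the base64 alphabet
    # instead of silently discarding them.
    raw = base64.b64decode(b64_payload, validate=True)
    Path(path).write_bytes(raw)
    return len(raw)
```

Typical use would be `save_b64_image(result.data[0].b64_json, "out.png")` once a call has returned.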
Authentication and SDK setup
npm install openai # Node
pip install openai # Python

import os

from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

Your first call to gpt-image-2
result = client.images.generate(
    model="gpt-image-2",
    prompt='A glass storefront with a neon sign reading "Open 24h", cinematic, 50mm',
    size="2048x2048",
    quality="high",
    n=1,
)
print(result.data[0].b64_json[:100])

The first call should return in 3–8 seconds for standard 2K. If it takes more than 15 seconds, you have most likely enabled thinking mode by accident; check the reasoning parameter.
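One way to catch the slow-call case is to time each request and flag anything over the 15-second mark. The wrapper below is a sketch only: `timed_call` and `SLOW_THRESHOLD_S` are hypothetical names, and the threshold simply mirrors the figure quoted above.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")

# Assumption: 15 s is the "probably thinking mode" cutoff quoted in the text.
SLOW_THRESHOLD_S = 15.0


def timed_call(
    fn: Callable[[], T],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[T, float, bool]:
    """Run `fn` and return (result, elapsed_seconds, is_slow)."""
    start = clock()
    result = fn()
    elapsed = clock() - start
    return result, elapsed, elapsed > SLOW_THRESHOLD_S
```

Wrapping the request in a zero-argument lambda, for example `timed_call(lambda: client.images.generate(...))`, keeps the timing logic independent of the SDK.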
Image-to-image and multi-reference requests
result = client.images.edit(
    model="gpt-image-2",
    prompt="Keep the subject from ref 1, restyle with palette of ref 2.",
    image=[open("subject.png", "rb"), open("palette.png", "rb")],
    size="2048x2048",
)

Thinking mode via the API
result = client.images.generate(
    model="gpt-image-2",
    prompt="Marketing pack: hero, square, story, banner from this brief",
    reasoning="auto",  # or "force"
    n=4,
)

Use reasoning="auto" for cost-sensitive paths. Use "force" only when accuracy must beat latency (multi-panel comics, infographics with hard facts).
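The rule of thumb above can be encoded as a small decision helper so the choice is made in one place. This is a sketch under assumptions: `pick_reasoning` and its two flags are hypothetical, and the only values it returns are the "auto" and "force" strings used in this guide.

```python
def pick_reasoning(accuracy_critical: bool, scheduled: bool = False) -> str:
    """Choose a value for the reasoning parameter.

    Returns "force" only for accuracy-critical work that is not running
    on a schedule; every other combination falls back to "auto".
    """
    if accuracy_critical and not scheduled:
        return "force"
    return "auto"
```

An infographic request made interactively would pass `accuracy_critical=True`; the same request from a scheduled job would still get "auto".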
Rate limits, caching and cost control
| Tier | Requests / min | Images / min |
|---|---|---|
| Free / Tier 1 | 5 | 5 |
| Tier 2 (auto after $50 spend) | 50 | 50 |
| Enterprise (Azure / fal) | 500+ | 500+ |
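To stay under a per-minute limit from the table, a client-side sliding-window limiter is one option. The class below is a sketch, not part of any SDK: `MinuteLimiter` is a hypothetical name, and the limit passed in should match whichever tier applies to your account.

```python
import time
from collections import deque
from typing import Callable, Deque


class MinuteLimiter:
    """Sliding-window limiter: at most `limit` calls per 60-second window."""

    def __init__(self, limit: int, clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.clock = clock
        self.stamps: Deque[float] = deque()

    def wait_time(self) -> float:
        """Seconds to wait before the next call is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have aged out of the 60-second window.
        while self.stamps and now - self.stamps[0] >= 60.0:
            self.stamps.popleft()
        if len(self.stamps) < self.limit:
            return 0.0
        # The oldest call in the window must expire before the next one fits.
        return 60.0 - (now - self.stamps[0])

    def record(self) -> None:
        """Register that a call was just made."""
        self.stamps.append(self.clock())
```

Before each request, sleep for `limiter.wait_time()` seconds and then call `limiter.record()`; for the lowest tier in the table that would be `MinuteLimiter(5)`.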
- Cache identical prompts: same prompt + seed = same image.
- Default to "standard" quality. Reserve "high" and 4K for shipping assets.
- Use thinking mode "auto", never "force" in cron jobs.
- Validate prompts client-side to avoid wasting credits on empty or banned input.
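The caching and validation points above can be sketched as two small helpers. Both names (`cache_key`, `validate_prompt`) are hypothetical. Including size and quality in the key is an assumption on top of the "same prompt + seed" rule, and how a seed is passed to the API is not specified here.

```python
import hashlib
import json
from typing import Iterable, Optional


def cache_key(prompt: str, seed: int, size: str = "2048x2048", quality: str = "standard") -> str:
    """Build a stable cache key from the inputs assumed to determine the image."""
    payload = json.dumps(
        {"prompt": prompt, "seed": seed, "size": size, "quality": quality},
        sort_keys=True,  # fixed key order keeps the hash deterministic
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def validate_prompt(prompt: str, banned_terms: Iterable[str] = ()) -> Optional[str]:
    """Return an error message, or None if the prompt looks sendable."""
    text = prompt.strip()
    if not text:
        return "prompt is empty"
    lowered = text.lower()
    for term in banned_terms:
        # Case-insensitive substring match; the banned list is caller-supplied.
        if term.lower() in lowered:
            return f"prompt contains banned term: {term}"
    return None
```

A request path would call `validate_prompt` first, then look up `cache_key(...)` in whatever store is available before spending a credit.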
Migration: dall-e-3 to gpt-image-2
- model: "dall-e-3",
+ model: "gpt-image-2",
- size: "1024x1024",
+ size: "2048x2048",
- quality: "hd",
+ quality: "high",

Three behavioral differences to test in staging: GPT Image 2 is more literal, payloads are larger, and quoted strings for in-image text are honored. Full side-by-side breakdown on GPT Image 2 vs DALL-E 3.
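If many call sites need the same change, the diff above can be applied programmatically. The function below is a sketch: `migrate_params` is a hypothetical helper, and it only covers the three substitutions shown in the diff, passing every other value through untouched.

```python
from typing import Any, Dict

# Substitutions taken directly from the diff; nothing else is rewritten.
SIZE_MAP = {"1024x1024": "2048x2048"}
QUALITY_MAP = {"hd": "high"}


def migrate_params(old: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of dall-e-3 request kwargs rewritten for gpt-image-2."""
    new = dict(old)  # never mutate the caller's dict
    if new.get("model") != "dall-e-3":
        return new
    new["model"] = "gpt-image-2"
    if new.get("size") in SIZE_MAP:
        new["size"] = SIZE_MAP[new["size"]]
    if new.get("quality") in QUALITY_MAP:
        new["quality"] = QUALITY_MAP[new["quality"]]
    return new
```

Requests that already target another model are returned unchanged, which makes the helper safe to run over a mixed set of call sites.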
Pricing detail on the pricing page; full walkthrough on how to use GPT Image 2.