[ { "title": "GPT Image 2 Model Card | OpenAI API", "description": "OpenAI 官方模型卡,是 routing 与接入的硬基准: snapshot id、endpoints、rate limits、不支持的能力都在这里;没有它就没法判断 image-2 在 API 工程链路上的边界。", "cover": "", "author": "OpenAI", "body": "## Overview\n\n\"State-of-the-art image generation model for fast, high-quality image generation and editing.\" GPT Image 2 demonstrates highest performance with medium speed characteristics.\n\n## Supported Modalities\n- **Input:** Text, Image\n- **Output:** Image\n\n## Key Capabilities\n\nThe model supports flexible image sizes and high-fidelity image inputs. Notable features include image editing functionality (via the `/v1/images/edits` endpoint) and inpainting capabilities.\n\n## Unsupported Features\n- Streaming\n- Function calling\n- Structured outputs\n- Fine-tuning\n- Distillation\n- Predicted outputs\n\n## API Endpoints\n- Primary: `v1/images/generations`\n- Edits: `v1/images/edits`\n\n## Rate Limits (by tier)\n\n| Tier | TPM | IPM |\n|------|-----|-----|\n| Tier 1 | 100,000 | 5 |\n| Tier 2 | 250,000 | 20 |\n| Tier 3 | 800,000 | 50 |\n| Tier 4 | 3,000,000 | 150 |\n| Tier 5 | 8,000,000 | 250 |\n\n## Snapshot\n\nCurrent version: `gpt-image-2-2026-04-21`", "images": [], "channel": "other", "url": "https://developers.openai.com/api/docs/models/gpt-image-2", "feedback": { "view_count": null, "like_count": null, "comment_count": null, "collect_count": null, "share_count": null }, "note": "OpenAI 官方文档" }, { "title": "Image generation | OpenAI API", "description": "API 指南给了 size/quality/format/n/moderation 全部参数表 + Python 代码样例 + mask 要求 + content policy + 价格区间;\"复杂 prompt 处理可达 2 分钟\"和\"text rendering / consistency / precise composition 仍是难点\"两段官方原话尤其重要,直接界定生产部署的预期。", "cover": "", "author": "OpenAI", "body": "## Overview\n\nOpenAI provides two primary APIs for image generation and editing:\n\n1. **Image API** – Direct generation/editing of single images with dedicated endpoints\n2. 
**Responses API** – Image generation integrated into multi-turn conversations with built-in tools\n\nThe Responses API offers advantages like iterative editing and flexible file inputs (File IDs, URLs, or base64), while the Image API suits one-off generation tasks.\n\n## Key Capabilities\n\n### Generation\nCreate images from text prompts using `gpt-image-2` or earlier models. You can generate multiple images per request using the `n` parameter.\n\n### Editing\n- Modify existing images with new prompts\n- Use multiple reference images to generate new compositions\n- Apply masks to specify which regions should be edited\n\n### Multi-turn Workflows\nThe Responses API supports iterative image refinement across conversation turns, with the model automatically deciding whether to generate or edit based on context (or via an explicit `action` parameter).\n\n## Code Example: Basic Generation (Image API)\n\n```python\nfrom openai import OpenAI\nimport base64\n\nclient = OpenAI()\n\n# Generate a single image and save the base64 payload to disk\nresult = client.images.generate(\n    model=\"gpt-image-2\",\n    prompt=\"A children's book drawing of a veterinarian with a baby otter.\"\n)\n\nimage_base64 = result.data[0].b64_json\nwith open(\"output.png\", \"wb\") as f:\n    f.write(base64.b64decode(image_base64))\n```\n\n## Customization Parameters\n\n| Parameter | Options | Notes |\n|-----------|---------|-------|\n| **size** | `1024x1024`, `1536x1024`, `2048x2048`, `3840x2160`, `auto` | Max edge: 3840px; ratio ≤ 3:1 |\n| **quality** | `low`, `medium`, `high`, `auto` | Higher = slower but better detail |\n| **format** | `png`, `jpeg`, `webp` | JPEG is fastest |\n| **output_compression** | 0–100 | For JPEG/WebP only |\n| **moderation** | `auto`, `low` | Controls content filter strictness |\n\n## Important Limitations\n\nThe documentation notes that GPT Image models face challenges with \"text rendering, consistency across generations, and precise element composition\" despite improvements. 
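A minimal client-side sketch for budgeting around these latency characteristics (all numbers are illustrative assumptions, not official guidance):\n\n```python
# Budget for long-running generations: complex prompts can take minutes,
# so leave headroom above the documented worst case, plus retries.
# All values here are illustrative assumptions, not official guidance.
REQUEST_TIMEOUT_S = 150   # headroom above a roughly 2-minute generation
MAX_RETRIES = 2

def worst_case_wait_s(timeout_s: int = REQUEST_TIMEOUT_S,
                      retries: int = MAX_RETRIES) -> int:
    # worst-case wall-clock wait if every attempt times out
    return timeout_s * (1 + retries)
```\n\n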
Processing complex prompts may require up to 2 minutes.\n\n## Mask Requirements (Editing)\n\nWhen using masks: the image and mask must match in format and size (each under 50MB), and the mask requires an alpha channel. Python users can programmatically add an alpha channel using PIL before submission.\n\n## Content Policy\n\n\"All prompts and generated images are filtered in accordance with our content policy.\" Organization verification may be required before accessing GPT Image models.\n\n## Pricing Notes\n\n`gpt-image-2` pricing varies by quality and size (e.g., $0.006 for low-quality 1024×1024). Earlier models use token-based calculation. Partial images during streaming each cost 100 additional output tokens.", "images": [], "channel": "other", "url": "https://developers.openai.com/api/docs/guides/image-generation", "feedback": { "view_count": null, "like_count": null, "comment_count": null, "collect_count": null, "share_count": null }, "note": "OpenAI official documentation" }, { "title": "GPT Image Generation Models Prompting Guide", "description": "The Cookbook is effectively OpenAI's own best-practices answer: prompt structure (scene → subject → details → constraints), text techniques, constraints and multi-reference orchestration, iteration strategy, and how to choose among image-2 / 1.5 / 1 / mini. As the official text on the \"best practices\" question, it is irreplaceable.", "cover": "", "author": "OpenAI Cookbook", "body": "## Core Prompt Structure\n\nThe guide emphasizes organizing prompts consistently: \"background/scene → subject → key details → constraints\" with clear intended use. 
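That ordering can be sketched as a tiny prompt assembler (every phrase below is illustrative, not from the guide):\n\n```python
# Assemble a prompt in the guide's recommended order:
# background/scene -> subject -> key details -> constraints
# All phrase values are hypothetical placeholders.
parts = {
    "scene": "sunlit studio kitchen, soft diffuse light",
    "subject": "a ceramic teapot on a wooden tray",
    "details": "matte glaze, visible brushstrokes, macro detail",
    "constraints": "no watermark, no logos, centered composition",
}
prompt = ". ".join(parts[k] for k in ("scene", "subject", "details", "constraints"))
```\n\n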
For complex requests, use labeled segments or line breaks rather than single paragraphs.\n\n## Key Recommendations by Element\n\n**Specificity & Quality Cues:**\n- Be concrete about materials, textures, and visual medium (photo, watercolor, 3D render)\n- For photorealism, include the word \"photorealistic\" directly, or use phrases like \"real photograph\" or \"professional photography\"\n- Add quality levers only when needed (film grain, brushstrokes, macro detail)\n\n**Text in Images:**\n- Place literal text in quotes or ALL CAPS\n- Specify typography details: font style, size, color, placement\n- For tricky words, spell them letter-by-letter to improve accuracy\n- Use medium or high quality for small text and dense layouts\n\n**Composition:**\n- Specify framing (close-up, wide, top-down) and viewpoint (eye-level, low-angle)\n- Call out placement explicitly: \"logo top-right,\" \"subject centered\"\n- Include lighting/mood descriptors (soft diffuse, golden hour, high-contrast)\n\n**People & Pose:**\n- Describe scale, body framing, and gaze direction\n- Examples: \"full body visible, feet included\" or \"looking down at the book, not at camera\"\n- These details control body proportion, action geometry, and alignment\n\n**Constraints (Critical):**\n- State exclusions explicitly: \"no watermark,\" \"no logos/trademarks,\" \"preserve identity\"\n- For edits, use \"change only X\" + \"keep everything else the same\"\n- Repeat preserve lists on each iteration to reduce drift\n- Be surgical with edits: specify what to preserve about saturation, contrast, layout, and camera angle\n\n**Multi-Image Inputs:**\n- Reference each input by index and description: \"Image 1: product photo… Image 2: style reference…\"\n- Describe interactions: \"apply Image 2's style to Image 1\"\n- When compositing, be explicit about element placement\n\n**Iteration Strategy:**\n\n> \"Iterate instead of overloading: Long prompts can work well, but debugging is easier when you start with a clean 
base prompt and refine with small, single-change follow-ups.\"\n\n## Model Distinctions\n\n**gpt-image-2** (recommended):\n- Strongest overall model for production workflows\n- Supports any resolution (subject to pixel/ratio constraints)\n- quality: low works well for high-volume generation and experimentation\n- Excels at photorealism, text-heavy images, identity-sensitive edits, and detailed infographics\n- Does not support input_fidelity parameter\n\n**gpt-image-1.5 & gpt-image-1:**\n- Keep only for backward compatibility during migration\n- Support input_fidelity (low/high) for edit workflows\n- Fixed resolutions: 1024×1024, 1024×1536, 1536×1024\n\n**gpt-image-1-mini:**\n- Use when cost/throughput dominate: batch variants, rapid ideation, previews, draft assets\n\n## Quality Settings\n\n- **low**: Good for latency-sensitive use cases; sufficient fidelity in many scenarios\n- **medium/high**: Recommended for small/dense text, detailed infographics, close-up portraits, identity-sensitive edits, and high-resolution outputs\n\n## Production-Focused Patterns\n\nFor **infographics & diagrams**: Use \"high\" quality for dense layouts and in-image text.\n\nFor **photorealism**: Prompt as if capturing a real moment; use photography language (lens, lighting, framing) and request real texture (pores, wrinkles, fabric wear, imperfections). 
Avoid studio-polish language.\n\nFor **identity preservation** (try-on, compositing): \"Explicitly lock the person (face, body shape, pose, hair, expression) and allow changes only to garments.\"\n\nFor **text accuracy in marketing creatives**: Quote exact copy, demand verbatim rendering with \"no extra characters,\" describe placement and font style, and keep prompts strict when iterating.", "images": [], "channel": "other", "url": "https://developers.openai.com/cookbook/examples/multimodal/image-gen-models-prompting-guide", "feedback": { "view_count": null, "like_count": null, "comment_count": null, "collect_count": null, "share_count": null }, "note": "OpenAI official Cookbook" }, { "title": "Introducing gpt-image-2 - available today in the API and Codex", "description": "OpenAI's own launch post on its Dev Forum, effectively a first-hand announcement (and the only trustworthy substitute while the official index page returns 403). Two complaints in the user replies, Tier 1's mere 5 IPM versus a \"20x difference\" against other services, and the Codex OAuth token endpoint restriction, serve as official-adjacent evidence for \"what it is not suited for\".", "cover": "", "author": "OpenAI Developer Community", "body": "## Official Capabilities\n\nOpenAI's `gpt-image-2` launched on April 21, 2026, described as \"OpenAI's most capable image generation model yet.\" Key features include:\n\n- **Flexible outputs**: \"More aspect ratios and resolutions up to 2K for apps, ads, product flows, social, presentations, and docs\"\n- **Text rendering**: \"Stronger structured generation (diagrams, infographics, charts, posters, comics) and improved multilingual text rendering\"\n- **Control & instruction-following**: More reliable composition and detail preservation\n- **Reasoning integration**: \"With reasoning models, can research, transform inputs, generate variations, and self-check\"\n\n## Performance Claims\n\nThe model achieved \"#1 spot across all Image Arena leaderboards\" with \"an unprecedented +242 point lead in Text-to-Image\" hours after launch.\n\n## Pricing Structure\n\n| Modality | Input | Cached Input | Output |\n|----------|-------|--------------|--------|\n| Image | $8.00 | $2.00 | $30.00/1M tokens |\n| Text 
| $5.00 | $1.25 | $10.00/1M tokens |\n\n## User Concerns & Limitations\n\n**Rate limits**: One developer noted the \"highest rate limit for gpt-image is 250 IPM,\" questioning why there is a \"20x difference\" compared to other services.\n\n**API access friction**: A Codex user reported that OAuth tokens only support specific endpoints: \"the trouble is on chatgpt Codex plan all we got is: `backend-api/codex/responses`\"\n\n**Enterprise availability**: Coming \"soon\" but not yet available on Enterprise and Edu tiers.", "images": [], "channel": "other", "url": "https://community.openai.com/t/introducing-gpt-image-2-available-today-in-the-api-and-codex/1379479", "feedback": { "view_count": null, "like_count": null, "comment_count": null, "collect_count": null, "share_count": null }, "note": "The openai.com announcement page returns 403; this thread is the first-hand substitute" }, { "title": "ChatGPT's new Images 2.0 model is surprisingly good at generating text", "description": "A hands-on review from mainstream tech outlet TechCrunch. It contrasts the DALL-E 3-era \"enchuita / churiros\" gibberish with image-2's fully readable menu, and gives a real-world latency figure: complex tasks (comic strips) take a few minutes, not the claimed 3 seconds.", "cover": "images/techcrunch-image2-01.png", "author": "Amanda Silberling, TechCrunch", "body": "## Key Strengths\n\nThe article highlights impressive improvements in text rendering. 
Amanda Silberling demonstrates this with a Mexican restaurant menu example—the new model generates readable text without the spelling errors that plagued earlier versions like DALL-E 3, which produced nonsense like \"enchuita\" and \"churiros.\"\n\nOpenAI claims the model delivers an \"unprecedented level of specificity and fidelity to image creation\" and can handle \"small text, iconography, UI elements, dense compositions\" up to 2K resolution.\n\n## Notable Capabilities\n\n- **Multi-format generation**: Creates marketing assets in various sizes and multi-paneled comic strips\n- **Multilingual text**: Stronger understanding of non-Latin scripts (Japanese, Korean, Hindi, Bengali)\n- **Web integration**: \"Thinking capabilities\" enable web searches and multiple iterations\n- **Quality checking**: Can verify and refine outputs\n\n## Weaknesses & Limitations\n\nThe reviewer notes generation isn't instant—complex outputs like comic strips require \"just a few minutes\" rather than seconds. OpenAI declined to reveal the underlying model architecture, though experts suggest autoregressive approaches may replace older diffusion models that struggled with text.\n\nA knowledge cutoff of December 2025 may affect accuracy on recent topics.", "images": [ "images/techcrunch-image2-01.png" ], "channel": "other", "url": "https://techcrunch.com/2026/04/21/chatgpts-new-images-2-0-model-is-surprisingly-good-at-generating-text/", "feedback": { "view_count": null, "like_count": null, "comment_count": null, "collect_count": null, "share_count": null }, "note": "" }, { "title": "GPT-image-2 vs GPT-image-1.5: 8 Major Upgrades", "description": "The cleanest summary of the quantified v1.5→v2 delta across 8 upgrades (99% text accuracy, ~3s speed, 4K resolution, CJK+RTL, API compatibility). Its own \"preview-quality\" label is an honesty signal: material that carries caveats is more credible than claimed hands-on experience.", "cover": "", "author": "Apiyi.com Blog", "body": "## The 8 Upgrades\n\n### 1. 
Text Rendering\n**Change:** From handling 1-5 word titles to \"~99% character-level accuracy\"\n\n**Implication:** UI labels, signs, and posters no longer require manual post-processing. Localized ad creatives can be generated directly without Photoshop layout work.\n\n### 2. Generation Speed\n**Change:** From 8-18 seconds down to ~3 seconds\n\n**Implication:** Interactive UX experiences become viable; batch pipeline throughput increases 3-5x.\n\n### 3. Maximum Resolution\n**Change:** From 1536×1024 to native 2048×2048 / 4096×4096\n\n**Implication:** Commercial printing, posters, and large-format ads are now viable without upscaling.\n\n### 4. Aspect Ratio Support\n**Change:** Added 16:9 widescreen support (previously only 1:1, 4:3, 3:4)\n\n**Implication:** Video thumbnails, YouTube graphics, and social media assets can be generated natively without cropping.\n\n### 5. Photorealism / Artifact Reduction\n**Change:** Elimination of persistent \"AI yellow filter\"; improved hand anatomy and reflection accuracy\n\n**Implication:** Product photography and portrait-level imagery \"difficult to distinguish from real photos.\"\n\n### 6. World Knowledge & Brand Accuracy\n**Change:** Deep understanding of real brands, interfaces, and environments (IKEA stores, YouTube UI, Minecraft scenes)\n\n**Implication:** Realistic brand-accurate mockups and UI screenshots viable for design exploration.\n\n### 7. Multilingual Text Support\n**Change:** Clear rendering of Latin, CJK (Chinese/Japanese/Korean), and RTL (Arabic/Hebrew) scripts\n\n**Implication:** Global teams can generate localized ad creatives and multilingual UI mockups \"without manual typesetting.\"\n\n### 8. 
Architecture & API Compatibility\n**Change:** Shifted from two-stage to single-stage inference; maintains API compatibility with gpt-image-1.5\n\n**Implication:** Seamless migration—only the `model` field needs updating on launch day; existing keys and billing unchanged.\n\n## Caveats & Weaknesses\n\n- **No official release yet** (as of article date: April 2026); all claims based on LM Arena beta testing\n- **Pricing unknown** (gpt-image-1.5 was 20% cheaper than gpt-image-1; gpt-image-2 pricing unconfirmed)\n- **Rate limits uncertain** during initial launch\n- **Potential differences** between beta and final release versions\n\n## Key Caveat\n\n> \"Please treat gpt-image-2 data as preview-quality until the official release.\"", "images": [], "channel": "other", "url": "https://help.apiyi.com/en/gpt-image-2-vs-gpt-image-1-5-upgrade-8-features-en.html", "feedback": { "view_count": null, "like_count": null, "comment_count": null, "collect_count": null, "share_count": null }, "note": "" }, { "title": "GPT Image 2: Complete Guide to OpenAI's Latest Image Model (2026)", "description": "The **only** source in this collection with an explicit \"GPT Image 2 is NOT optimal for X, use Y instead\" section, externalizing the routing decision to Nano Banana Pro / Seedream 5 / image-1-mini: the crispest answer to the \"what not to use it for\" question.", "cover": "", "author": "CreateVision AI", "body": "## Key Use Cases Highlighted\n\n- **UI/Product Mockups**: Text rendering enables mockups with actual headlines and CTAs\n- **Marketing & Social Content**: Generate variants across square, vertical, and ultrawide formats\n- **Multilingual Signage/Packaging**: 4K output with accurate CJK character support\n- **Infographics & Charts**: Native reasoning improves layout consistency and text clarity\n- **Product Variants**: 16-reference editing mode maintains character/product consistency\n\n## Recommended Prompt Pattern\n\nInclude exact text in quotation marks within your prompt. 
Example: *\"A coffee cup mockup with 'CreateVision AI' on the side, terracotta-colored sleeve.\"*\n\n## Limitations & When Not to Use\n\n**GPT Image 2 is NOT optimal for:**\n- Pure photoreal portraiture (use Nano Banana Pro instead)\n- Speed-first batch generation (Nano Banana Pro is faster and cheaper)\n- Stylized editorial illustration (Seedream 5 performs better)\n\n## Pricing Summary\n\n| Tier | Cost |\n|------|------|\n| 1K low quality | 5 credits |\n| 1K medium (default) | 20 credits |\n| 1K high quality | 75 credits |\n| Per reference image | +10 credits (max 16) |\n\n**Free plan allocation**: 80 daily / 400 monthly credits = ~2 hero images daily at default settings", "images": [], "channel": "other", "url": "https://createvision.ai/guides/gpt-image-2-complete-guide", "feedback": { "view_count": null, "like_count": null, "comment_count": null, "collect_count": null, "share_count": null }, "note": "" }, { "title": "GPT Image - Wikipedia", "description": "完整 release timeline (v1 → mini → v1.5 → v2) 和历代弱点对照;判定\"我现在该不该升 v2\"或\"v1.5 还在哪些场景有优势\"时,这是最便捷的对照基准。", "cover": "", "author": "Wikipedia", "body": "## Release Timeline\n\n**GPT Image 1** launched March 25, 2025, as \"GPT-4o image generation.\" Initial rollout was restricted to paid users due to overwhelming demand. Sam Altman noted that \"our GPUs are melting\" from usage levels. Within the first week, over 130 million users generated more than 700 million images.\n\n**GPT Image 1 Mini** arrived October 6, 2025, offering \"80% less expensive\" API pricing than the original version.\n\n**GPT Image 1.5** debuted December 16, 2025, with improvements including faster generation (4x improvement), precise editing capabilities, and 20% cheaper API costs. 
However, it regressed in some artistic styles and maintained weaknesses with multiple faces and non-Latin languages.\n\n**GPT Image 2** released April 21, 2026, introducing \"a reasoning model into their generation.\"\n\n## Technical Architecture\n\nUnlike DALL-E predecessors using diffusion methods, GPT Image models employ autoregressive generation. The system supports three output dimensions: 1024×1024 (square), 1536×1024 (landscape), and 1024×1536 (portrait).\n\n## Known Limitations\n\nTechnical weaknesses documented by reviewers include:\n- Over-sharpening artifacts and warm color bias (addressed partially in v1.5)\n- Consistent struggles with rendering multiple faces\n- Difficulties with Chinese, Arabic, and Hebrew text generation (addressed in v2)\n- Occasional errors in human poses and object overlap representation\n\n## Cultural Impact\n\nThe Studio Ghibli aesthetic became a viral phenomenon upon launch. Sam Altman changed his Twitter profile to a Ghibli-inspired image. The White House posted a controversial image depicting migrant Virginia Basora-Gonzalez's arrest in this style, drawing criticism for trivializing immigration enforcement.\n\n## Critical Reception\n\nTechRadar praised GPT Image 1's photorealistic and stylized outputs with improved text rendering. 
Conversely, Heise Online highlighted technical limitations despite overall strong performance.", "images": [], "channel": "other", "url": "https://en.wikipedia.org/wiki/GPT_Image", "feedback": { "view_count": null, "like_count": null, "comment_count": null, "collect_count": null, "share_count": null }, "note": "" }, { "title": "How should we view GPT-image-2, the image model OpenAI has recently been beta-testing with a small group?", "description": "A hard-core head-to-head running the same batch of prompts through GPT-Image-2 vs Nano Banana 2 across 11 case categories, **the most concrete material for routing decisions**: image-2 wins decisively on text-dense design, brand world knowledge, and UI replication; Banana edges ahead on real-person consistency. The original also discloses two key limitations: \"weaker consistency for Asian faces\" and \"noticeably slower generation\".", "cover": "images/zhihu-pk-banana-01.jpg", "author": "卡尔 & 阿汤", "body": "At first I had no real expectations for GPT-Image-2. Image-1.5 before it was weak, and Nano Banana Pro and 2 were almost too strong. After five months of silence, Image-2 suddenly entered gradual rollout with no launch event. But once the rollout reached me and I saw that a single sentence could generate the image below, I knew OpenAI had nailed it this time; they must have gotten hold of a lot of very good data during this stretch.\n\nI immediately searched to see whether the person in the image was real. Fortunately, although the name matched, the account content was different. The person was fabricated by the AI, but it looked so real: the lighting, the atmosphere, I would stare at the screenshot for a good 30 seconds before catching on.\n\nThen I tried generating a homepage for 影视飓风. Honestly, I was dazed. Even if the one above is fake, the feel is exactly right. How did it get the cover thumbnails so consistent?\n\nThe direction of today's review was now obvious: a head-to-head showdown, feeding GPT-Image-2 the same prompts I used in my last Nano Banana review.\n\n## 1. Text rendering (recruiting poster / coffee poster)\n\nThis is Nano Banana Pro's traditional strength, so let's start here. I wrote a poster brief with multiple font sizes and a complex layout. Honestly, you can tell at a glance which poster looks better and more real. The GPT-Image-2 one on the left genuinely looks like an existing bubble-tea brand; the packaging and overall style are indistinguishable from the fruit teas we actually drink. The one on the right would have seemed fine when Banana first launched, but now it reads as heavily AI-flavored.\n\nThe recruiting poster had a lot of text and a long prompt. Overall, the GPT-Image-2 output on the left looks more like the promotional posters I see on Meituan or Boss Zhipin: strong design sense, layout that better fits the recruiting-poster genre, with the text hierarchy, icon design, and small details all handled more carefully.\n\nThe coffee image shows an even clearer gap: GPT-Image-2 on the left uses more realistic supporting material and picks thinner typefaces like Songti, which feels cleaner and more premium. The poster's white space is more tasteful and easier on the eyes.\n\n## 2. Product display\n\nFor product shots, GPT used the thin serif plus minimal icon style that skincare brands currently favor. Details like the tiny bubbles inside the serum bottle, and the product name, English name, and milliliter count printed on it, all match a real product. For the gift-with-purchase, it even drew the bundled freebies based on the image I provided, like something straight off a Taobao listing page.\n\n## 3. Courseware / math textbook\n\nGPT-Image-2's output on the left looks like a page of my high-school textbook scanned and turned directly into a slide; Nano Banana's looks more like a single illustration inside a textbook.\n\n## 4. Real-world scenes (Heytea / convenience store / kitchen / Black Myth: Wukong)\n\nThe GPT-Image-2 male hairstyle on the left looks more real, and there is even a Li Jiaqi livestream room in the corner! I can guess when the source training photos were taken: Banana on the right actually generated a blue ofo bike, while in the GPT-Image-2 image the bottle in the character's hand looks to me like a cross between Red Bull and jasmine honey tea.\n\nThe apron in the Image-2 picture is even a China Construction Bank one!\n\nFor the Black Myth: Wukong gameplay shot, anyone who has played the game will find the GPT-Image-2 version above more authentic: the Destined One and Erlang Shen match the original game's style, and the combat is even shown in a first-person viewpoint.\n\n## 5. UI replication (WeChat / e-commerce / music player / Douyin live-stream teaser cover)\n\nThe WeChat chat-log round is a tie. For the e-commerce homepage UI, both are quite close, but Banana on the right loves heavy bold typefaces that make the layout feel cramped. For the music-player UI, the fact that GPT-Image-2 on the left designed an album cover for me already wins the round.\n\nFor the Douyin live-stream teaser cover, GPT-Image-2 on the left simply ran away with it: the content and selling points it invented are things I would use as-is.\n\n## 6. Character consistency\n\nFirst up, a sixteen-panel sticker sheet. Who knew Frieren had this many expressions? The two sides are actually not far apart. If pressed, I prefer GPT-Image-2's panel separation on the left; on the right, all the Frierens' ears run together.\n\nI call the Harry Potter round a draw. Banana on the right holds face shape and hairstyle consistency very well; Image-2 in the middle narrowly wins on expression variety.\n\n## 7. Product collab / poster replication\n\nKFC cat-collab poster: both keep the cat consistent with the original. But for overall richness, including the extra collab-limited-edition block at the bottom, I think the GPT-Image-2 version in the middle does better. Banana's on the right looks a bit flat.\n\nPoster replication: I gave both a heavily stylized poster on the left and asked them to replicate it while swapping the spring scene for winter. On details, Image-2's nine-panel screenshot in the middle has more film grain and stays slightly closer to the original.\n\n## 8. Image translation\n\nStrictly speaking, GPT-Image-2 on the right did not translate all the text; the character names were left untranslated. But I had to reroll the Banana version in the middle many times, since its text kept being unstable. On pure text stability, Image-2 gets my vote!\n\n## Summary\n\nimage-2 is the new all-around king. But the limitations: **consistency for Asian faces is not that good**, and **generation is noticeably slower than Nano Banana 2**.\n\n@ Authors / 卡尔 & 阿汤", "images": [ "images/zhihu-pk-banana-01.jpg", "images/zhihu-pk-banana-02.jpg", "images/zhihu-pk-banana-03.jpg", "images/zhihu-pk-banana-04.jpg" ], "channel": "zhihu", "url": "https://www.zhihu.com/question/2028780908405170838/answer/2028802266493141629", "feedback": { "view_count": null, "like_count": 647, "comment_count": null, "collect_count": null, "share_count": null }, "note": "The original has 48 images; 4 were selected, representing typical wins in the poster / coffee / product / courseware categories" }, { "title": "Hands-on with GPT-image-2: is the design industry really finished?", "description": "Breaks image-2's qualitative leap into four jumps, text / world knowledge / edit precision / aesthetics, with absurdly concrete cases: a one-sentence e-commerce detail-page long image, and a fake Grok profile page with persona details auto-filled. The strongest narrative sample for \"why image-2 is a step change\".", "cover": "images/zhihu-design-end-02.jpg", "author": "Zhihu column author", "body": "After all the hype, GPT-image-2 officially launched in the early hours today, after a livestream. And honestly, after testing it, \"stunned\" is the only word I have. It is several steps above Nano Banana 2.\n\nCompared with every previous image model, the most outrageous advances this time are **world knowledge, text rendering, edit precision, and visual taste**.\n\n## 1. Text rendering\n\nText rendering has always been the single biggest pain point of every AI image model, without exception. Whether DALL-E, Seedream, or Nano Banana 2, asking for a text-heavy poster, a recruiting poster for example, would most likely produce all kinds of garbled glyphs.\n\nNow? Never mind English: **GPT-image-2's Chinese rendering is absurdly good**. For example, writing out the 出师表 from memory: the first time I have seen this much text rendered mostly stably. Also newspapers. It can generate math exam papers. A group member even had it ghost-write a love letter. A Dream of the Red Chamber relationship chart. And I pasted a job description straight into GPT and got our recruiting poster out.\n\n## 2. World knowledge\n\nThis is, to me, GPT-image-2's most outrageous capability, and where it pulls furthest ahead of every other model. World knowledge means the model has an extremely precise understanding of what the real world looks like.\n\nAsk it for a YouTube homepage screenshot and it does not just paint a red play button and fill in random text. It draws the correct layout, correct button styles, correct icon positions, and even the individual video thumbnails are right.\n\nAsked for a Xiaohongshu profile-page screenshot, but for Grok, it **invented a complete persona for Grok on its own: 1.286M followers, 3.021M likes, an AI from xAI** whose goal is to understand the universe and respond to every question with humor and truth... That level of detail is beyond mere image-making.\n\nGames too: I asked for a Delta Force loot-farming service ad with a big \"1000 : 56\" ratio. I never said what the 1000 and 56 were. It filled in \"10 million Havok coins for 56 RMB\" by itself, plus a pile of selling points underneath, efficient farming, stable rates, safe with no bans, orders taken around the clock, and the tagline: efficiency you can see, strength without bragging.\n\n## 3. Precision\n\nThe third core upgrade: edit precision. I gave GPT-Image-2 a photo of a 3D-printed desk ornament from our office and said one sentence: retouch this product, relight it, polish it, white background. The output was straight-up perfect e-commerce hero-shot quality. Then I asked it to make an e-commerce detail-page poster for the product, and it generated an entire long-form detail page. **A detail page like this used to take our designers two or three days: product shoots, retouching, layout, copywriting, section design, scene shots. Now it is two sentences.**\n\n## 4. Taste\n\nGPT's image aesthetics used to draw constant criticism. GPT-Image-2's output is different: it has taste, genuinely strong taste.\n\nFor example, a concept poster for a K-pop girl group's third mini album: everyone in black outfits, side-backlight with soft focus, a cold gray-blue palette, perfectly matching the ECLIPSE concept.\n\nThen an information-dense one: a Chinese-language long infographic of Mariah Carey's 1990s career. That triangle of **dense information + beauty + accuracy** used to be achievable only by fairly good visual designers.\n\n## Limitations\n\nGPT-image-2 is extremely precise with objects of all kinds. **The one pity is that consistency for Asian faces is not as good.**\n\n## Closing\n\nIn 2015 there was a Zhihu question: \"Can a job like designer be done for a lifetime?\" Someone called 大头帮主 wrote an answer containing a passage I can recite backwards:\n\n> \"Never forget: a designer is absolutely, absolutely not a drafter. A designer's ultimate value lies in critical thinking.\"\n\nGPT-Image-2 has thoroughly democratized image-making. But image-making was never design; it is design's execution layer. **The era of the drafter is indeed over. But the era of the designer has just begun.**", "images": [ "images/zhihu-design-end-01.jpg", "images/zhihu-design-end-02.jpg", "images/zhihu-design-end-03.jpg" ], "channel": "zhihu", "url": "https://zhuanlan.zhihu.com/p/2030289550430447567", "feedback": { "view_count": null, "like_count": 386, "comment_count": null, "collect_count": null, "share_count": null }, "note": "" }, { "title": "I ran dozens of images through GPT-image2 and distilled this prompt-writing approach", "description": "12+ high-completion prompt templates, **all using {city} / {subject} / {brand} placeholders**: the most hands-on sample of the prompt-as-code pattern; copy them directly into your own skill / workflow template library.", "cover": "images/zhihu-fanmili-prompt-02.jpg", "author": "饭米粒", "body": "## Common structure\n\nEvery prompt contains: **style keywords + composition description + subject placeholder + text signature + exclusion list + aspect-ratio parameter**\n\n## Template 1: Minimalist neo-Chinese paper-cut silhouette\n\nMinimalist neo-Chinese aesthetic, on a pale gray-white ground with the layered feel of paper-cut relief. An S-shaped, tear-like edge splits the frame as if a paper layer has been ripped open, revealing a colorful Eastern landscape inside. Within the opening, a winding river runs top to bottom through the composition, rendered in layered shades of blue like a flowing ribbon. Verdant hills and terraced fields in soft colors line the banks. Ancient-style buildings along the river are staggered with upturned eaves, white walls, and dark tiles. The overall composition follows an S-curve with a strong sense of rhythm. The artwork's edges use a torn-paper effect for a relief-like, three-dimensional feel. Below, the inscription \"东方美学\" is written in black kaiti, the date \"2026/04/20\" echoes a red seal, \"CHINA\" sits prominently at the bottom, and the signature \"@饭米粒\" closes it quietly.\n\n## Template 2: Graffiti sketch style\n\nRender {subject} in a graffiti sketch style: quick strokes, free distortion, improvised hand-drawing, a draft-like look. Lines are casual, exaggerated, varying in weight, slightly messy but rhythmic and expressive. Color is applied in rough, dry-brush blocks, keeping uneven smears, brush marks, flying white, and overpainting. **No transparent watercolor bleeds, no delicate watercolor gradients, no paper texture, no soft haze, no dreamy quality.** Background mostly left blank.\n\n## Template 3: Song-dynasty gongbi painting\n\nPaint a Song-dynasty gongbi painting that captures {mood}. Add text and an explanation.\n\n## Template 4: Poetic moonlit landscape\n\nA Chinese-style illustration with Song-dynasty landscape atmosphere: fine ink outlines with soft mineral-pigment coloring, silver moonlight falling on the water, a palette of pale blue and celadon with touches of soft pink blossom. Close foreground: a young woman leaning at a wooden window, quietly watching the moonlit river flow outside.\n\n## Template 5: Shan Hai Jing beast compendium\n\nFrom the user's quoted 山海经 passage, generate a highly finished Eastern high-myth Chinese-style portrait grounded in the original text. [Art style] Chinese style × ancient mythology aesthetic × Shan Hai Jing beast compendium × gongbi heavy color × ink-wash diffusion × xuan paper / silk texture. [Forbidden] Western fantasy, cyberpunk, sci-fi, modern elements, anime, chibi, comic style, printed typefaces, incomplete subjects, etc.\n\n## Template 6: Miniature storybook writing\n\nCreate a vertical composition with masterful aesthetics, movie-poster production value, and a magical-realist air. A giant hand holds a giant vintage fountain pen, writing on an endlessly extending sheet with true paper-fiber texture. As the ink flows, the story grows organically out of the page.\n\n## Template 7: Hand-drawn city food map\n\nA hand-drawn city food map themed on {city}. The base is a simplified bird's-eye hand-drawn city map marking major roads and landmarks. Across the map sit 12 finely drawn food-spot vignettes, each about 5% of the map's area, labeled in handwriting with the shop name and a one-line recommendation.\n\n## Template 8: Children's travel journal\n\nDraw a brightly colored, vertical (9:16) hand-drawn \"{city} travel journal illustration\", as if crayoned by a curious child.\n\n## Template 9: Pixar 3D selfie\n\nPixar-style 3D animation scene: {group of characters} taking a happy selfie in {setting}. {POV character} stands in the center holding a selfie stick.\n\n## Template 10: 3D Shiba mini brand store\n\nA 3D Shiba-Inu-style mini concept store for {brand}, its exterior inspired by the brand's most iconic product or packaging. The store has two floors, with large glass windows clearly showing a cozy, carefully decorated interior. Rendered in Cinema 4D in a mini-cityscape style.\n\n## Template 11: 3D miniature company stock chart\n\nA refined miniature 3D cartoon scene corresponding to the user-specified company name or ticker, clearly visible from a 45° top-down angle. Creatively work the company's real stock-market data for the user-specified date into the scene.\n\n## Template 12: 3D chibi city weather scene\n\nFrom a clear 45° top-down view, a 3D chibi miniature scene of a city landmark. First search the web for the city's real weather today.\n\n## Key takeaway\n\nWhat strikes me most about the image2 wave is not \"AI got stronger again\" but that the bar for ordinary people to make content dropped a little further. **What really matters is whether you can fold AI into your own content, products, and workflow, and keep shipping.**", "images": [ "images/zhihu-fanmili-prompt-01.jpg", "images/zhihu-fanmili-prompt-02.jpg" ], "channel": "zhihu", "url": "https://zhuanlan.zhihu.com/p/2031442418898352084", "feedback": { "view_count": null, "like_count": 288, "comment_count": null, "collect_count": null, 
"share_count": null }, "note": "原文给了 12 套完整 prompt + 30+ 张配图,精选 2 张工笔山水代表图" }, { "title": "今天GPT image 2 发布了,它和 nano banana pro 比怎么样?", "description": "短答案,但金句够硬: \"GPT 的手部还是会经常出错,2026 年了,这是不应该的。**画面要素一多,手部崩的概率 100%**。以后可以看手辨 GPT。\" — 是\"不适合做什么\"最直接的硬证据。", "cover": "", "author": "知乎答主", "body": "肯定比Nano Banana Pro要更好,尤其是在平面设计上,是完爆香蕉。\n\n在幻觉控制上,香蕉Pro > GPT > 香蕉2\n\n版权IP,名人脸还原度:GPT > 香蕉Pro > 香蕉2\n\n画面细节:GPT > 香蕉Pro > 香蕉2\n\n**GPT的手部还是会经常出错,都 2026年了,这是不应该的。画面要素一多,手部崩的概率100%。以后可以看手辨GPT。**\n\n香蕉2的幻觉,主要是画面加入很多额外要素,感觉像抠图没抠干净。\n\n这次GPT补上了P图短板,直接成新王了。\n\n这次,这个模型看起来很大,**出图速度明显比 Nano banana 2 更慢**。这种满血大模型,现在上手手玩是最好的。也是ChatGPT plus最划算的Moment,这种超值机遇一年中出现不了几次。", "images": [], "channel": "zhihu", "url": "https://www.zhihu.com/question/2030252715255800778/answer/2030357256903071032", "feedback": { "view_count": null, "like_count": 145, "comment_count": null, "collect_count": null, "share_count": null }, "note": "" }, { "title": "GPT image2.0高清化展板教程", "description": "**社区对单图最大 ~3840×2160 硬限制最巧的工程绕路**: 切块→分别精修→PS 拼回→混合带融合,做出竞赛级建筑/景观/规划展板;3907 赞验证有效,是\"超分辨率\"工作流必收。", "cover": "images/xhs-board-hires-01.jpg", "author": "小红书博主", "body": "核心思路就是:**先切块,再高清,最后拼回去。**\n\n1️⃣ **先按展板逻辑切分板块**\n\n比如标题区、分析图、效果图、图表区、文字区。想保留越多细节,就切得越小。\n\n2️⃣ **把切好的板块丢进 ChatGPT**\n\n告诉它:\n\n> \"这是竞赛展板,请分别生成高清版本,文字清晰一点,图片清晰一点,**风格不变,尺寸不变**。\"\n\n3️⃣ **用 PS 拼回原版面**\n\n如果背景色有差异,可以双击图层,进入\"图层样式\",用\"混合带\"稍微融合一下。\n\n4️⃣ **最后手动微调**\n\n文字、边缘、色差、对齐这些细节,自己再修一下就可以。\n\nps:小红书降低画质了真的超级清晰哦,可以直接看最后一张手机截图\n\n#GPTImage2[话题]# #AI设计[话题]# #设计展板[话题]# #建筑展板[话题]# #景观展板[话题]# #规划展板[话题]# #作品集排版[话题]# #竞赛展板[话题]# #设计效率[话题]# #小红书文案[话题]#", "images": [ "images/xhs-board-hires-01.jpg" ], "channel": "xhs", "url": "https://www.xiaohongshu.com/explore/69eb3c58000000001f031002", "feedback": { "view_count": null, "like_count": 3907, "comment_count": null, "collect_count": null, "share_count": null }, "note": "" }, { "title": "GPT-Image2 从 0-1 实操手册!附100+提示词", "description": "新手向入门手册的代表样本;不在于内容多深,而在于反映了一个事实—image-2 受益于\"prompt 库\"心智(vs 扩散模型时代靠 weight 
tuning),社区已经在大规模复用模板。", "cover": "images/xhs-handbook-01.jpg", "author": "小红书博主", "body": "新手如何用image生图,这些提示词复制即用!\n\n#chatgpt[话题]# #gpt[话题]# #AI生成[话题]# #AI工具[话题]# #ai关键词[话题]# #ai[话题]# #ai绘画[话题]# #ai生图[话题]#", "images": [ "images/xhs-handbook-01.jpg" ], "channel": "xhs", "url": "https://www.xiaohongshu.com/explore/69eb3850000000002301c42b", "feedback": { "view_count": null, "like_count": 343, "comment_count": null, "collect_count": null, "share_count": null }, "note": "原文主要价值在配图(100+ 提示词截图),正文很短" }, { "title": "GPT Image 2生成图片出现奇怪纹路?", "description": "**唯一明确反映 image-2 偶发质量问题的真实用户帖**—多边形纹路 / 降噪不彻底,作者明确说\"并非个例\";写 skill description 的 caveat 段或做 QA 时,这是必引证据。", "cover": "images/xhs-noise-01.jpg", "author": "小红书博主", "body": "感觉这两天gpt image 2生成的很多图片怪怪的,而且**并非个例**。仔细观察图片会发现有很多多边形纹路,是生成的时候降噪不彻底…?\n\n#AI画图[话题]# #人工智障与人工智能[话题]# #ai[话题]#", "images": [ "images/xhs-noise-01.jpg" ], "channel": "xhs", "url": "https://www.xiaohongshu.com/explore/69ecda4d0000000035025f26", "feedback": { "view_count": null, "like_count": 205, "comment_count": null, "collect_count": null, "share_count": null }, "note": "" }, { "title": "EvoLinkAI/awesome-gpt-image-2-prompts", "description": "GitHub 5786⭐,目前 image-2 关键词下 star 最高的 prompts 库;按用例分类,**直接接进 agent / skill 的 template fallback 最优选**。", "cover": "", "author": "EvoLinkAI", "body": "Curated GPT-Image-2 prompts for the OpenAI API: image examples across portraits, posters, UI mockups, photorealism, infographics, etc.\n\n## 仓库定位\n- 模型刚发布 6 天即累积 5786⭐,验证 image-2 受益于 prompt 工程化生态\n- 按用例分类组织,方便 routing 时按场景取模板", "images": [], "channel": "github", "url": "https://github.com/EvoLinkAI/awesome-gpt-image-2-prompts", "feedback": { "view_count": null, "like_count": 5786, "comment_count": null, "collect_count": null, "share_count": null }, "note": "通过 content-search github keyword=gpt-image-2 找到 (index 4)" }, { "title": "freestylefly/awesome-gpt-image-2", "description": "**Prompt as Code 工业级模板库**—13 套模板 + 329 案例逆向工程,口号清晰;比同类\"prompt 大全\"更进一步,把 prompt 
工程化为模板化代码,可直接做企业内部 routing。", "cover": "", "author": "freestylefly", "body": "**Prompt as Code | GPT-Image2 工业级提示词引擎与模板库**\n\n- 329 个案例逆向工程\n- 13 套工业级模板\n\n## 关键差异\n相比同期纯收集 prompts 的仓库,本仓库把 prompt 抽象为可参数化的模板,适合企业内部沉淀。", "images": [], "channel": "github", "url": "https://github.com/freestylefly/awesome-gpt-image-2", "feedback": { "view_count": null, "like_count": 917, "comment_count": null, "collect_count": null, "share_count": null }, "note": "通过 content-search github keyword=gpt-image-2 找到 (index 9)" }, { "title": "YouMind-OpenLab/awesome-gpt-image-2", "description": "号称\"全球最大\"的 image-2 prompt 库,2000+ 每日更新;反映 image-2 在中文创作社区的活跃度,是\"哪些用例正在被反复尝试\"的低成本扫描入口。", "cover": "", "author": "YouMind-OpenLab", "body": "🚀 **World's largest GPT Image 2 prompt library, updated daily — 2000+ curated prompts**\n\n模型发布 6 天即收集 3039⭐, 是 image-2 prompt 生态繁荣度的硬指标。", "images": [], "channel": "github", "url": "https://github.com/YouMind-OpenLab/awesome-gpt-image-2", "feedback": { "view_count": null, "like_count": 3039, "comment_count": null, "collect_count": null, "share_count": null }, "note": "通过 content-search github keyword=gpt-image-2 找到 (index 6)" }, { "title": "GPT-Image-2 claimed the #1 spot across all Image Arena leaderboards", "description": "LM Arena 独立 ranking 的官方公告,**给了三类任务的具体得分差**(Text-to-Image +242 / Single-Image Edit +125 / Multi-Image Edit +90),是\"image-2 是否值得设为 default\"的最权威量化锚点。", "cover": "", "author": "@arena (LMArena)", "body": "Exciting news - GPT-Image-2 by @OpenAI has claimed the #1 spot across all Image Arena leaderboards!\n\nA clean sweep with a record-breaking +242 point lead in Text-to-Image - the largest gap we've seen to date.\n\n- **#1 Text-to-Image (1512), +242 over #2** (Nano-banana-2 with web-search aka gemini-3.1-flash-image)\n- **#1 Single-Image Edit (1513), +125 over #2** (Nano-banana-pro aka gemini-3-pro-image)\n- **#1 Multi-Image Edit (1464), +90 over #2** (Nano-banana-2)\n\nNo model has dominated Image Arena with margins this wide.\n\nHuge 
congratulations to @OpenAI on this major breakthrough in image generation! More performance breakdowns by category in the thread below.", "images": [], "channel": "x", "url": "https://x.com/arena/status/2046670703311884548", "feedback": { "view_count": 2157943, "like_count": 5721, "comment_count": 207, "collect_count": 1118, "share_count": 635 }, "note": "The original tweet should have images, but the detail endpoint returned no image_url_list (the tool only parses video-type thumbnails)" }, { "title": "360° equirectangular panorama with GPT Image 2", "description": "4,393 likes on X. **A use case the official docs never mention**: having image-2 directly output an equirectangular 360° panorama (which can be dropped into a frontend 360 viewer for immersive browsing); wired in via a Happycapy skill, and 320K+ views show how strongly the use case resonates.", "cover": "images/x-fminzhou-360-01.jpg", "author": "@fMinZhou", "body": "GPT Image 2 is insanely good...I generated a **360° equirectangular panorama** in Happycapy with just a skill + prompt.\n\n**Step 1**: Select the generate-image skill\n\n**Step 2**: Enter a prompt like:\n\n> \"Use a frontend 360 viewer to display an equirectangular image of […] using the GPT-Image-2 model.\"\n\nWanna see how you all get creative with this", "images": [ "images/x-fminzhou-360-01.jpg" ], "channel": "x", "url": "https://x.com/fMinZhou/status/2047214663721681288", "feedback": { "view_count": 327758, "like_count": 4393, "comment_count": 65, "collect_count": 3781, "share_count": 482 }, "note": "Cover is the video thumbnail; the original tweet contains a 45-second demo video" }, { "title": "Making a commercial kimchi ad using GPT Image 2 x Seedance 2.0", "description": "2,401 likes on X. **A GPT Image 2 + Seedance 2.0 commercial-ad case**: a complete pipeline for a Korean kimchi product ad; 287K views and 2,360 collects make it one of the combo-workflow samples creators most want to save.", "cover": "images/x-dstudio-kimchi-01.jpg", "author": "@D_studioproject", "body": "Making a **commercial kimchi ad** using GPT Image 2 x Seedance 2.0\n\nworkflow ⬇️\n\n## Workflow (demonstrated in the video)\n1. GPT Image 2 produces the ad keyframes (product + scene + text)\n2. Pass the frames to Seedance 2.0 as references\n3. 
Seedance outputs a ~15-second commercial ad video", "images": [ "images/x-dstudio-kimchi-01.jpg" ], "channel": "x", "url": "https://x.com/D_studioproject/status/2048119090376769550", "feedback": { "view_count": 287129, "like_count": 2401, "comment_count": 39, "collect_count": 2360, "share_count": 293 }, "note": "Cover is the video thumbnail; the original tweet contains the 15-second commercial ad video" }, { "title": "High-end surrealist Crocs ad poster - full GPT Image 2 prompt", "description": "**A complete, reusable high-end commercial poster prompt** (Crocs in a monochrome blue studio + oversized logo + tagline layout) covering every element of prompting best practice: scene → subject → typography → copy → lighting → aspect ratio; copy it as-is and adapt it to your own brand.", "cover": "", "author": "@rovvmut_", "body": "GPT Image 2 on ChatGPT app.\n\n**Prompt:**\n\nA high-fashion surrealist advertising poster for Crocs. The scene is set in a minimalist, monochrome light blue studio with a semi-reflective floor.\n\nThe central focus is an **oversized, giant white Croc clog positioned on its heel at a diagonal angle, serving as a backrest**. A fashion model with long dark hair, dressed in a clean, all-white coordinated sweatshirt and wide-leg trousers, leans her entire back against the giant shoe in a relaxed, leaning posture. She is facing right in profile, looking ahead with a serene expression, and wearing standard-sized white Crocs.\n\nIn the background, the word **\"CROCS\"** is written in massive, bold, white condensed sans-serif typography, partially occluded by the giant shoe and the model to create a sense of depth. At the top right, **\"Designed with ChatGPT\"**.\n\nAt the bottom center, a white sans-serif tagline reads: **\"Made for comfort, worn for confidence. Because life feels better when your feet stop complaining.\"**\n\nThe lighting is soft, cool, and even, casting gentle shadows and a soft reflection of the subjects on the glossy blue floor. 
The overall aesthetic is clean, modern, and high-concept.\n\nMake the aspect ratio 3:4", "images": [], "channel": "x", "url": "https://x.com/rovvmut_/status/2047629543608025121", "feedback": { "view_count": 123398, "like_count": 2013, "comment_count": 87, "collect_count": 1969, "share_count": 232 }, "note": "The original tweet should have result images, but the detail endpoint returned none (the tool only parses video-type thumbnails); the prompt itself already covers all best-practice elements" }, { "title": "How to generate 100% realistic AI photos with GPT-image-2 (de-AI-look workflow)", "description": "**The most complete workflow for removing the \"AI look\"**: Pinterest inspiration photo → Claude Opus distills a JSON prompt → image-2 generates with character consistency; a concrete answer to \"image-2 looks AI by default — how do I fix that\"; 1,564 collects far above 866 likes shows it is heavily saved as a reference.", "cover": "", "author": "@Mho_23", "body": "**how to generate AI photos that look 100% real with GPT-image-2:**\n\nout of the box GPT image 2 has a very noticeable AI look when you don't use references, which is why we are going to be using inspiration photos for color grading and creative direction\n\n> **go on pinterest** and grab any photo you like that has the aesthetic you want\n\n> **upload that pinterest photo into claude opus 4.7** and ask for this:\n\n> \"analyze this photo and give me a very detailed json prompt that can recreate it, it should be very detailed, really make sure to break down the color grading and all the exact colors in the photo\"\n\nthe reason we're using **claude opus** is because of its very strong visual analysis capabilities + it's less lazy than gemini when writing json code\n\nwe're not using claude sonnet because it doesn't do actual proper photo analysis, which is important for good json with this method\n\n> take the json prompt claude gives you and put it into ChatGPT\n\n> **upload your product image and prompt:** \"using this json as reference, generate a person holding my product [attach product image]\"\n\n> GPT-image-2 generates the photo with that person holding your product\n\nyou will notice the photo looks more realistic than usual and the color grading will be really good\n\n> save that 
generated character photo as your reference\n\n> **for future generations, attach the character photo each time for facial consistency**\n\nnow you have a consistent ugc model that can hold different products across multiple photos\n\nyou can also iterate the base json prompt and chat back and forth with claude to create a json prompt based on what you want, then repeat the same process.\n\ni like this because it makes the color grading really good in the photo\n\n**the #1 sign a photo is generated with chatgpt is the basic colours and grainy look, but this method removes that**\n\nthis is how you can build a library of high quality ugc content without hiring creators\n\nthe json from claude handles the realistic lighting and color grading, image-2 handles the character consistency, and you control the product placement\n\ntakes 5 minutes to set up, then you can generate unlimited variations", "images": [], "channel": "x", "url": "https://x.com/Mho_23/status/2046737511666389260", "feedback": { "view_count": 65165, "like_count": 866, "comment_count": 26, "collect_count": 1564, "share_count": 63 }, "note": "The original tweet includes a video demo, but the detail endpoint returned no image_url_list; collect_count (1,564) > like_count (866) indicates high save value" }, { "title": "LOST IN JAPAN high-end travel poster - full GPT Image 2 prompt", "description": "**A travel-poster prompt template with a 'style-consistency constraint'** (it insists on the same color theme/lighting/layout as a separate Turkey poster); demonstrates the advanced prompting technique of keeping a poster series visually consistent across images — a directly adaptable template for anyone producing brand or series content.", "cover": "", "author": "@chatgptpaglu", "body": "GPT Image 2 on @yapper_so\n\n**Prompt below 👇**\n\nCreate a premium stylized travel poster / graphic collage for **JAPAN**.\n\nThe main subject MUST be a stylish international female tourist (young woman, feminine face, long hair, slim body). She must clearly look female, not male or gender-neutral. Use a completely unique face identity.\n\nShe is wearing modern travel fashion: linen shirt, neutral tones outfit, sunglasses, backpack, holding a camera. 
She looks confident and exploring.\n\n**IMPORTANT**: Use the **SAME color theme and visual style** as a cinematic Turkey travel poster: warm golden tones, teal accents, beige vintage paper textures, soft sunset lighting, slightly desaturated but rich colors. Keep the exact same mood, lighting, and color grading.\n\n**Keep layout and composition consistent with a travel poster series**: central character, layered collage background, balanced editorial design.\n\nPlace her in a dynamic collage composition surrounded by Japanese elements: Mount Fuji, Tokyo city skyline, cherry blossoms, pagoda temple, neon street signs, ramen, sushi.\n\n**Apply SAME collage style**: torn paper edges, halftone dots, vintage textures, travel stamps, sticker elements.\n\nInclude:\n- \"ARRIVED TOKYO\" stamp\n- direction board (Shibuya, Kyoto, Osaka)\n- Japanese stamp style badge\n- minimal decorative icons\n\nAdd big headline: **\"LOST IN JAPAN\"**\n\nAdd small text:\n- \"Where culture meets adventure\"\n- \"Not all those who wander are lost\"\n\nLighting: cinematic warm glow, dramatic shadows (same as Turkey version)\nStyle: editorial magazine collage, ultra detailed, 4K, print ready\n\n**IMPORTANT**: match the color grading, tones, and lighting exactly with the poster style", "images": [], "channel": "x", "url": "https://x.com/chatgptpaglu/status/2048623805334159525", "feedback": { "view_count": 1238, "like_count": 57, "comment_count": 34, "collect_count": 15, "share_count": 5 }, "note": "The original tweet should have result images, but the detail endpoint returned none; the prompt's value lies in the cross-image style-lock technique of staying visually consistent with a separate reference poster" } ]
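
The slice → upscale → stitch board workflow collected above reduces to deterministic tile geometry. Below is a minimal Python sketch under stated assumptions — `plan_tiles` is a hypothetical helper (not from any source), the 64 px overlap is an assumed seam-blending margin, and the per-tile enhancement itself (e.g. one `/v1/images/edits` call per crop) is deliberately left out:

```python
def plan_tiles(width: int, height: int, tile: int, overlap: int = 64):
    """Crop boxes (left, top, right, bottom) covering a width x height board.

    Adjacent tiles share `overlap` pixels so seams can be feathered when the
    separately enhanced tiles are pasted back (the "Blend If" step in PS).
    """
    if tile <= overlap:
        raise ValueError("tile size must exceed overlap")
    step = tile - overlap
    boxes = []
    for top in range(0, max(height - overlap, 1), step):
        for left in range(0, max(width - overlap, 1), step):
            # Clamp the last row/column to the board edge.
            boxes.append((left, top, min(left + tile, width), min(top + tile, height)))
    return boxes


# Plan 1024px tiles for a 3840x2160 board (the documented per-image maximum):
# 4 columns x 3 rows, each small enough to re-generate at full fidelity.
boxes = plan_tiles(3840, 2160, 1024)
```

With Pillow, `img.crop(box)` would extract each tile and `canvas.paste(tile_img, box[:2])` would reassemble the enhanced tiles at their original coordinates; cutting smaller preserves more detail at the cost of more edit calls, which matches the post's advice.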