Google Gemini Image Format (Image)¶
Official Documentation
📝 Introduction¶
Given a text prompt, the model generates new images. The Google Gemini image models create images from natural-language descriptions and can also edit an input image according to a text prompt.
🤖 Supported Models¶
Currently supported models include:
| Model | Description |
|---|---|
| gemini-2.5-flash-image | Google Gemini image generation model that supports generating high-quality images based on text prompts |
| gemini-3-pro-image-preview | Google Gemini image generation and editing model that supports generating images based on text prompts, as well as editing images based on input images and text prompts |
💡 Request Examples¶
Create Image ✅¶
```bash
# Basic image generation
curl "https://computevault.unodetech.xyz/v1beta/models/gemini-2.5-flash-image:generateContent?key=$API_KEY" \
  -H 'Content-Type: application/json' \
  -X POST \
  -d '{
    "contents": [{
      "parts": [{"text": "Give me an image of a cat"}]
    }]
  }'
```
Response Example:
```json
{
  "candidates": [
    {
      "content": {
        "parts": [
          {
            "text": "Here is an image for you: "
          },
          {
            "inlineData": {
              "mimeType": "image/png",
              "data": "..."
            }
          }
        ],
        "role": "model"
      },
      "finishReason": "STOP",
      "index": 0
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 9,
    "candidatesTokenCount": 1298,
    "totalTokenCount": 1307,
    "promptTokensDetails": [
      {
        "modality": "TEXT",
        "tokenCount": 9
      }
    ],
    "candidatesTokensDetails": [
      {
        "modality": "IMAGE",
        "tokenCount": 1290
      }
    ]
  },
  "modelVersion": "gemini-2.5-flash-image",
  "responseId": "..."
}
```
Edit Image ✅¶
```bash
# Image editing
# IMG_BASE64 must hold the base64-encoded input image. The single-quoted body
# is closed and reopened around "$IMG_BASE64" because the shell does not
# expand variables inside single quotes.
curl "https://computevault.unodetech.xyz/v1beta/models/gemini-3-pro-image-preview:generateContent?key=$API_KEY" \
  -H 'Content-Type: application/json' \
  -X POST \
  -d '{
    "contents": [{
      "parts": [
        {"text": "put the chips in the given image on a beach."},
        {
          "inline_data": {
            "mime_type": "image/jpeg",
            "data": "'"$IMG_BASE64"'"
          }
        }
      ]
    }],
    "generationConfig": {
      "responseModalities": ["TEXT", "IMAGE"],
      "imageConfig": {
        "aspectRatio": "1:1",
        "imageSize": "2K"
      }
    }
  }'
```
Response Example:
```json
{
  "candidates": [
    {
      "content": {
        "parts": [
          {
            "text": "Here is the edited image for you: "
          },
          {
            "inlineData": {
              "mimeType": "image/png",
              "data": "..."
            }
          }
        ],
        "role": "model"
      },
      "finishReason": "STOP",
      "index": 0
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 15,
    "candidatesTokenCount": 1350,
    "totalTokenCount": 1365,
    "promptTokensDetails": [
      {
        "modality": "TEXT",
        "tokenCount": 15
      },
      {
        "modality": "IMAGE",
        "tokenCount": 1200
      }
    ],
    "candidatesTokensDetails": [
      {
        "modality": "IMAGE",
        "tokenCount": 1350
      }
    ]
  },
  "modelVersion": "gemini-3-pro-image-preview",
  "responseId": "..."
}
```
📮 Request¶
Endpoint¶
Create Image¶
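Create an image based on a text prompt. Send a POST request to the generateContent method of the chosen model, as in the Create Image request example above (/v1beta/models/gemini-2.5-flash-image:generateContent).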
Edit Image¶
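Edit an input image, or generate a new image from it, based on a text prompt. The request uses the same generateContent method, with the input image supplied as an inline_data part. Supported by the gemini-2.5-flash-image and gemini-3-pro-image-preview models.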
Authentication Method¶
Include the API key in the request URL as the key query parameter (?key=$API_KEY), where $API_KEY is your API key.
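As a minimal sketch of this key-in-URL scheme, the Python helper below assembles the request URL. The function name `build_generate_url` is hypothetical and not part of any SDK; the base URL and path pattern are copied from the request examples on this page.

```python
from urllib.parse import quote, urlencode

# Base URL taken from the request examples in this document.
BASE_URL = "https://computevault.unodetech.xyz/v1beta"


def build_generate_url(model: str, api_key: str, base_url: str = BASE_URL) -> str:
    """Return the generateContent URL with the API key as the `key` query parameter."""
    # Percent-encode the model name and the key so unusual characters stay valid.
    path = f"{base_url}/models/{quote(model, safe='')}:generateContent"
    return f"{path}?{urlencode({'key': api_key})}"
```

Because the key travels in the URL, it is worth keeping such URLs out of logs and error messages.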
Request Body Parameters¶
contents¶
- Type: Array
- Required: Yes
- Description: Array containing the content for the generation request.
Content Object Properties:
| Property | Type | Required | Description |
|---|---|---|---|
| parts | Array | Yes | Ordered content parts that constitute a single message |
Part Object Properties:
| Property | Type | Required | Description |
|---|---|---|---|
| text | String | Yes | Text description of the desired image, or of the edit to apply to the input image |
| inline_data | Object | Yes (for edit image) | Input image data (for image editing) |
text (in parts)¶
- Type: String
- Required: Yes (for both create image and edit image)
- Description: Text description of the desired image.
- Tips:
    - Use specific and detailed descriptions
    - Include key visual elements
    - Specify the desired artistic style
    - Describe composition and perspective
    - When editing images, describe how you want to modify the input image
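The nesting of contents, parts, and text can be easy to get wrong by hand. The sketch below builds a text-only request body in Python; `build_create_body` is a hypothetical helper name, and the structure mirrors the Create Image request example on this page.

```python
import json


def build_create_body(prompt: str) -> str:
    """Serialize a text-only request body: one content object holding one text part."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    payload = {"contents": [{"parts": [{"text": prompt}]}]}
    return json.dumps(payload)
```

The returned string can be sent as the POST body with the Content-Type: application/json header.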
inline_data (in parts, for image editing)¶
- Type: Object
- Required: Yes (for edit image)
- Description: Input image data to be edited.
InlineData Object Properties (in request):
| Property | Type | Required | Description |
|---|---|---|---|
| mime_type | String | Yes | MIME type of the image (e.g., "image/jpeg", "image/png") |
| data | String | Yes | Base64-encoded image data |
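A short sketch of how an input image might be turned into an inline_data part, assuming the image is already available as raw bytes. The helper names `image_part` and `build_edit_payload` are hypothetical; the field names follow the table above.

```python
import base64


def image_part(image_bytes: bytes, mime_type: str = "image/jpeg") -> dict:
    """Build an inline_data part from raw image bytes."""
    # The API expects the image as base64 text, not raw bytes.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {"inline_data": {"mime_type": mime_type, "data": encoded}}


def build_edit_payload(prompt: str, image_bytes: bytes, mime_type: str = "image/jpeg") -> dict:
    """Combine the edit instruction and the input image in a single message."""
    return {"contents": [{"parts": [{"text": prompt}, image_part(image_bytes, mime_type)]}]}
```

In practice the bytes would typically come from reading the image file in binary mode, for example `open(path, "rb").read()`.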
generationConfig (Optional)¶
- Type: Object
- Required: No
- Description: Configuration parameters for controlling generation behavior.
GenerationConfig Object Properties:
| Property | Type | Required | Description |
|---|---|---|---|
| responseModalities | Array | No | Specifies the modality types that should be included in the response, such as ["TEXT", "IMAGE"] |
| imageConfig | Object | No | Image generation configuration parameters |
ImageConfig Object Properties:
| Property | Type | Required | Description |
|---|---|---|---|
| aspectRatio | String | No | Aspect ratio of the image, such as "1:1", "16:9", etc. |
| imageSize | String | No | Image size, such as "2K", "4K", etc. |
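Since every generationConfig field is optional, one reasonable approach is to include only the values that were actually set. The sketch below does that; `build_generation_config` is a hypothetical helper, and it deliberately does not validate aspect ratios or sizes because this page does not give the full list of accepted values.

```python
from typing import List, Optional


def build_generation_config(
    aspect_ratio: Optional[str] = None,
    image_size: Optional[str] = None,
    response_modalities: Optional[List[str]] = None,
) -> dict:
    """Return a generationConfig dict that omits every unset option."""
    config: dict = {}
    if response_modalities:
        config["responseModalities"] = list(response_modalities)
    image_config = {}
    if aspect_ratio:
        image_config["aspectRatio"] = aspect_ratio
    if image_size:
        image_config["imageSize"] = image_size
    if image_config:
        # Only attach imageConfig when at least one image option was given.
        config["imageConfig"] = image_config
    return config
```

An empty result means the generationConfig key can be left out of the request body altogether.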
📥 Response¶
Success Response¶
candidates¶
- Type: Array
- Description: List of candidate responses from the model
Candidate Object Properties:
| Property | Type | Description |
|---|---|---|
| content | Object | Generated content returned by the model |
| finishReason | Enum | Reason why the model stopped generating |
| index | Integer | Index of the candidate in the response candidate list |
Content Object Properties:
| Property | Type | Description |
|---|---|---|
| parts | Array | Generated content parts, which may include text and images |
| role | String | Producer of the content, usually "model" |
Part Object Properties:
| Property | Type | Description |
|---|---|---|
| text | String | Text content (optional, may include descriptive text) |
| inlineData | Object | Generated image data (optional) |
InlineData Object Properties:
| Property | Type | Description |
|---|---|---|
| mimeType | String | MIME type of the image (e.g., "image/png") |
| data | String | Base64-encoded image data |
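Because text and image parts can be mixed in one candidate, a client usually has to walk the parts and decode only those that carry inlineData. The following is a minimal sketch, assuming the response has already been parsed from JSON into a dict; `extract_images` is a hypothetical helper name.

```python
import base64
from typing import List, Tuple


def extract_images(response: dict) -> List[Tuple[str, bytes]]:
    """Return (mime_type, image_bytes) for every inlineData part in the response."""
    images = []
    for candidate in response.get("candidates", []):
        parts = candidate.get("content", {}).get("parts", [])
        for part in parts:
            inline = part.get("inlineData")
            # Text-only parts have no inlineData and are skipped.
            if inline and "data" in inline:
                images.append((inline.get("mimeType", ""), base64.b64decode(inline["data"])))
    return images
```

Each returned byte string can be written to disk in binary mode, choosing the file extension from the MIME type.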
FinishReason Enum Values:
- STOP: Natural stopping point of the model
- MAX_TOKENS: Maximum token count specified in the request has been reached
- SAFETY: The system has flagged the response candidate content for safety reasons
- IMAGE_SAFETY: Token generation has stopped because the generated image violates safety regulations
- OTHER: Unknown reason
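One possible way to act on these values is to collapse them into a few coarse outcomes before deciding what to do next. The sketch below is illustrative only: the function name `classify_finish` and the outcome labels ("ok", "truncated", "blocked", "unknown") are assumptions, not part of the API.

```python
# Finish reasons from the list above that indicate a safety block.
BLOCKED_REASONS = {"SAFETY", "IMAGE_SAFETY"}


def classify_finish(candidate: dict) -> str:
    """Map a candidate's finishReason onto a coarse, hypothetical outcome label."""
    reason = candidate.get("finishReason", "OTHER")
    if reason == "STOP":
        return "ok"
    if reason == "MAX_TOKENS":
        return "truncated"
    if reason in BLOCKED_REASONS:
        return "blocked"
    # OTHER, a missing field, or any value not listed on this page.
    return "unknown"
```

A caller might, for instance, prompt the user to rephrase on "blocked" and only look for image data on "ok".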
usageMetadata¶
- Type: Object
- Description: Metadata about token usage for the generation request
UsageMetadata Object Properties:
| Property | Type | Description |
|---|---|---|
| promptTokenCount | Integer | Number of tokens in the prompt |
| candidatesTokenCount | Integer | Total number of tokens in all generated candidate responses |
| totalTokenCount | Integer | Total number of tokens for the generation request |
| promptTokensDetails | Array | Per-modality token counts for the request input |
| candidatesTokensDetails | Array | Per-modality token counts for the response |
promptTokensDetails / candidatesTokensDetails Object Properties:
| Property | Type | Description |
|---|---|---|
| modality | Enum | Modality associated with this token count (TEXT, IMAGE, etc.) |
| tokenCount | Integer | Number of tokens |
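To see how tokens split across modalities, the details arrays can be summed per modality. This is a small sketch with a hypothetical helper name, `tokens_by_modality`, applied to a list shaped like promptTokensDetails or candidatesTokensDetails.

```python
from typing import Dict, List


def tokens_by_modality(details: List[dict]) -> Dict[str, int]:
    """Sum tokenCount per modality for one details array."""
    totals: Dict[str, int] = {}
    for entry in details:
        modality = entry.get("modality", "UNKNOWN")
        totals[modality] = totals.get(modality, 0) + entry.get("tokenCount", 0)
    return totals
```

Applied to the Create Image response example on this page, the prompt details give 9 TEXT tokens and the candidate details give 1290 IMAGE tokens.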
modelVersion¶
- Type: String
- Description: Model version used to generate the response
responseId¶
- Type: String
- Description: ID used to identify each response
promptFeedback (Optional)¶
- Type: Object
- Description: Prompt feedback related to content filters
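When a prompt is filtered, the response may carry promptFeedback instead of usable candidates, so checking it first can be helpful. The sketch below assumes the object exposes a `blockReason` field; this page does not list the fields of promptFeedback, so treat that name as an assumption to verify against the official documentation. `prompt_block_reason` is a hypothetical helper.

```python
from typing import Optional


def prompt_block_reason(response: dict) -> Optional[str]:
    """Return the reason the prompt was blocked, or None if no block is reported."""
    feedback = response.get("promptFeedback") or {}
    # Assumption: the block reason is reported under the key "blockReason".
    return feedback.get("blockReason")
```

A non-None result suggests rewording the prompt before retrying.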
Image Object Example¶
🌟 Best Practices¶
Prompt Writing Suggestions¶
- Use clear and specific descriptions
- Specify important visual details
- Describe the desired artistic style and atmosphere
- Describe the composition and perspective
- Include details such as color, lighting, and mood
Parameter Selection Suggestions¶
- Model Selection
    - gemini-2.5-flash-image: Suitable for quickly generating high-quality images
    - gemini-3-pro-image-preview: Supports image generation and editing, suitable for scenarios requiring editing based on existing images
- Prompt Optimization
    - Use detailed and descriptive text
    - Include specific visual elements and style requirements
    - Avoid vague or overly brief descriptions
    - When editing images, clearly describe how you want to modify the input image (add, remove, replace elements, etc.)
- Safety Settings
    - Adjust safety thresholds according to application scenarios
Common Issues¶
- Image Generation Failure
    - Check if the prompt complies with content policies
    - Verify API key permissions
    - Confirm the request format is correct
- Results Don't Match Expectations
    - Optimize prompt descriptions to be more specific and detailed
    - Add more visual details and style descriptions
    - Try different description approaches
- Safety Filter Issues
    - Modify the prompt to avoid triggering safety filters