Google Gemini Image Format (Image)¶
📝 Introduction¶
Google Gemini image generation models create new images from natural-language text prompts. The following models are currently supported:
| Model | Description |
|---|---|
| gemini-2.5-flash-image | Generates high-quality images from text prompts; also accepts an input image for editing |
| gemini-3-pro-preview | Generates images from text prompts and edits images given an input image and a text prompt |
💡 Request Examples¶
Create Image ✅¶
# Basic image generation
curl "https://computevault.unodetech.xyz/v1beta/models/gemini-2.5-flash-image:generateContent?key=$API_KEY" \
  -H 'Content-Type: application/json' \
  -X POST \
  -d '{
    "contents": [{
      "parts": [{"text": "Give me an image of a cat"}]
    }]
  }'
Response Example:
{
  "candidates": [
    {
      "content": {
        "parts": [
          {
            "text": "Here is an image for you: "
          },
          {
            "inlineData": {
              "mimeType": "image/png",
              "data": "..."
            }
          }
        ],
        "role": "model"
      },
      "finishReason": "STOP",
      "index": 0
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 9,
    "candidatesTokenCount": 1298,
    "totalTokenCount": 1307,
    "promptTokensDetails": [
      {
        "modality": "TEXT",
        "tokenCount": 9
      }
    ],
    "candidatesTokensDetails": [
      {
        "modality": "IMAGE",
        "tokenCount": 1290
      }
    ]
  },
  "modelVersion": "gemini-2.5-flash-image",
  "responseId": "..."
}
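The generated image arrives as base64 text inside an inlineData part, so a client has to locate that part and decode it. The following is a minimal Python sketch of that step, assuming the response has already been parsed into a dict; the helper name extract_images and the stand-in payload are illustrative, not part of the API.

```python
import base64


def extract_images(response: dict) -> list:
    """Return (mime_type, raw_bytes) for every inlineData part in the response."""
    images = []
    for candidate in response.get("candidates", []):
        parts = candidate.get("content", {}).get("parts", [])
        for part in parts:
            inline = part.get("inlineData")
            if inline is None:
                continue  # text-only part, e.g. "Here is an image for you: "
            images.append((inline["mimeType"], base64.b64decode(inline["data"])))
    return images


# Stand-in response with the same shape as the example above. The payload is
# only the 8-byte PNG signature, not a real image.
sample_response = {
    "candidates": [{
        "content": {
            "parts": [
                {"text": "Here is an image for you: "},
                {"inlineData": {
                    "mimeType": "image/png",
                    "data": base64.b64encode(b"\x89PNG\r\n\x1a\n").decode("ascii"),
                }},
            ],
            "role": "model",
        },
        "finishReason": "STOP",
        "index": 0,
    }]
}

images = extract_images(sample_response)
```

Each returned tuple can be written to disk with a file extension chosen from the MIME type.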
Edit Image ✅¶
# Image editing
# IMG_BASE64 must hold the base64-encoded input image. The quotes around it
# step out of the single-quoted JSON so the shell can expand the variable.
curl "https://computevault.unodetech.xyz/v1beta/models/gemini-2.5-flash-image:generateContent?key=$API_KEY" \
  -H 'Content-Type: application/json' \
  -X POST \
  -d '{
    "contents": [{
      "parts": [
        {"text": "Create a picture of my cat eating a nano-banana in a fancy restaurant under the Gemini constellation"},
        {
          "inline_data": {
            "mime_type": "image/jpeg",
            "data": "'"$IMG_BASE64"'"
          }
        }
      ]
    }]
  }'
Response Example:
{
  "candidates": [
    {
      "content": {
        "parts": [
          {
            "text": "Here is the edited image for you: "
          },
          {
            "inlineData": {
              "mimeType": "image/png",
              "data": "..."
            }
          }
        ],
        "role": "model"
      },
      "finishReason": "STOP",
      "index": 0
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 1215,
    "candidatesTokenCount": 1350,
    "totalTokenCount": 2565,
    "promptTokensDetails": [
      {
        "modality": "TEXT",
        "tokenCount": 15
      },
      {
        "modality": "IMAGE",
        "tokenCount": 1200
      }
    ],
    "candidatesTokensDetails": [
      {
        "modality": "IMAGE",
        "tokenCount": 1350
      }
    ]
  },
  "modelVersion": "gemini-2.5-flash-image",
  "responseId": "..."
}
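For callers that do not use the shell, the edit request body can be assembled programmatically. This is a rough Python sketch, assuming the image is available as raw bytes; build_edit_request is a hypothetical helper, and the three bytes used below are just the JPEG start-of-image marker standing in for a real file.

```python
import base64
import json


def build_edit_request(prompt: str, image_bytes: bytes, mime_type: str = "image/jpeg") -> dict:
    """Build a generateContent body with one text part and one inline image part."""
    return {
        "contents": [{
            "parts": [
                {"text": prompt},
                {"inline_data": {
                    "mime_type": mime_type,
                    # The API expects base64 text, not raw bytes.
                    "data": base64.b64encode(image_bytes).decode("ascii"),
                }},
            ]
        }]
    }


body = build_edit_request("Replace the background with a night sky", b"\xff\xd8\xff")
payload = json.dumps(body)  # string to send as the POST body
```

Building the body in code avoids shell quoting issues and command-line length limits for large images.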
📮 Request¶
Endpoint¶
Create Image¶
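Create an image from a text prompt. As in the request example, send a POST request to /v1beta/models/{model}:generateContent, where {model} is the model name.
Edit Image¶
Edit an input image, or generate a new image from it, guided by a text prompt. The request goes to the same /v1beta/models/{model}:generateContent path with an added inline_data part. Supports the gemini-2.5-flash-image and gemini-3-pro-preview models.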
Authentication Method¶
Include the API key in the request URL as the key query parameter, as in the request examples:
?key=$API_KEY
Where $API_KEY is your API key.
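A minimal Python sketch of assembling the request URL with the key parameter. The host and path are copied from the curl examples on this page; build_generate_url is a hypothetical helper name, and YOUR_API_KEY is a placeholder.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://computevault.unodetech.xyz"  # host taken from the curl examples


def build_generate_url(model: str, api_key: str) -> str:
    """Return the generateContent URL for a model, with the key as a query parameter."""
    query = urlencode({"key": api_key})  # percent-encodes unsafe characters in the key
    return f"{BASE_URL}/v1beta/models/{quote(model, safe='')}:generateContent?{query}"


url = build_generate_url("gemini-2.5-flash-image", "YOUR_API_KEY")
```

Because the key travels in the URL, it can end up in logs and shell history, so it is worth reading it from an environment variable rather than hard-coding it.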
Request Body Parameters¶
contents¶
- Type: Array
- Required: Yes
- Description: Array containing the content for the generation request.
Content Object Properties:
| Property | Type | Required | Description |
|---|---|---|---|
| parts | Array | Yes | Ordered content parts that constitute a single message |
Part Object Properties:
| Property | Type | Required | Description |
|---|---|---|---|
| text | String | Yes | Text description of the desired image, or of the edit to apply |
| inline_data | Object | Edit image only | Input image data to be edited |
text (in parts)¶
- Type: String
- Required: Yes (for both create image and edit image)
- Description: Text description of the desired image.
- Tips:
- Use specific and detailed descriptions
- Include key visual elements
- Specify the desired artistic style
- Describe composition and perspective
- When editing images, describe how you want to modify the input image
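One way to apply these tips consistently is to assemble the prompt from named pieces. The sketch below is only an illustration: compose_prompt and its labels ("Style:", "Composition:", "Edit:") are made up for this example and have no special meaning to the model.

```python
def compose_prompt(subject, style=None, composition=None, edit_instruction=None):
    """Join the subject with optional style, composition, and edit notes."""
    sentences = [subject.strip().rstrip(".")]
    if style:
        sentences.append(f"Style: {style}")
    if composition:
        sentences.append(f"Composition: {composition}")
    if edit_instruction:
        sentences.append(f"Edit: {edit_instruction}")
    return ". ".join(sentences) + "."


prompt = compose_prompt(
    "A tabby cat asleep on a windowsill",
    style="watercolor",
    composition="close-up from a low angle",
)
```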
inline_data (in parts, for image editing)¶
- Type: Object
- Required: Yes (for edit image)
- Description: Input image data to be edited.
InlineData Object Properties (in request):
| Property | Type | Required | Description |
|---|---|---|---|
| mime_type | String | Yes | MIME type of the image (e.g., "image/jpeg", "image/png") |
| data | String | Yes | Base64-encoded image data |
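The required fields above can be checked on the client before a request is sent. This is a sketch of such a check, not a schema published by the service: validate_request_body is a hypothetical helper, and treating empty arrays or blank text as errors is an assumption on top of the "Required" column.

```python
def validate_request_body(body: dict, editing: bool = False) -> list:
    """Return a list of problems; an empty list means the required fields are present."""
    contents = body.get("contents")
    if not isinstance(contents, list) or not contents:
        return ["contents must be a non-empty array"]
    problems = []
    for i, content in enumerate(contents):
        parts = content.get("parts")
        if not isinstance(parts, list) or not parts:
            problems.append(f"contents[{i}].parts must be a non-empty array")
            continue
        # A text part is required for both create image and edit image.
        if not any(isinstance(p.get("text"), str) and p["text"].strip() for p in parts):
            problems.append(f"contents[{i}] has no text part")
        inline_parts = [p["inline_data"] for p in parts if "inline_data" in p]
        if editing and not inline_parts:
            problems.append(f"contents[{i}] needs an inline_data part for editing")
        for inline in inline_parts:
            for field in ("mime_type", "data"):
                if not inline.get(field):
                    problems.append(f"contents[{i}].inline_data is missing {field}")
    return problems


create_body = {"contents": [{"parts": [{"text": "Give me an image of a cat"}]}]}
create_problems = validate_request_body(create_body)
edit_problems = validate_request_body(create_body, editing=True)
```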
📥 Response¶
Success Response¶
candidates¶
- Type: Array
- Description: List of candidate responses from the model
Candidate Object Properties:
| Property | Type | Description |
|---|---|---|
| content | Object | Generated content returned by the model |
| finishReason | Enum | Reason why the model stopped generating |
| index | Integer | Index of the candidate in the response candidate list |
Content Object Properties:
| Property | Type | Description |
|---|---|---|
| parts | Array | Generated content parts, which may include text and images |
| role | String | Producer of the content, usually "model" |
Part Object Properties:
| Property | Type | Description |
|---|---|---|
| text | String | Text content (optional, may include descriptive text) |
| inlineData | Object | Generated image data (optional) |
InlineData Object Properties:
| Property | Type | Description |
|---|---|---|
| mimeType | String | MIME type of the image (e.g., "image/png") |
| data | String | Base64-encoded image data |
FinishReason Enum Values:
- STOP: Natural stopping point of the model
- MAX_TOKENS: Maximum token count specified in the request has been reached
- SAFETY: The system has flagged the response candidate content for safety reasons
- IMAGE_SAFETY: Token generation has stopped because the generated image violates safety regulations
- OTHER: Unknown reason
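A client will usually branch on finishReason before trying to read image data. The sketch below groups the values listed above into coarse outcomes; the outcome labels ("ok", "truncated", "blocked", "unknown") and the function name are this example's own, and any value not listed here falls through to "unknown".

```python
BLOCKED_REASONS = {"SAFETY", "IMAGE_SAFETY"}


def classify_finish(candidate: dict) -> str:
    """Map a candidate's finishReason to a coarse outcome label."""
    reason = candidate.get("finishReason", "OTHER")
    if reason == "STOP":
        return "ok"          # normal completion, image data should be present
    if reason == "MAX_TOKENS":
        return "truncated"   # output hit the token limit set in the request
    if reason in BLOCKED_REASONS:
        return "blocked"     # content or generated image was flagged
    return "unknown"         # OTHER, or a value this sketch does not know


outcome = classify_finish({"finishReason": "IMAGE_SAFETY", "index": 0})
```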
usageMetadata¶
- Type: Object
- Description: Metadata about token usage for the generation request
UsageMetadata Object Properties:
| Property | Type | Description |
|---|---|---|
| promptTokenCount | Integer | Number of tokens in the prompt |
| candidatesTokenCount | Integer | Total number of tokens in all generated candidate responses |
| totalTokenCount | Integer | Total number of tokens for the generation request |
| promptTokensDetails | Array | Per-modality token counts for the request input |
| candidatesTokensDetails | Array | Per-modality token counts for the response |
promptTokensDetails and candidatesTokensDetails Object Properties:
| Property | Type | Description |
|---|---|---|
| modality | Enum | Modality associated with this token count (TEXT, IMAGE, etc.) |
| tokenCount | Integer | Number of tokens |
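The per-modality details can be folded into a simple lookup for logging or cost tracking. A short Python sketch, using the usageMetadata values from the create image response example; tokens_by_modality is a hypothetical helper name.

```python
def tokens_by_modality(details: list) -> dict:
    """Sum tokenCount per modality from a *TokensDetails array."""
    totals = {}
    for entry in details:
        modality = entry["modality"]
        totals[modality] = totals.get(modality, 0) + entry["tokenCount"]
    return totals


# usageMetadata copied from the create image response example.
usage = {
    "promptTokenCount": 9,
    "candidatesTokenCount": 1298,
    "totalTokenCount": 1307,
    "promptTokensDetails": [{"modality": "TEXT", "tokenCount": 9}],
    "candidatesTokensDetails": [{"modality": "IMAGE", "tokenCount": 1290}],
}

prompt_tokens = tokens_by_modality(usage.get("promptTokensDetails", []))
output_tokens = tokens_by_modality(usage.get("candidatesTokensDetails", []))
```

In that example the image accounts for 1290 of the 1298 candidate tokens, so the listed details need not add up to candidatesTokenCount.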
modelVersion¶
- Type: String
- Description: Model version used to generate the response
responseId¶
- Type: String
- Description: ID used to identify each response
promptFeedback (Optional)¶
- Type: Object
- Description: Prompt feedback related to content filters
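When a prompt is filtered, it helps to check promptFeedback before looking for candidates. The sketch below assumes the object carries a blockReason field, as in the upstream Gemini API; this page does not list the object's properties, so treat that field name as an assumption and verify it against a real response.

```python
from typing import Optional


def prompt_block_reason(response: dict) -> Optional[str]:
    """Return the block reason if the prompt was filtered, else None."""
    feedback = response.get("promptFeedback") or {}
    # blockReason is assumed from the upstream Gemini API, not from this page.
    return feedback.get("blockReason")


blocked = prompt_block_reason({"promptFeedback": {"blockReason": "SAFETY"}})
not_blocked = prompt_block_reason({"candidates": []})
```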
Image Object Example¶
🌟 Best Practices¶
Prompt Writing Suggestions¶
- Use clear and specific descriptions
- Specify important visual details
- Describe the desired artistic style and atmosphere
- Describe the composition and perspective
- Include details such as color, lighting, and mood
Parameter Selection Suggestions¶
- Model Selection
    - gemini-2.5-flash-image: Suitable for quickly generating high-quality images
    - gemini-3-pro-preview: Supports image generation and editing, suitable for scenarios requiring editing based on existing images
- Prompt Optimization
    - Use detailed and descriptive text
    - Include specific visual elements and style requirements
    - Avoid vague or overly brief descriptions
    - When editing images, clearly describe how you want to modify the input image (add, remove, replace elements, etc.)
- Safety Settings
    - Adjust safety thresholds according to application scenarios
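This page does not document how safety thresholds are passed. In the upstream Gemini API they go in a top-level safetySettings array of category and threshold pairs, and the sketch below assumes this endpoint forwards that field unchanged, which should be verified before relying on it. The helper name with_safety_settings is hypothetical.

```python
# Category and threshold names follow the upstream Gemini API; whether this
# endpoint honors them is an assumption.
HARM_CATEGORIES = [
    "HARM_CATEGORY_HARASSMENT",
    "HARM_CATEGORY_HATE_SPEECH",
    "HARM_CATEGORY_SEXUALLY_EXPLICIT",
    "HARM_CATEGORY_DANGEROUS_CONTENT",
]


def with_safety_settings(body: dict, threshold: str = "BLOCK_ONLY_HIGH") -> dict:
    """Return a copy of the request body with one threshold applied to every category."""
    updated = dict(body)  # shallow copy so the caller's body is left untouched
    updated["safetySettings"] = [
        {"category": category, "threshold": threshold} for category in HARM_CATEGORIES
    ]
    return updated


base_body = {"contents": [{"parts": [{"text": "Give me an image of a cat"}]}]}
request_body = with_safety_settings(base_body, threshold="BLOCK_MEDIUM_AND_ABOVE")
```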
Common Issues¶
- Image Generation Failure
    - Check if the prompt complies with content policies
    - Verify API key permissions
    - Confirm the request format is correct
- Results Don't Match Expectations
    - Optimize prompt descriptions to be more specific and detailed
    - Add more visual details and style descriptions
    - Try different description approaches
- Safety Filter Issues
    - Modify the prompt to avoid triggering safety filters