
ImageContent

The Image class handles image data for multimodal AI interactions.

Import

from openstackai.multimodal import Image

Creating Images

From File

image = Image.from_file("photo.jpg")

From URL

image = Image.from_url("https://example.com/image.png")

From Bytes

with open("image.png", "rb") as f:
    image = Image.from_bytes(f.read(), media_type="image/png")
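When the bytes come from an untrusted or unlabeled source, the media_type passed to from_bytes may be unknown. The sketch below guesses it from the leading magic bytes of common formats. The helper name sniff_media_type is hypothetical and not part of openstackai; whether the library performs a similar check internally is not stated here.

```python
from typing import Optional


def sniff_media_type(data: bytes) -> Optional[str]:
    """Guess an image MIME type from the file's leading magic bytes.

    Hypothetical helper for illustration; returns None if unrecognized.
    """
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if data.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if data[:6] in (b"GIF87a", b"GIF89a"):
        return "image/gif"
    # WebP is a RIFF container whose form type at offset 8 is "WEBP".
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "image/webp"
    if data.startswith(b"BM"):
        return "image/bmp"
    return None
```

The result could then be passed as media_type, falling back to an explicit value when the helper returns None.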

From Base64

image = Image.from_base64(
    base64_string,
    media_type="image/jpeg"
)

From PIL Image

from PIL import Image as PILImage

pil_img = PILImage.open("photo.jpg")
image = Image.from_pil(pil_img)

Properties

| Property | Type | Description |
|----------|------|-------------|
| width | int | Image width in pixels |
| height | int | Image height in pixels |
| media_type | str | MIME type (image/jpeg, etc.) |
| size_bytes | int | File size in bytes |
| format | str | Image format (png, jpeg, etc.) |
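To make the width, height, and size_bytes properties concrete, here is a minimal sketch of how such values can be read from raw PNG data using only the standard library. It relies on the PNG layout (an 8-byte signature followed by the IHDR chunk) and is not a description of how openstackai computes these properties; png_dimensions is a hypothetical name.

```python
import struct
from typing import Tuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data: bytes) -> Tuple[int, int]:
    """Return (width, height) read from a PNG's IHDR chunk."""
    # Layout: 8-byte signature, 4-byte chunk length, 4-byte chunk type,
    # then width and height as big-endian unsigned 32-bit integers.
    if len(data) < 24 or not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    if data[12:16] != b"IHDR":
        raise ValueError("missing IHDR chunk")
    width, height = struct.unpack(">II", data[16:24])
    return width, height


# Build just enough of a PNG header to demonstrate the parsing.
header = PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR" + struct.pack(">II", 640, 480)
width, height = png_dimensions(header)
size_bytes = len(header)  # size in bytes is simply the length of the data
```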

Methods

resize()

Resize image while maintaining aspect ratio:

# Resize to max dimensions
resized = image.resize(max_width=1024, max_height=1024)

# Resize to specific size
resized = image.resize(width=800, height=600)

# Scale by a factor
resized = image.resize(scale=0.5)  # 50% of the original size
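The max_width/max_height form implies an aspect-ratio-preserving fit. The arithmetic behind such a fit can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: fit_within is a hypothetical helper, and the choice never to upscale smaller images is an assumption.

```python
from typing import Tuple


def fit_within(width: int, height: int, max_width: int, max_height: int) -> Tuple[int, int]:
    """Compute the largest size that fits the bounds and keeps the aspect ratio.

    Assumption: images already inside the bounds are left unchanged (no upscaling).
    """
    # The tighter of the two constraints decides the scale factor.
    scale = min(max_width / width, max_height / height, 1.0)
    # Round to whole pixels and never collapse a dimension to zero.
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(4000, 3000, 1024, 1024))  # (1024, 768)
print(fit_within(800, 600, 1024, 1024))    # (800, 600)
```

Using one shared scale factor for both dimensions is what keeps the ratio intact; scaling each axis to its own maximum would distort the image.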

convert()

Convert to a different format:

# Convert to JPEG
jpeg_image = image.convert(format="jpeg", quality=85)

# Convert to PNG
png_image = image.convert(format="png")

# Convert to WebP
webp_image = image.convert(format="webp", quality=80)

crop()

Crop a region of the image:

cropped = image.crop(
    left=100,
    top=50,
    width=400,
    height=300
)
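crop() takes an origin plus a size, whereas Pillow's Image.crop expects a (left, upper, right, lower) box. A sketch of the conversion, with clamping to the image bounds, might look like this. The helper crop_box is hypothetical, and the clamping behavior is an assumption; the source does not say how out-of-bounds regions are handled.

```python
from typing import Tuple


def crop_box(left: int, top: int, width: int, height: int,
             image_width: int, image_height: int) -> Tuple[int, int, int, int]:
    """Convert an origin-plus-size region to a (left, upper, right, lower) box.

    Assumption: regions extending past the image edge are clamped to it.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    right = min(left + width, image_width)
    lower = min(top + height, image_height)
    left = max(left, 0)
    top = max(top, 0)
    if left >= right or top >= lower:
        raise ValueError("crop region lies outside the image")
    return left, top, right, lower


print(crop_box(100, 50, 400, 300, 1920, 1080))  # (100, 50, 500, 350)
```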

to_base64()

Get the image as a base64-encoded string:

b64_string = image.to_base64()
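For readers unfamiliar with the encoding, the round trip between raw bytes and a base64 string can be sketched with the standard library alone. This only illustrates what a base64 string is; whether to_base64() returns a bare string like this or adds a prefix is not specified here, and the function names are hypothetical.

```python
import base64


def encode_image_bytes(data: bytes) -> str:
    """Encode raw image bytes as an ASCII base64 string."""
    return base64.b64encode(data).decode("ascii")


def decode_image_bytes(b64_string: str) -> bytes:
    """Decode a base64 string back to bytes, rejecting invalid characters."""
    return base64.b64decode(b64_string, validate=True)


raw = b"\x89PNG\r\n\x1a\n"          # the first bytes of any PNG file
encoded = encode_image_bytes(raw)
assert decode_image_bytes(encoded) == raw  # the round trip is lossless
```

Base64 output is roughly one third larger than the input, which matters when a provider limits request size.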

save()

Save to file:

image.save("output.png")
image.save("output.jpg", format="jpeg", quality=90)

Using with Agents

Single Image

from openstackai import ask
from openstackai.multimodal import Image

image = Image.from_file("chart.png")
response = ask("Explain this chart", images=[image])

Multiple Images

images = [
    Image.from_file("img1.jpg"),
    Image.from_file("img2.jpg"),
    Image.from_file("img3.jpg")
]

response = ask(
    "What do these images have in common?",
    images=images
)

With Agent

from openstackai import Agent
from openstackai.multimodal import Image

agent = Agent(
    name="Analyst",
    model="gpt-4o"  # Vision model
)

image = Image.from_url("https://example.com/data.png")
result = agent.run("Analyze this data visualization", images=[image])

Format Support

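| Format | Read | Write | Notes |
|--------|------|-------|-------|
| JPEG | | | Most efficient for photos |
| PNG | | | Best for graphics/screenshots |
| GIF | | | Animated GIFs supported |
| WebP | | | Good compression |
| BMP | | | Uncompressed |
| TIFF | | | High quality |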

Provider Formats

Different providers accept different formats:

# OpenAI format
openai_content = image.to_openai_format()

# Anthropic format
anthropic_content = image.to_anthropic_format()

# Select the format by provider name (used internally)
provider_content = image.to_provider_format(provider="openai")
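To show why a per-provider conversion is needed, the sketch below builds image content parts in the shapes used by the public OpenAI Chat Completions and Anthropic Messages APIs. The function names are hypothetical, and the exact structures returned by to_openai_format() and to_anthropic_format() are not shown in this page, so treat this as an approximation of what such methods likely produce.

```python
import base64
from typing import Any, Dict


def openai_image_part(data: bytes, media_type: str) -> Dict[str, Any]:
    """Build an OpenAI-style content part: the image travels as a data URL."""
    encoded = base64.b64encode(data).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{media_type};base64,{encoded}"},
    }


def anthropic_image_part(data: bytes, media_type: str) -> Dict[str, Any]:
    """Build an Anthropic-style content part: media type and data are separate fields."""
    encoded = base64.b64encode(data).decode("ascii")
    return {
        "type": "image",
        "source": {"type": "base64", "media_type": media_type, "data": encoded},
    }
```

Both carry the same base64 payload; only the envelope differs, which is why a single Image object can be converted on demand for whichever provider is in use.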

See Also