Image Embedding

This component generates vector embeddings from images. It converts visual content into numerical representations that capture semantic meaning, enabling similarity search, clustering, and other operations on image data.
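Similarity search over embeddings usually means comparing vectors with cosine similarity. A minimal, illustrative sketch, using toy 2-D vectors in place of real 512-dimensional embeddings:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Embeddings of visually similar images point in similar directions.
same = cosine_similarity([1.0, 0.0], [1.0, 0.0])        # 1.0
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])  # 0.0
```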

Inputs

  • Images – Image files to convert to embeddings
  • Documents – Document objects containing images

Outputs

  • Vectors – Generated vector embeddings
  • Documents – Original documents with embeddings attached

Configuration

Model Settings

  • Model – Image embedding model to use
    • Default – “clip-vit-base-patch32”
    • Available models – CLIP, ResNet, EfficientNet
      • clip-vit-base-patch16 – Fast, accurate, general-purpose embeddings.
      • clip-vit-base-patch32 – Good performance with lower memory usage.
      • clip-vit-large-patch14 – Slower, but stronger image recognition.
      • Custom model – User-defined embedding engine, not configured in this view.
  • Dimensions – Vector dimensions
    • Default – 512
    • Note – Model dependent
  • Batch Size – Number of images to embed at once
    • Default – 16
    • Note – Affects memory usage
  • Normalize – Normalize vector lengths
    • Default – true
    • Note – Improves similarity calculations
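The Normalize setting scales each vector to unit length. With unit vectors, a plain dot product equals cosine similarity, which simplifies and speeds up comparisons. A sketch of what normalization does:

```python
import math

def normalize(vec):
    """Scale a vector so its L2 norm is 1."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

unit = normalize([3.0, 4.0])  # [0.6, 0.8], which has length 1.0
```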

Image Processing

  • Resize – Resize images before embedding
    • Default – true
    • Note – Ensures consistent input size
  • Target Size – Target image dimensions
    • Default – [224, 224]
    • Note – Width and height in pixels
  • Center Crop – Apply center cropping
    • Default – true
    • Note – Maintains aspect ratio
  • Color Mode – Color processing mode
    • Default – “RGB”
    • Options – RGB, grayscale
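Resize and Center Crop typically work together: a common pipeline (assumed here, not confirmed by this document) scales the shorter side to the target size and then takes a centered crop, avoiding distortion. The crop-box arithmetic is straightforward:

```python
def center_crop_box(width, height, target_w, target_h):
    """Return (left, top, right, bottom) of a crop centered in the image."""
    left = (width - target_w) // 2
    top = (height - target_h) // 2
    return (left, top, left + target_w, top + target_h)

# A 640x480 image cropped to 224x224:
box = center_crop_box(640, 480, 224, 224)  # (208, 128, 432, 352)
```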

Advanced Settings

  • Cache – Cache embeddings for reuse
    • Default – true
    • Note – Improves performance for repeated images
  • Device – Processing device
    • Default – “auto”
    • Options – auto, cpu, cuda
  • Precision – Computation precision
    • Default – “float32”
    • Options – float32, float16, bfloat16
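The Cache setting avoids re-embedding identical images. One common way to implement this (a sketch, not the component's actual code) is to key cached vectors by a hash of the raw image bytes:

```python
import hashlib

class EmbeddingCache:
    """Caches embeddings keyed by a SHA-256 hash of the raw image bytes."""

    def __init__(self):
        self._store = {}

    def get_or_compute(self, image_bytes, embed_fn):
        key = hashlib.sha256(image_bytes).hexdigest()
        if key not in self._store:        # first occurrence: compute and store
            self._store[key] = embed_fn(image_bytes)
        return self._store[key]           # repeats: served from the cache
```

Repeated images then cost one hash computation instead of one model forward pass.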

Basic Image Embedding

This example shows a basic configuration of the Image Embedding component:

```json
{
  "model": "clip-vit-base-patch32",
  "dimensions": 512,
  "batchSize": 16,
  "normalize": true,
  "resize": true,
  "targetSize": [224, 224],
  "centerCrop": true,
  "colorMode": "RGB"
}
```
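The batchSize setting controls how many images are embedded per model call. The chunking itself is simple; a minimal sketch:

```python
def batches(items, batch_size=16):
    """Yield successive chunks of at most batch_size items each."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

# 40 images with "batchSize": 16 -> chunks of 16, 16, and 8
sizes = [len(chunk) for chunk in batches(list(range(40)), 16)]
```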

High-Performance Image Embedding

For high-performance image embedding with GPU acceleration:

```json
{
  "model": "clip-vit-large-patch14",
  "dimensions": 768,
  "batchSize": 32,
  "normalize": true,
  "resize": true,
  "targetSize": [336, 336],
  "centerCrop": true,
  "colorMode": "RGB",
  "cache": true,
  "device": "cuda",
  "precision": "float16"
}
```
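The "auto" device setting presumably selects a GPU when one is available and falls back to the CPU otherwise. Assuming a PyTorch backend (an assumption, not confirmed by this document), the resolution logic might look like:

```python
def resolve_device(setting="auto"):
    """Map the configured device setting to a concrete device string."""
    if setting != "auto":
        return setting  # explicit "cpu" or "cuda" is passed through unchanged
    try:
        import torch  # assumed optional backend; not confirmed by this doc
        return "cuda" if torch.cuda.is_available() else "cpu"
    except ImportError:
        return "cpu"  # no GPU framework installed
```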

Best Practices

Model Selection

  • Use CLIP models for general-purpose image embeddings
  • Use ResNet models for traditional computer vision tasks
  • Use EfficientNet for resource-constrained environments

Image Preparation

  • Ensure consistent image sizes through resizing
  • Consider image quality and resolution
  • Use center cropping to maintain important visual elements
  • Preprocess images to remove noise or irrelevant content

Performance Optimization

  • Adjust batch size based on available memory
  • Use GPU acceleration when available
  • Enable caching for repeated processing of the same images
  • Use lower precision (float16) for faster processing with minimal quality loss

Troubleshooting

Processing Problems

  • Out of memory errors – Reduce batch size or image dimensions
  • Slow processing – Enable GPU acceleration or reduce image size
  • Poor embedding quality – Try different models or image preprocessing
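When tuning batch size after out-of-memory errors, a back-of-the-envelope estimate helps: the input batch alone occupies batch × height × width × channels × bytes-per-value, and model activations typically add much more on top. For example:

```python
def input_batch_bytes(batch_size, height, width, channels=3, bytes_per_value=4):
    """Bytes occupied by one input batch tensor (float32 = 4 bytes per value)."""
    return batch_size * height * width * channels * bytes_per_value

# 16 RGB images at 224x224 in float32:
mb = input_batch_bytes(16, 224, 224) / (1024 ** 2)  # about 9.2 MB, inputs only
```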

Compatibility Issues

  • Model loading errors – Verify model availability and compatibility
  • Device errors – Check CUDA installation for GPU acceleration
  • Format errors – Ensure images are in supported formats (JPEG, PNG, etc.)

Technical Reference

For detailed technical information, refer to:

  • Image Embedding Source Code located in ../../../aparavi-connectors/connectors/embedding_image/image.py