This component generates vector embeddings from images. It converts visual content into numerical representations that capture semantic meaning, enabling similarity search, clustering, and other operations on image data.
Inputs
- Images – Image files to convert to embeddings
- Documents – Document objects containing images
Outputs
- Vectors – Generated vector embeddings
- Documents – Original documents with embeddings attached
Configuration
Model Settings
- Model – Image embedding model to use
- Default – "clip-vit-base-patch32"
- Available models – CLIP, ResNet, EfficientNet
- Custom model – User-defined embedding engine, not configured in this view.
- Google – 16×16 – Fast, accurate, general-purpose embeddings.
- OpenAI – 16×16 – Good performance with lower memory usage.
- OpenAI – 32×32 – Lower performance, better image recognition.
- Dimensions – Vector dimensions
- Default – 512
- Note – Model-dependent (e.g., 512 for the base CLIP model, 768 for the large variant)
- Batch Size – Number of images to embed at once
- Default – 16
- Note – Larger batches process faster but use more memory
- Normalize – Normalize vector lengths
- Default – true
- Note – Improves similarity calculations (see the sketch after this list)
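Normalization matters because downstream similarity search typically uses cosine similarity. A minimal NumPy sketch (the vectors here are random stand-ins for real embeddings) showing that once vectors are L2-normalized, cosine similarity reduces to a plain dot product:

```python
import numpy as np

def l2_normalize(v: np.ndarray) -> np.ndarray:
    """Scale a vector to unit length."""
    return v / np.linalg.norm(v)

# Random stand-ins for two 512-dimensional image embeddings.
a = np.random.rand(512)
b = np.random.rand(512)

# With unit-length vectors, cosine similarity is just a dot product.
cosine = float(np.dot(l2_normalize(a), l2_normalize(b)))
print(f"cosine similarity: {cosine:.4f}")
```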
Image Processing
- Resize – Resize images before embedding
- Default – true
- Note – Ensures consistent input size
- Target Size – Target image dimensions
- Default – [224, 224]
- Note – Width and height in pixels
- Center Crop – Apply center cropping
- Default – true
- Note – Preserves the aspect ratio by cropping instead of stretching (see the preprocessing sketch after this list)
- Color Mode – Color processing mode
- Default – "RGB"
- Options – RGB, grayscale
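The image-processing settings above correspond to a standard preprocessing pipeline. Here is a sketch using Pillow; the scale-then-crop logic is an illustrative assumption about common practice, not necessarily what this component does internally:

```python
from PIL import Image

def preprocess(path: str, target=(224, 224), center_crop=True, color_mode="RGB"):
    # Pillow uses "L" for grayscale; "RGB" matches the default above.
    img = Image.open(path).convert(color_mode)
    if center_crop:
        # Scale the short side up to the target, then crop the middle,
        # so content is never stretched out of proportion.
        scale = max(target[0] / img.width, target[1] / img.height)
        img = img.resize((round(img.width * scale), round(img.height * scale)))
        left = (img.width - target[0]) // 2
        top = (img.height - target[1]) // 2
        img = img.crop((left, top, left + target[0], top + target[1]))
    else:
        img = img.resize(target)
    return img

# preprocess("photo.jpg") yields a 224x224 RGB image ready for embedding.
```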
Advanced Settings
- Cache – Cache embeddings for reuse
- Default – true
- Note – Improves performance for repeated images
- Device – Processing device
- Default – "auto"
- Options – auto, cpu, cuda (see the selection sketch after this list)
- Precision – Computation precision
- Default – "float32"
- Options – float32, float16, bfloat16
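At runtime, "auto" device selection and the precision option typically resolve as in the PyTorch sketch below; the mapping logic is an assumption about common practice, not this component's exact code:

```python
import torch

def resolve_device(setting: str = "auto") -> torch.device:
    """Map "auto" to CUDA when a GPU is available, otherwise CPU."""
    if setting == "auto":
        return torch.device("cuda" if torch.cuda.is_available() else "cpu")
    return torch.device(setting)

# The precision setting maps onto a torch dtype.
PRECISION = {
    "float32": torch.float32,
    "float16": torch.float16,
    "bfloat16": torch.bfloat16,
}

device = resolve_device("auto")
dtype = PRECISION["float16"]
# A model would then be moved with: model.to(device=device, dtype=dtype)
```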
Basic Image Embedding
This example shows a baseline configuration for the Image Embedding component:
```json
{
"model": "clip-vit-base-patch32",
"dimensions": 512,
"batchSize": 16,
"normalize": true,
"resize": true,
"targetSize": [224, 224],
"centerCrop": true,
"colorMode": "RGB"
}
```
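Every value in this configuration matches the component defaults listed above, so it is a safe starting point to tune from.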
High-Performance Image Embedding
For high-performance image embedding with GPU acceleration:
```json
{
"model": "clip-vit-large-patch14",
"dimensions": 768,
"batchSize": 32,
"normalize": true,
"resize": true,
"targetSize": [336, 336],
"centerCrop": true,
"colorMode": "RGB",
"cache": true,
"device": "cuda",
"precision": "float16"
}
```
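Note that dimensions rises to 768 here because vector size is model-dependent: the larger clip-vit-large-patch14 model produces 768-dimensional embeddings.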
Best Practices
Model Selection
- Use CLIP models for general-purpose image embeddings
- Use ResNet models for traditional computer vision tasks
- Use EfficientNet for resource-constrained environments
Image Preparation
- Ensure consistent image sizes through resizing
- Consider image quality and resolution
- Use center cropping to maintain important visual elements
- Preprocess images to remove noise or irrelevant content
Performance Optimization
- Adjust batch size based on available memory (a batching sketch follows this list)
- Use GPU acceleration when available
- Enable caching for repeated processing of the same images
- Use lower precision (float16) for faster processing with minimal quality loss
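To make the batch-size guidance concrete, here is a simple batching loop; embed_batch is a hypothetical stand-in for the actual model call, and halving batch_size roughly halves peak memory per step:

```python
from typing import List

def embed_batch(batch: List[str]) -> list:
    # Hypothetical placeholder for the real model inference call.
    return [[0.0] * 512 for _ in batch]

def embed_in_batches(images: List[str], batch_size: int = 16) -> list:
    """Embed images in fixed-size batches to bound peak memory."""
    embeddings = []
    for start in range(0, len(images), batch_size):
        embeddings.extend(embed_batch(images[start:start + batch_size]))
    return embeddings

# On an out-of-memory error, retry with a smaller batch, e.g.
# embed_in_batches(paths, batch_size=8)
```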
Troubleshooting
Processing Problems
- Out of memory errors – Reduce batch size or image dimensions
- Slow processing – Enable GPU acceleration or reduce image size
- Poor embedding quality – Try different models or image preprocessing
Compatibility Issues
- Model loading errors – Verify model availability and compatibility
- Device errors – Check CUDA installation for GPU acceleration
- Format errors – Ensure images are in supported formats (JPEG, PNG, etc.); a validation sketch follows
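One way to pre-screen inputs for format errors is to let Pillow attempt to identify each file before embedding; this is a general-purpose sketch, not a feature of the component:

```python
from PIL import Image, UnidentifiedImageError

def is_supported_image(path: str) -> bool:
    """Return True if Pillow can identify and verify the file as an image."""
    try:
        with Image.open(path) as img:
            img.verify()  # Integrity check without decoding the full image
        return True
    except (UnidentifiedImageError, OSError):
        return False

# Usage: filter candidate files before embedding.
# valid = [p for p in paths if is_supported_image(p)]
```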
Technical Reference
For detailed technical information, refer to:
- Image Embedding Source Code located in ../../../aparavi-connectors/connectors/embedding_image/image.py