Overview
Embed is a feature that generates embeddings from Corvic Tables, creating multi-modal and multi-structural embedding spaces. This feature supports semantic, tabular, relational, and multi-media (images, videos, etc.) embeddings, enabling you to create rich vector representations of your data for use in agents, search, and analysis workflows.
Category
Corvic Tables - This feature is designed to work with Corvic Tables, enabling you to generate embeddings from structured data for use in downstream AI workflows.
Input
Corvic Table - The Embed feature accepts only a Corvic Table as input. Select an existing Corvic Table from your data room that contains the data you want to embed.
You can create Corvic Tables from data sources using features such as Sanitize Parquet for structured data, Multi-modal Knowledge Extraction for unstructured data, or Augment for enhanced data.
Output
Space - The Embed feature produces an embedding vector store (Space) containing multi-modal and multi-structural embeddings. The output Space includes:
- Semantic Embeddings: Text-based semantic representations
- Tabular Embeddings: Structured data encodings
- Relational Embeddings: Graph-based relationship representations
- Multi-media Embeddings: Image and video content embeddings
The output Space serves as a vector store that agents can query for semantic search, similarity matching, and data retrieval.
Learn more: Spaces - Detailed guide on embedding spaces and how to generate embeddings
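To make the idea of similarity matching over a vector store concrete, here is a minimal sketch in plain Python. It is not Corvic's implementation or API: the `toy_space` dictionary, the `cosine_similarity` and `top_k` helpers, and the vector values are all hypothetical, and cosine similarity is assumed only because it is a common choice for comparing embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query, store, k=2):
    """Return the k row ids whose vectors are most similar to the query."""
    scored = [(cosine_similarity(query, vec), row_id) for row_id, vec in store.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [row_id for _, row_id in scored[:k]]


# Hypothetical stand-in for a Space: row id -> embedding vector (made-up values).
toy_space = {
    "row_1": [1.0, 0.0, 0.0],
    "row_2": [0.9, 0.1, 0.0],
    "row_3": [0.0, 1.0, 0.0],
}

print(top_k([1.0, 0.0, 0.0], toy_space, k=2))  # → ['row_1', 'row_2']
```

A real Space holds far more vectors and would typically rely on an index rather than a full scan, but the ranking idea is the same.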
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| input | string | Yes | The Corvic Table to generate embeddings from. Select a Corvic Table from your data room that contains the data you want to embed. |
| output_name | string | No | Optional custom name for the output Space. If not provided, a default name will be automatically generated based on the input Corvic Table name and embedding types. |
| description | string | No | Optional description of the embedding space. Provides context about the purpose and contents of the generated embeddings. |
| embedding_type | selection | Yes | Type of embedding to generate. Select one or more: text_semantic, tabular_encoding, relational, or image_semantic. You can configure multiple embedding types for the same Corvic Table. |
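The top-level parameters above can be sketched as a small Python structure with a pre-flight check. This is an illustration only: `EmbedConfig` and `validate_top_level` are hypothetical names, not part of any Corvic SDK, and the checks simply restate the Required column and the four listed `embedding_type` values.

```python
from dataclasses import dataclass
from typing import List, Optional

# The four embedding types listed in the parameter table.
ALLOWED_EMBEDDING_TYPES = {
    "text_semantic",
    "tabular_encoding",
    "relational",
    "image_semantic",
}


@dataclass
class EmbedConfig:
    """Hypothetical container mirroring the top-level Embed parameters."""
    input: str                         # name of the input Corvic Table
    embedding_type: List[str]          # one or more embedding types
    output_name: Optional[str] = None  # optional name for the output Space
    description: Optional[str] = None  # optional description of the Space


def validate_top_level(cfg):
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []
    if not cfg.input:
        errors.append("input is required")
    if not cfg.embedding_type:
        errors.append("embedding_type needs at least one value")
    unknown = sorted(set(cfg.embedding_type) - ALLOWED_EMBEDDING_TYPES)
    if unknown:
        errors.append("unknown embedding_type: " + ", ".join(unknown))
    return errors


cfg = EmbedConfig(input="products", embedding_type=["text_semantic", "relational"])
print(validate_top_level(cfg))  # → []
```

Returning a list of problems instead of raising on the first one lets a caller report every issue in a single pass.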
Text Semantic Parameters
When embedding_type includes text_semantic, configure the following parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Text embedding model selection. Choose from available models or use “bring your own” if your admin has added custom model endpoints. Options include OpenAI embeddings, sentence transformers, and custom models configured by administrators. |
| columns_to_embed | array | Yes | Columns to embed. Select which text columns from the input Corvic Table should be embedded. You can change column types or rearrange columns as required before embedding. |
Tabular Encoding Parameters
When embedding_type includes tabular_encoding, configure the following parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| columns_to_embed | array | Yes | Columns to embed. Select which columns from the input Corvic Table should be encoded. You can change column types or rearrange columns as required before embedding. |
Relational Parameters
When embedding_type includes relational, configure the following parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| graph_embedding_model | string | Yes | Graph embedding model selection. Choose from: graph_structure_embed for graph structure embeddings or graph_neural_network for graph neural network-based embeddings. |
Image Semantic Parameters
When embedding_type includes image_semantic, configure the following parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Image embedding model selection. Choose from available models: clip, sigclip2, or use “bring your own” if your admin has added custom image model endpoints. |
| columns_to_embed | array | Yes | Columns to embed. Select columns containing multi-media content such as images. Specify which columns from the input Corvic Table contain image data that should be embedded. |
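Because each embedding type has its own required parameters, it can help to check a configuration before running it. The sketch below restates the four type-specific tables as a lookup and reports what is missing. `REQUIRED_BY_TYPE`, `missing_parameters`, the `settings` layout, and the example model and column names are hypothetical and do not come from a Corvic API.

```python
# Required parameters per embedding type, as listed in the tables above.
REQUIRED_BY_TYPE = {
    "text_semantic": ["model", "columns_to_embed"],
    "tabular_encoding": ["columns_to_embed"],
    "relational": ["graph_embedding_model"],
    "image_semantic": ["model", "columns_to_embed"],
}


def missing_parameters(settings):
    """Map each embedding type to the required parameters it lacks.

    `settings` is a dict of embedding type -> dict of parameter values.
    Empty values (None, "", []) count as missing.
    """
    problems = {}
    for emb_type, params in settings.items():
        required = REQUIRED_BY_TYPE.get(emb_type)
        if required is None:
            problems[emb_type] = ["<unknown embedding type>"]
            continue
        missing = [name for name in required if not params.get(name)]
        if missing:
            problems[emb_type] = missing
    return problems


# Example values are placeholders, not real model or column names.
settings = {
    "text_semantic": {"model": "my-text-model", "columns_to_embed": ["title"]},
    "relational": {},
    "image_semantic": {"model": "clip", "columns_to_embed": []},
}

print(missing_parameters(settings))
# → {'relational': ['graph_embedding_model'], 'image_semantic': ['columns_to_embed']}
```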
Usage Example
To use Embed in a Data App:
- Add your Corvic Table to the Data App canvas
- Click the “+” button next to the Corvic Table
- Select “Embed” from the actions menu
- Select the input Corvic Table (if not already selected)
- Choose one or more embedding types (Text Semantic, Tabular Encoding, Relational, Image Semantic)
- Configure the parameters for each selected embedding type:
  - Text Semantic: Select model and columns to embed
  - Tabular Encoding: Select columns to embed
  - Relational: Select graph embedding model
  - Image Semantic: Select image model and columns with image content
- Optionally provide a name and description for the output Space
- Run the Data App to execute the embedding generation
- Review the generated Space containing multi-modal and multi-structural embeddings