Concepts

This guide introduces the core concepts of the Zeta Alpha platform configuration. Understanding these concepts is essential for effectively configuring and managing your search and discovery environment.

Tenant

A tenant represents a logical separation of data and configuration within the Zeta Alpha platform. Think of a tenant as an isolated environment, typically one per use case, with its own settings and data space.

Each tenant can contain one or more indexes, with each index having its own configuration and field definitions. Beyond indexes, a tenant also manages settings that apply across the whole tenant, including:

  • Data storage locations and backends
  • External API keys and integrations
  • Authentication and access control
  • Visible indexes and user interface customization
  • Feature enablement and configuration
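The relationship between a tenant, its indexes, and its settings can be pictured with a small data model. This is a minimal sketch, not the platform's actual schema: the `Tenant` and `Index` classes, the `add_index` method, and the `visible_indexes` setting key are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Index:
    """Hypothetical stand-in for an index; only the name matters here."""
    name: str


@dataclass
class Tenant:
    """Illustrative tenant: owns its own indexes and its own settings."""
    name: str
    indexes: dict[str, Index] = field(default_factory=dict)
    settings: dict[str, object] = field(default_factory=dict)

    def add_index(self, index: Index) -> None:
        # Register the index under its name within this tenant only.
        self.indexes[index.name] = index


research = Tenant("research", settings={"visible_indexes": ["papers"]})
research.add_index(Index("papers"))

support = Tenant("support", settings={"visible_indexes": ["tickets"]})
support.add_index(Index("tickets"))

# Each tenant holds only its own indexes and settings.
print(sorted(research.indexes))  # ['papers']
print(sorted(support.indexes))   # ['tickets']
```

Keeping indexes and settings as per-tenant attributes is what gives the isolation described above: nothing registered on one tenant is visible from another.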

Index

An index is a searchable collection of documents with associated configurations that control how documents are displayed, searched, and filtered. The index configuration is where you define the schema of your documents, specifying which fields exist, their data types, and their behavior during indexing and retrieval.

Key aspects of an index include:

  • Field definitions (searchable, filterable, sortable properties)
  • Search relevance settings and boosting configurations
  • Display configurations for the user interface
  • Neural search and embedding settings
  • Storage and capacity configurations

An index aggregates data from one or more content sources, each potentially representing a different origin or type of document.
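To make the idea of field definitions concrete, the sketch below models a schema as a list of fields, each with a data type and searchable, filterable, and sortable flags. The `FieldDefinition` class, the `fields_with` helper, and the example field names are assumptions for illustration; the real index configuration format may look quite different.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldDefinition:
    """Hypothetical field definition: a name, a type, and behavior flags."""
    name: str
    data_type: str            # for example "text", "date", or "number"
    searchable: bool = False
    filterable: bool = False
    sortable: bool = False


def fields_with(fields, capability):
    """Return the names of fields whose given boolean flag is set."""
    return [f.name for f in fields if getattr(f, capability)]


paper_fields = [
    FieldDefinition("title", "text", searchable=True, sortable=True),
    FieldDefinition("abstract", "text", searchable=True),
    FieldDefinition("published", "date", filterable=True, sortable=True),
    FieldDefinition("source", "text", filterable=True),
]

print(fields_with(paper_fields, "searchable"))  # ['title', 'abstract']
print(fields_with(paper_fields, "filterable"))  # ['published', 'source']
print(fields_with(paper_fields, "sortable"))    # ['title', 'published']
```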

Content Source

A content source defines where documents originate and how they should be ingested into the system. Content sources act as the bridge between external data repositories and your searchable index.

Content sources can be configured using:

  • Built-in connectors: Pre-built integrations for popular platforms (e.g., Google Drive, Confluence, SharePoint, Slack, Teams, and GitHub)
  • Custom connectors: User-defined integrations for proprietary or specialized data sources

Each content source can reference a workflow, which specifies the processing pipeline that transforms raw documents into searchable, enriched content.
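As a rough illustration of how content sources tie a connector, a target index, and an optional workflow together, consider the sketch below. The `ContentSource` class, the connector identifiers, and the workflow name are hypothetical and are not taken from the platform's configuration schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ContentSource:
    """Hypothetical content source: where documents come from and where they go."""
    name: str
    connector: str                  # illustrative connector identifier
    index: str                      # index that receives the documents
    workflow: Optional[str] = None  # optional processing pipeline name


SOURCES = [
    ContentSource("team-drive", "google-drive", "internal-docs", "default-text"),
    ContentSource("wiki", "confluence", "internal-docs", "default-text"),
    # A custom connector for a proprietary source, with no workflow referenced.
    ContentSource("lab-archive", "custom-lab-archive", "papers"),
]


def sources_for_index(sources, index_name):
    """List the content sources that feed a given index."""
    return [s.name for s in sources if s.index == index_name]


print(sources_for_index(SOURCES, "internal-docs"))  # ['team-drive', 'wiki']
print(sources_for_index(SOURCES, "papers"))         # ['lab-archive']
```

Here two sources feed the same index, which mirrors the point that an index can aggregate data from several origins.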

Workflow

A workflow defines the sequence of processing steps applied to documents as they move from ingestion to indexing. Workflows orchestrate the transformation of raw data into searchable, structured documents with enriched metadata.

A typical workflow might include steps such as:

  • Text extraction from various file formats
  • Content chunking and segmentation
  • Metadata extraction and enrichment
  • Vector embedding generation for neural search
  • Custom processing and transformations

By configuring workflows, you control exactly how your documents are processed and what information is extracted and indexed.
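One way to picture a workflow is as an ordered list of functions, each taking a document and returning an enriched copy. The sketch below is a toy model under that assumption: the step names, the dictionary-based document, and especially the `embed` step (which produces a two-number placeholder rather than a real embedding) are illustrative and do not reflect the platform's actual processing steps.

```python
def extract_text(doc):
    # Toy extraction: decode raw bytes into text.
    return {**doc, "text": doc["raw"].decode("utf-8")}


def chunk(doc, size=5):
    # Split the text into chunks of at most `size` words.
    words = doc["text"].split()
    chunks = [" ".join(words[i:i + size]) for i in range(0, len(words), size)]
    return {**doc, "chunks": chunks}


def enrich_metadata(doc):
    # Add a simple derived metadata field.
    return {**doc, "word_count": len(doc["text"].split())}


def embed(doc):
    # Placeholder "embedding": [character count, word count] per chunk.
    # A real pipeline would call an embedding model here.
    return {**doc, "vectors": [[len(c), len(c.split())] for c in doc["chunks"]]}


def run_workflow(doc, steps):
    """Apply each step in order; every step returns a new document dict."""
    for step in steps:
        doc = step(doc)
    return doc


WORKFLOW = [extract_text, chunk, enrich_metadata, embed]

raw = {"id": "doc-1", "raw": b"one two three four five six seven"}
result = run_workflow(raw, WORKFLOW)

print(result["chunks"])      # ['one two three four five', 'six seven']
print(result["word_count"])  # 7
print(result["vectors"])     # [[23, 5], [9, 2]]
```

Because each step returns a new dictionary, reordering, removing, or adding a custom step only means editing the `WORKFLOW` list.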

Document

A document is the fundamental unit of information in the Zeta Alpha platform. Each document consists of multiple fields, each with its own data type (text, date, number, and so on) and configuration.

Once a document is indexed, it becomes:

  • Searchable: Users can find it through keyword and semantic search
  • Filterable: Users can narrow results using field-based filters
  • Retrievable: All configured fields are available in search results
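The three properties above can be illustrated with plain Python over a handful of in-memory documents. This is only a sketch: `keyword_search` and `apply_filter` are hypothetical helpers, the matching is a naive substring test, and semantic search is not modeled at all.

```python
from datetime import date

documents = [
    {"id": "d1", "title": "Neural search basics", "published": date(2023, 5, 1)},
    {"id": "d2", "title": "Keyword search tuning", "published": date(2021, 3, 15)},
    {"id": "d3", "title": "Chunking strategies", "published": date(2024, 1, 10)},
]


def keyword_search(docs, query, field="title"):
    """Naive case-insensitive substring match on one text field."""
    q = query.lower()
    return [d for d in docs if q in d[field].lower()]


def apply_filter(docs, field, predicate):
    """Keep only documents whose field value satisfies the predicate."""
    return [d for d in docs if predicate(d[field])]


hits = keyword_search(documents, "search")
recent = apply_filter(hits, "published", lambda p: p >= date(2023, 1, 1))

print([d["id"] for d in hits])    # ['d1', 'd2']
print([d["id"] for d in recent])  # ['d1']
print(recent[0]["title"])         # Neural search basics
```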

Documents can have one or more enhancements attached to them, providing additional layers of information beyond the core document data.

Enhancements

Enhancements are supplementary pieces of information attached to documents to improve search quality, relevance, and user experience. Unlike the core document fields, enhancements are typically computed or aggregated after initial ingestion.

Examples of enhancements include:

  • Author profiles: Detailed information about document creators
  • Popularity metrics: Citation counts, view counts, or engagement scores
  • Related content: Connections to similar or related documents
  • Tags and classifications: Automatically generated or curated labels
  • Extracted entities: People, organizations, locations mentioned in the content

Enhancements allow you to layer rich contextual information onto your documents without modifying the original document structure.
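A simple way to think about this layering is a separate store keyed by document ID, merged with the document only when it is read. The sketch below assumes that model; the `attach` and `view` functions and the enhancement names are hypothetical, and the platform may store enhancements differently.

```python
# Core document as ingested; it is never mutated below.
document = {"id": "d1", "title": "Neural search basics", "author": "A. Writer"}

# Separate enhancement store: document id -> {enhancement kind -> value}.
enhancements = {}


def attach(doc_id, kind, value):
    """Attach or update one enhancement for a document."""
    enhancements.setdefault(doc_id, {})[kind] = value


def view(doc):
    """Return a merged copy: core fields plus any attached enhancements."""
    return {**doc, "enhancements": dict(enhancements.get(doc["id"], {}))}


attach("d1", "popularity", {"views": 120, "citations": 4})
attach("d1", "tags", ["search", "embeddings"])

merged = view(document)
print(sorted(merged["enhancements"]))  # ['popularity', 'tags']
print("enhancements" in document)      # False
```

The original document dictionary stays untouched, so enhancements can be recomputed or dropped without re-ingesting the document.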

Ingestion Jobs

An ingestion job is a task that imports documents from a content source into an index. Ingestion jobs serve as the mechanism for populating and updating your searchable content.

Ingestion jobs can be:

  • Scheduled: Automatically triggered at regular intervals to keep content synchronized
  • Manual: Initiated on-demand when you need immediate updates
  • Incremental: Processing only new or modified documents
  • Full: Re-processing all documents from the source

When an ingestion job runs, it follows the content source configuration to communicate with the appropriate connector service, retrieve documents, apply the configured workflow, and index the resulting data.
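The difference between incremental and full jobs can be sketched with a small function that compares a modification marker before re-processing. Everything here is an assumption for illustration: `run_ingestion_job`, the `modified` field, the dictionary standing in for an index, and the trivial `uppercase_title` workflow are not part of the platform's API.

```python
def run_ingestion_job(source_docs, index, workflow, mode="incremental"):
    """Process source documents into `index` and return the processed ids.

    `index` maps document id -> {"modified": marker, "doc": processed doc}.
    """
    if mode not in ("incremental", "full"):
        raise ValueError(f"unknown mode: {mode}")
    processed = []
    for doc in source_docs:
        existing = index.get(doc["id"])
        unchanged = existing is not None and existing["modified"] == doc["modified"]
        if mode == "incremental" and unchanged:
            continue  # skip documents that have not changed since last run
        index[doc["id"]] = {"modified": doc["modified"], "doc": workflow(doc)}
        processed.append(doc["id"])
    return processed


def uppercase_title(doc):
    # Stand-in for a configured workflow.
    return {**doc, "title": doc["title"].upper()}


index = {}
source = [
    {"id": "a", "modified": 1, "title": "alpha"},
    {"id": "b", "modified": 1, "title": "beta"},
]
first = run_ingestion_job(source, index, uppercase_title, mode="full")

# One document changes and one new document appears at the source.
source[1] = {"id": "b", "modified": 2, "title": "beta v2"}
source.append({"id": "c", "modified": 1, "title": "gamma"})

second = run_ingestion_job(source, index, uppercase_title)               # incremental
third = run_ingestion_job(source, index, uppercase_title, mode="full")   # full

print(first)   # ['a', 'b']
print(second)  # ['b', 'c']
print(third)   # ['a', 'b', 'c']
```

Whether a job is scheduled or manual only changes what triggers the call; the incremental or full mode decides how much work each run does.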