Content Source Custom Metadata
Content source custom metadata lets you attach static key-value pairs to every document ingested by a content source. This is useful when you need all documents from a source to carry the same metadata — for example, a visibility level, an owner, or an information classification — without setting up an enhancement connector.
The metadata is injected at ingestion time by the connector SDK. It works with all standard connectors (SharePoint, Confluence, S3, Google Drive, Slack, GitHub, etc.).
How It Works
- You add a custom_metadata object to the connector configuration of your content source.
- On every ingestion run, each crawled document receives the key-value pairs from custom_metadata.
- If a document already has a field with the same key (from the connector's crawled data), the document's value is kept: content-source-level metadata never overwrites document-level metadata.
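The precedence rule above can be illustrated with a minimal sketch. This is not the connector SDK's actual implementation; the function name apply_custom_metadata and the sample documents are hypothetical and only model the documented behavior (document-level values win on key conflicts).

```python
from typing import Any, Mapping


def apply_custom_metadata(
    document: Mapping[str, Any],
    custom_metadata: Mapping[str, Any],
) -> dict[str, Any]:
    """Return a copy of `document` with source-level metadata filled in.

    Hypothetical helper: keys already present on the document are kept,
    so content-source-level metadata never overwrites them.
    """
    merged = dict(custom_metadata)  # start from the source-level defaults
    merged.update(document)         # document-level values win on conflict
    return merged


# Illustrative data only: the crawled document already carries `visibility`.
crawled = {"title": "Release notes", "visibility": "PUBLIC"}
source_metadata = {"visibility": "INTERNAL", "displayed_owner": "Jane Doe"}

result = apply_custom_metadata(crawled, source_metadata)
print(result)
```

In this sketch the document keeps its own visibility value and only gains displayed_owner, which is the key it did not already have.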
Avoid using reserved document field names (title, authors, uri, created_at, last_updated_at, content_source_id, allow_access_rights, deny_access_rights, logo_url) as keys in custom_metadata. These names are treated as top-level document fields, so using them as metadata keys may produce unexpected results.
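One way to catch reserved keys before saving a configuration is a small client-side check. The sketch below is an assumption about how you might validate on your side, not a feature of the platform; find_reserved_keys is a hypothetical helper, and the reserved names are copied from the list above.

```python
from typing import Any, Mapping

# Reserved top-level document field names, as listed in this section.
RESERVED_FIELD_NAMES = frozenset({
    "title", "authors", "uri", "created_at", "last_updated_at",
    "content_source_id", "allow_access_rights", "deny_access_rights",
    "logo_url",
})


def find_reserved_keys(custom_metadata: Mapping[str, Any]) -> list[str]:
    """Return the custom_metadata keys that collide with reserved names."""
    return sorted(key for key in custom_metadata if key in RESERVED_FIELD_NAMES)


# Illustrative input: two of the three keys are reserved.
metadata = {"visibility": "INTERNAL", "title": "Override", "uri": "x"}
conflicts = find_reserved_keys(metadata)
if conflicts:
    print(f"Reserved keys in custom_metadata: {', '.join(conflicts)}")
```

Running a check like this before submitting the configuration keeps the problem visible at edit time rather than after an ingestion run.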
Example Configuration
The following SharePoint connector configuration attaches visibility and displayed_owner to every ingested document:
{
  "name": "Product Documentation",
  "description": "SharePoint document library for product docs",
  "is_indexable": true,
  "connector": "sharepoint",
  "schedule": "0 0 * * *",
  "connector_configuration": {
    "sharepoint": {
      "is_document_owner": true,
      "content_source_name": "Product Docs",
      "access_credentials": {
        "client_id": "my_client_id",
        "client_secret": "my_client_secret",
        "tenant_id": "my_tenant_id"
      },
      "site_paths": [
        {
          "collection_hostname": "company.sharepoint.com",
          "site_relative_path": "sites/product-docs"
        }
      ],
      "custom_metadata": {
        "visibility": "INTERNAL",
        "displayed_owner": "Jane Doe",
        "information_classification": "CONFIDENTIAL"
      }
    }
  }
}
After ingestion, every document from this content source will have visibility, displayed_owner, and information_classification in its metadata. These fields can be used for filtering and access control at query time, provided they are configured in the index's document_fields_configuration.
Updating Custom Metadata
To update the metadata, modify the custom_metadata object on the content source and trigger a re-crawl (manually or via the next scheduled run). The updated values will be applied to all documents during the next ingestion.
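As a rough sketch of the "modify the custom_metadata object" step, the helper below edits a content source configuration held as a Python dict. How the updated configuration is submitted and how the re-crawl is triggered depend on your API client and are deliberately not shown; with_updated_custom_metadata is a hypothetical name, and the sample configuration is a trimmed version of the example above.

```python
import copy
from typing import Any


def with_updated_custom_metadata(
    content_source: dict[str, Any],
    connector: str,
    updates: dict[str, Any],
) -> dict[str, Any]:
    """Return a copy of the configuration with custom_metadata updated.

    Hypothetical helper: the input dict is left untouched so the old
    configuration can still be compared against the new one.
    """
    updated = copy.deepcopy(content_source)
    connector_config = updated["connector_configuration"][connector]
    # Create custom_metadata if the source did not define one yet.
    custom_metadata = connector_config.setdefault("custom_metadata", {})
    custom_metadata.update(updates)
    return updated


content_source = {
    "name": "Product Documentation",
    "connector": "sharepoint",
    "connector_configuration": {
        "sharepoint": {
            "custom_metadata": {
                "visibility": "INTERNAL",
                "displayed_owner": "Jane Doe",
            }
        }
    },
}

new_config = with_updated_custom_metadata(
    content_source, "sharepoint", {"displayed_owner": "John Smith"}
)
print(new_config["connector_configuration"]["sharepoint"]["custom_metadata"])
```

Keep in mind the precedence rule described earlier: after the re-crawl, documents that carry their own value for a key keep it, regardless of the updated source-level value.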
When to Use Custom Metadata vs. Enhancements
| Scenario | Recommended approach |
|---|---|
| Same metadata for all documents in a content source | custom_metadata on the connector configuration |
| Different metadata per document based on document content | Metadata Extractor Enhancement or Agent Processor Enhancement |
| Metadata from an external system, matched per document | Join Enhancement or Custom Enhancement |
| User-defined tags on individual documents | Tags Enhancement |
API Reference
See BaseConnectorConfiguration.custom_metadata in the API reference for the full field specification.