Create an OpenReview Connector

An OpenReview connector enables you to ingest academic papers from OpenReview conferences into the Zeta Alpha platform. This guide shows you how to create and configure an OpenReview connector for your data ingestion workflows.

Info: This guide presents an example configuration for an OpenReview connector. For a complete set of configuration options, see the OpenReview Connector Configuration Reference.

Prerequisites

Before you begin, ensure you have:

Access to the Zeta Alpha Platform UI
A tenant created
An index created

Note: OpenReview does not require authentication credentials for public conference data.

Step 1: Create the OpenReview Basic Configuration

To create an OpenReview connector, define a configuration file with the following basic fields:

is_document_owner: (boolean) Indicates whether this connector "owns" the crawled documents. When set to true, other connectors cannot crawl the same documents.
base_url: (string) URL of the OpenReview API (e.g., "https://api.openreview.net" or "https://api2.openreview.net").
conference: (object) Configuration for the conference to crawl:
- name: (string) Name of the conference
- venues: (array of strings) List of venue identifiers to crawl within the conference
- id_type: (string) Type of identifier used for venues, typically "venueid" or "invitation"
since: (datetime) Only crawl papers submitted or updated after this datetime (format: "YYYY-MM-DDTHH:MM:SSZ").
crawl_no_decision: (boolean, optional, default: true) Whether to crawl papers that have no acceptance decision yet.
limit: (integer, optional) Maximum number of papers to crawl.
skips: (integer, optional) Number of papers to skip at the beginning.
content_source_name: (string, optional) Custom name for the content source. Defaults to "OpenReview".
logo_url: (string, optional) The URL of a logo to display on document cards.

Example Configuration

{
    "name": "My OpenReview Connector",
    "description": "My OpenReview connector for NeurIPS 2024",
    "is_indexable": true,
    "connector": "openreview",
    "connector_configuration": {
        "is_document_owner": true,
        "base_url": "https://api2.openreview.net",
        "conference": {
            "name": "NeurIPS 2024",
            "venues": [
                "NeurIPS.cc/2024/Conference"
            ],
            "id_type": "venueid"
        },
        "since": "2024-01-01T00:00:00Z",
        "crawl_no_decision": true,
        "limit": 5000,
        "content_source_name": "OpenReview - NeurIPS 2024",
        "logo_url": "https://example.com/neurips-logo.png"
    }
}
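Before pasting a configuration into the UI, it can help to check it locally. The following is a minimal sketch, not part of the platform: check_openreview_config is a hypothetical helper, and it assumes that the fields listed above without an "optional" marker (is_document_owner, base_url, conference, since) are required. It also checks that since matches the documented "YYYY-MM-DDTHH:MM:SSZ" format.

```python
from datetime import datetime

# Assumption: fields not marked "optional" in the list above are required.
REQUIRED_FIELDS = ("is_document_owner", "base_url", "conference", "since")
CONFERENCE_FIELDS = ("name", "venues", "id_type")


def check_openreview_config(config: dict) -> list:
    """Return a list of problems found in a connector configuration dict.

    Hypothetical local helper; the platform may apply different validation.
    """
    problems = []
    settings = config.get("connector_configuration", {})

    for field in REQUIRED_FIELDS:
        if field not in settings:
            problems.append(f"missing field: {field}")

    conference = settings.get("conference", {})
    for field in CONFERENCE_FIELDS:
        if field not in conference:
            problems.append(f"missing field: conference.{field}")

    since = settings.get("since")
    if since is not None:
        try:
            # Documented format: YYYY-MM-DDTHH:MM:SSZ
            datetime.strptime(since, "%Y-%m-%dT%H:%M:%SZ")
        except (TypeError, ValueError):
            problems.append("since must look like YYYY-MM-DDTHH:MM:SSZ")

    return problems
```

An empty list means the sketch found nothing to flag. For the example configuration above, loaded with json.loads, the helper returns an empty list.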

Step 2: Add Field Mapping Configuration

When crawling OpenReview, the connector extracts document metadata and content as described in the OpenReview Connector Configuration Reference. You can map these OpenReview fields to your index fields using the field_mappings configuration.

Example Field Mappings

The following example shows field mappings for the default index fields:

{
    ...
    "connector_configuration": {
        ...
        "field_mappings": [
            {
                "content_source_field_name": "title",
                "index_field_name": "DCMI.title"
            },
            {
                "content_source_field_name": "abstract",
                "index_field_name": "DCMI.abstract"
            },
            {
                "content_source_field_name": "authors.full_name",
                "index_field_name": "DCMI.creator"
            },
            {
                "content_source_field_name": "created",
                "index_field_name": "DCMI.created"
            },
            {
                "content_source_field_name": "modified",
                "index_field_name": "DCMI.modified"
            },
            {
                "content_source_field_name": "subject",
                "index_field_name": "DCMI.subject"
            },
            {
                "content_source_field_name": "source",
                "index_field_name": "DCMI.source"
            },
            {
                "content_source_field_name": "uri",
                "index_field_name": "uri"
            },
            {
                "content_source_field_name": "document_content_type",
                "index_field_name": "document_content_type"
            },
            {
                "content_source_field_name": "document_content_path",
                "index_field_name": "document_content_path"
            }
        ],
        ...
    }
}
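Writing the field_mappings array by hand is repetitive, so you may prefer to generate it from a simple lookup table. The sketch below assumes only the two keys shown in the example above (content_source_field_name and index_field_name); build_field_mappings is a hypothetical helper, not a platform API.

```python
import json


def build_field_mappings(mapping: dict) -> list:
    """Turn {source_field: index_field} pairs into field_mappings entries."""
    return [
        {"content_source_field_name": source, "index_field_name": target}
        for source, target in mapping.items()
    ]


if __name__ == "__main__":
    # A subset of the mappings from the example above.
    mappings = build_field_mappings(
        {
            "title": "DCMI.title",
            "abstract": "DCMI.abstract",
            "authors.full_name": "DCMI.creator",
        }
    )
    print(json.dumps({"field_mappings": mappings}, indent=4))
```

The printed fragment can be pasted into connector_configuration in place of a hand-written field_mappings array.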

Step 3: Configure Access Rights

You can configure access rights to control who can view the ingested OpenReview papers:

allow_access_rights: (array of objects, optional) Users with any of these access rights can access the documents. If omitted, no user can retrieve the documents.
deny_access_rights: (array of objects, optional) Users with these access rights will not be able to access the documents.

Example Configuration with Access Rights

{
    ...
    "connector_configuration": {
        ...
        "allow_access_rights": [
            {
                "name": "public"
            }
        ],
        ...
    }
}
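To reason about how the two lists interact, the following sketch models one plausible reading of the descriptions above. It is an illustration only, not the platform's implementation. It assumes that rights are matched by their name field, that a deny match overrides an allow match, and that an omitted allow list grants access to no one. can_access is a hypothetical function.

```python
def can_access(user_rights, allow_access_rights=None, deny_access_rights=None):
    """Model of the documented allow/deny behavior (assumed semantics).

    user_rights: iterable of access right names held by the user.
    allow_access_rights / deny_access_rights: lists of {"name": ...} objects.
    """
    user = set(user_rights)
    allowed = {right["name"] for right in (allow_access_rights or [])}
    denied = {right["name"] for right in (deny_access_rights or [])}

    # Assumption: a deny match wins over any allow match.
    if user & denied:
        return False
    # With no allow list, the intersection is empty and access is refused.
    return bool(user & allowed)
```

Under these assumptions, a user holding "public" can read documents configured as in the example above, while the same user is refused if allow_access_rights is left out.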

Step 4: Create the OpenReview Content Source

To create your OpenReview connector in the Zeta Alpha Platform UI:

1. Navigate to your tenant and click View next to your target index.
2. Click View under Content Sources for the index.
3. Click Create Content Source.
4. Paste your JSON configuration.
5. Click Submit.

Crawling Behavior

The connector crawls papers from OpenReview conferences, extracting:

Paper title, abstract, and description
Author information
Subject areas and keywords
Submission and modification timestamps
Acceptance status
Identifiers and citations
PDF content for full-text indexing

The connector selects papers based on your since cutoff, venue selection, and limit settings. To crawl incrementally, set since to the time of your previous crawl so that only papers submitted or updated after it are picked up. Use crawl_no_decision to control whether papers without an acceptance decision are included.
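The interaction between since, crawl_no_decision, skips, and limit can be sketched as a simple filter. This is a toy model under stated assumptions, not the connector's actual code: papers are plain dicts with a timezone-aware modified datetime and an optional decision, and skips and limit are assumed to apply after filtering. select_papers and the dict keys are hypothetical.

```python
from datetime import datetime, timezone
from itertools import islice


def select_papers(papers, since, crawl_no_decision=True, limit=None, skips=0):
    """Return the papers a crawl would keep under the assumed semantics."""
    cutoff = datetime.strptime(since, "%Y-%m-%dT%H:%M:%SZ").replace(
        tzinfo=timezone.utc
    )

    def keep(paper):
        # Only papers submitted or updated after the cutoff are kept.
        if paper["modified"] <= cutoff:
            return False
        # Optionally drop papers that have no acceptance decision yet.
        if not crawl_no_decision and paper.get("decision") is None:
            return False
        return True

    kept = (paper for paper in papers if keep(paper))
    # Assumption: skips and limit are applied to the filtered stream.
    stop = None if limit is None else skips + limit
    return list(islice(kept, skips, stop))
```

With this model, raising since between runs shrinks the result to recently changed papers, and setting crawl_no_decision to false removes papers whose decision is still missing.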

Common Conference Examples

Here are some common OpenReview conference configurations:

ICLR 2024:

{
    "base_url": "https://api2.openreview.net",
    "conference": {
        "name": "ICLR 2024",
        "venues": ["ICLR.cc/2024/Conference"],
        "id_type": "venueid"
    }
}

NeurIPS 2024:

{
    "base_url": "https://api2.openreview.net",
    "conference": {
        "name": "NeurIPS 2024",
        "venues": ["NeurIPS.cc/2024/Conference"],
        "id_type": "venueid"
    }
}

ICML 2024:

{
    "base_url": "https://api2.openreview.net",
    "conference": {
        "name": "ICML 2024",
        "venues": ["ICML.cc/2024/Conference"],
        "id_type": "venueid"
    }
}
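The three examples above share one shape, so a small helper can generate the fragment for a given series and year. This is a sketch that assumes the "<series>.cc/<year>/Conference" venue pattern seen in these examples; other venues, such as workshops, may use different identifiers, so confirm the venue ID on OpenReview before relying on it. conference_fragment is a hypothetical helper.

```python
import json


def conference_fragment(series, year, base_url="https://api2.openreview.net"):
    """Build the base_url/conference fragment for a main-conference venue.

    Assumes the venue ID pattern used by the examples above.
    """
    return {
        "base_url": base_url,
        "conference": {
            "name": f"{series} {year}",
            "venues": [f"{series}.cc/{year}/Conference"],
            "id_type": "venueid",
        },
    }


if __name__ == "__main__":
    print(json.dumps(conference_fragment("ICLR", 2024), indent=4))
```

Calling conference_fragment("ICLR", 2024) reproduces the ICLR 2024 fragment shown above.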
