Web Services Overview

Data API

All static data exposed on rcsb.org is available through the Data API. The schema follows the mmCIF dictionary, extended with annotations from external resources. The core PDB data is split into core objects, one per level of the structural data hierarchy, with entity subdivided into polymeric and non-polymeric subschemas (a departure from the mmCIF dictionary). These are some of the core objects:

Both internal additions to the mmCIF dictionary and annotations from external resources are prefixed with rcsb_. In each core object, the rcsb_<core_object>_container_identifiers field holds the cardinal identifiers for the object and for any parent or child objects. Additionally, every core object contains a single string identifier in the rcsb_id field.

The data is available via two different interfaces:

REST API Full Reference

The REST API permits the retrieval of all data for one core object at a time.
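As a minimal sketch of "one core object at a time", the Python below builds the URL for a single core object and fetches its JSON. The base URL, the path layout (core object name followed by its identifiers), and the helper names core_object_url and fetch_core_object are assumptions made for illustration; the REST API Full Reference is the authority on the actual endpoints.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumed base URL and path layout; confirm against the REST API Full Reference.
REST_BASE = "https://data.rcsb.org/rest/v1/core"


def core_object_url(core_object, *identifiers):
    """Build the URL for one core object, e.g. ("entry", "4HHB")."""
    if not identifiers:
        raise ValueError("at least one identifier is required")
    # Percent-encode every path segment so unusual identifiers stay safe.
    parts = [quote(str(part), safe="") for part in (core_object, *identifiers)]
    return REST_BASE + "/" + "/".join(parts)


def fetch_core_object(core_object, *identifiers, timeout=30):
    """Retrieve and decode the JSON document for a single core object."""
    url = core_object_url(core_object, *identifiers)
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Under these assumptions, fetch_core_object("entry", "4HHB") would return the entry-level document as a dictionary, and its "rcsb_id" key would hold the single string identifier described above. Each additional level of the hierarchy (for example a polymer entity) needs its own request.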

GraphQL API (GraphiQL interface)

The GraphQL interface offers more flexible data retrieval, essentially making it possible to grab any piece of data from any level of the hierarchy in a single query.
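To illustrate reaching across levels of the hierarchy in one request, the sketch below asks for an entry together with the identifiers of its polymer entities. The endpoint URL, the entry(entry_id: ...) root field, and the nested field names are assumptions that should be checked in the GraphiQL interface; build_request and run_query are hypothetical helper names.

```python
import json
from urllib.request import Request, urlopen

# Assumed endpoint; the GraphiQL interface shows the live schema.
GRAPHQL_URL = "https://data.rcsb.org/graphql"

# Assumed field names: one query spans the entry and polymer entity levels.
ENTRY_QUERY = """
query ($id: String!) {
  entry(entry_id: $id) {
    rcsb_id
    polymer_entities {
      rcsb_id
      rcsb_polymer_entity_container_identifiers {
        entity_id
      }
    }
  }
}
"""


def build_request(entry_id):
    """Prepare a POST request carrying the query and its variables as JSON."""
    body = json.dumps({"query": ENTRY_QUERY, "variables": {"id": entry_id}})
    return Request(
        GRAPHQL_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(entry_id, timeout=30):
    """Send the query and return the decoded JSON response."""
    with urlopen(build_request(entry_id), timeout=timeout) as response:
        return json.load(response)
```

The same information through the REST interface would take one request for the entry plus one per polymer entity, which is the trade-off between the two interfaces.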

All output from both the REST and GraphQL interfaces is in JSON format.

Search API

The Search API programmatically exposes all search functionality available at rcsb.org. Queries with arbitrary Boolean logic can be run across all data available in the RCSB PDB Data API using a JSON-format query language. At the root level, text-based searches (on any text or numerical field in the RCSB PDB Data API) can also be combined with protein/nucleotide sequence searches (using MMseqs2) and structure similarity searches (using BioZernike, described in Guzenko et al., 2020). All output from the Search API is in JSON format.
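The Boolean composition described above can be sketched as a small tree of JSON nodes. The node layout used here ("terminal" and "group" types, a "service" name, a "logical_operator", a "return_type") and the attribute and operator names in the example are assumptions about the query language, not a statement of its specification; the helper functions are hypothetical. Consult the Search API Full Reference for the exact schema, including the parameters of the sequence and structure services.

```python
import json


def terminal(service, parameters):
    """A leaf node: one search handled by one service (assumed layout)."""
    return {"type": "terminal", "service": service, "parameters": parameters}


def text(attribute, operator, value):
    """A text-service leaf comparing one Data API field to a value."""
    return terminal(
        "text", {"attribute": attribute, "operator": operator, "value": value}
    )


def group(logical_operator, *nodes):
    """An inner node combining child nodes with AND or OR."""
    if logical_operator not in ("and", "or"):
        raise ValueError("logical_operator must be 'and' or 'or'")
    if not nodes:
        raise ValueError("a group needs at least one child node")
    return {
        "type": "group",
        "logical_operator": logical_operator,
        "nodes": list(nodes),
    }


def search_request(query, return_type="entry"):
    """Serialize the query tree into the JSON body of a search request."""
    return json.dumps({"query": query, "return_type": return_type})


# Example tree: X-ray structures with resolution better than 2.0
# (attribute and operator names are assumed for illustration).
example = group(
    "and",
    text("exptl.method", "exact_match", "X-RAY DIFFRACTION"),
    text("rcsb_entry_info.resolution_combined", "less", 2.0),
)
body = search_request(example)
```

Because groups accept other groups as children, arbitrarily nested Boolean expressions follow from the same two helpers. A sequence or structure search would be another terminal node with its own service name and parameters placed in the root-level group.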

Search API Full Reference

Legacy APIs

These legacy APIs will be discontinued in November 2020.

Contact RCSB PDB with questions or suggestions for specific services.