PDB Statistics: Growth in Number of Unique Protein Sequences in Released PDB Structures (Cumulative) at Identity 30%

This chart shows the annual and cumulative numbers of unique protein sequences in released PDB structures since the beginning of the PDB archive. It can be viewed at several levels of sequence identity. The cumulative bars show the growth in unique protein sequences (number of polymeric entities) over time. The yearly bars (dark blue) show how many new protein sequences were added in each year.

Note: The total number of sequence clusters in the statistics table links to the sequence cluster group search result page. For performance reasons, the statistics are calculated with a default precision threshold, so a count may differ slightly from the actual non-redundant group search result when the result count approaches or exceeds 10,000. The group search result page provides the accurate count; the statistics page shows the trend.


Sequence cluster level: 30% sequence identity

Year    Number of New Protein Sequences    Total Number of Protein Sequences
1976    11                                 11
1977    11                                 22
1978    3                                  25
1979    1                                  26
1980    2                                  28
1981    7                                  35
1982    17                                 52
1983    5                                  57
1984    9                                  66
1985    7                                  73
1986    7                                  80
1987    7                                  87
1988    16                                 103
1989    26                                 129
1990    28                                 157
1991    35                                 192
1992    47                                 239
1993    137                                376
1994    279                                655
1995    233                                888
1996    261                                1149
1997    387                                1536
1998    454                                1990
1999    614                                2604
2000    732                                3336
2001    739                                4075
2002    781                                4856
2003    1086                               5942
2004    1523                               7465
2005    1586                               9051
2006    1829                               10880
2007    1986                               12866
2008    1906                               14772
2009    1933                               16705
2010    1903                               18608
2011    1702                               20310
2012    1789                               22099
2013    1856                               23955
2014    2222                               26177
2015    2015                               28192
2016    2148                               30340
2017    2302                               32642
2018    2306                               34948
2019    2433                               37381
2020    2936                               40317
2021    2395                               42712
2022    1710                               44422
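
The cumulative column is a running sum of the yearly column. As a minimal sketch, the Python below recomputes the totals for the first five years of the table. The yearly counts are copied from the table above; the variable name `yearly` and the helper `cumulative_totals` are hypothetical and are not part of any RCSB PDB tool or API.

```python
from itertools import accumulate

# (year, number of new protein sequences), copied from the first rows of the table
yearly = [(1976, 11), (1977, 11), (1978, 3), (1979, 1), (1980, 2)]


def cumulative_totals(rows):
    """Return (year, new, running_total) for each (year, new) pair.

    Hypothetical helper for illustration only.
    """
    totals = accumulate(new for _, new in rows)
    return [(year, new, total) for (year, new), total in zip(rows, totals)]


for year, new, total in cumulative_totals(yearly):
    print(year, new, total)
```

Extending `yearly` with the remaining rows gives a quick way to check that each cumulative value equals the previous total plus that year's new sequences.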