PDB Statistics: Growth in Number of Unique Protein Sequences in Released PDB Structures (Cumulative) at Identity 90%

This chart shows the annual and cumulative numbers of unique protein sequences in released PDB structures since the beginning of the PDB archive. It can be viewed at several levels of sequence identity. The cumulative bars show the growth in unique protein sequences (number of polymeric entities) over time. The yearly bars (dark blue) show how many new protein sequences were added in each year.

Note: The total number of sequence clusters in the statistics table links to the sequence cluster group search result page. To balance performance, the numbers are calculated with a default precision threshold, so the statistics count may differ slightly from the actual non-redundant group search result when the result count approaches or exceeds 10,000. The group search result page provides the accurate count; the statistics page shows the trend.

Sequence cluster level: 90% identity

Year  Number of New Protein Sequences  Total Number of Protein Sequences
1976  13                               13
1977  13                               26
1978  3                                29
1979  5                                34
1980  3                                37
1981  8                                45
1982  18                               63
1983  10                               73
1984  11                               84
1985  10                               94
1986  9                                103
1987  10                               113
1988  23                               136
1989  45                               181
1990  44                               225
1991  49                               274
1992  65                               339
1993  217                              556
1994  436                              992
1995  346                              1338
1996  382                              1720
1997  551                              2271
1998  736                              3007
1999  914                              3921
2000  1024                             4945
2001  1075                             6020
2002  1129                             7149
2003  1601                             8750
2004  2199                             10949
2005  2494                             13443
2006  2808                             16251
2007  3171                             19422
2008  2911                             22333
2009  3028                             25361
2010  3047                             28408
2011  2857                             31265
2012  2993                             34258
2013  3238                             37496
2014  4018                             41514
2015  3370                             44884
2016  3757                             48641
2017  4074                             52715
2018  3894                             56609
2019  4313                             60922
2020  5146                             66068
2021  4596                             70664
2022  3341                             74005
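
The total for each year is the running sum of the yearly counts up to and including that year. The short Python sketch below illustrates this relationship using the first five rows of the table (1976 to 1980). The names `new_per_year` and `cumulative_totals` are hypothetical and chosen only for illustration; this is not how the statistics page itself computes its numbers.

```python
from itertools import accumulate

# Yearly counts of new protein sequences for 1976-1980,
# copied from the table (illustrative subset only).
new_per_year = {1976: 13, 1977: 13, 1978: 3, 1979: 5, 1980: 3}


def cumulative_totals(yearly):
    """Return {year: running total}, taking years in ascending order."""
    years = sorted(yearly)
    running = accumulate(yearly[year] for year in years)
    return dict(zip(years, running))


totals = cumulative_totals(new_per_year)
for year, total in totals.items():
    # Columns: year, new sequences that year, cumulative total
    print(year, new_per_year[year], total)
```

Run on the full table, the same running sum should reproduce the "Total Number of Protein Sequences" column, which makes it a quick consistency check on the data.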