--------------------------------------------------------------------------------
GenBank 183 is out
currently 157840 flu sequence records (+2.0% in 2 months)
ftp://ftp.ncbi.nih.gov/genbank/gbrel.txt
GBREL.TXT Genetic Sequence Data Bank
April 15 2011
NCBI-GenBank Flat File Release 183.0
Distribution Release Notes
135440924 loci, 126551501141 bases, from 135440924 reported sequences
- the VRL division is now composed of 17 files (+1)
2.2.1 File Descriptions
Files included in this release are:
...
1557. gbvrl1.seq - Viral sequence entries, part 1.
1558. gbvrl10.seq - Viral sequence entries, part 10.
1559. gbvrl11.seq - Viral sequence entries, part 11.
1560. gbvrl12.seq - Viral sequence entries, part 12.
1561. gbvrl13.seq - Viral sequence entries, part 13.
1562. gbvrl14.seq - Viral sequence entries, part 14.
1563. gbvrl15.seq - Viral sequence entries, part 15.
1564. gbvrl16.seq - Viral sequence entries, part 16.
1565. gbvrl17.seq - Viral sequence entries, part 17.
1566. gbvrl2.seq - Viral sequence entries, part 2.
1567. gbvrl3.seq - Viral sequence entries, part 3.
1568. gbvrl4.seq - Viral sequence entries, part 4.
1569. gbvrl5.seq - Viral sequence entries, part 5.
1570. gbvrl6.seq - Viral sequence entries, part 6.
1571. gbvrl7.seq - Viral sequence entries, part 7.
1572. gbvrl8.seq - Viral sequence entries, part 8.
1573. gbvrl9.seq - Viral sequence entries, part 9.
...
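The release notes list the VRL files in plain string order, which is why gbvrl10.seq follows gbvrl1.seq. A minimal sketch of putting them back into numeric order; the helper name natural_key is made up for this illustration and is not part of any NCBI tool:

```python
import re

def natural_key(name):
    """Split a file name into text and integer chunks so that 2 sorts before 10."""
    return [int(tok) if tok.isdigit() else tok for tok in re.split(r"(\d+)", name)]

# File names in the order they appear in the release notes listing.
listed = [f"gbvrl{i}.seq" for i in
          (1, 10, 11, 12, 13, 14, 15, 16, 17, 2, 3, 4, 5, 6, 7, 8, 9)]

# Plain sorted() reproduces the listing order; natural_key gives part 1..17.
numeric_order = sorted(listed, key=natural_key)
print(numeric_order[:3], numeric_order[-1])
```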
Uncompressed, the Release 183.0 flatfiles require roughly 489 GB (sequence
File Size File Name
...
249998875 gbvrl1.seq
249997181 gbvrl10.seq
16808055 gbvrl11.seq
249996404 gbvrl12.seq
249998509 gbvrl13.seq
249998936 gbvrl14.seq
249997870 gbvrl15.seq
249999053 gbvrl16.seq
238429865 gbvrl17.seq
249999624 gbvrl2.seq
249994915 gbvrl3.seq
249999658 gbvrl4.seq
164453727 gbvrl5.seq
249998209 gbvrl6.seq
249998635 gbvrl7.seq
249999773 gbvrl8.seq
249997917 gbvrl9.seq
...
Division Entries Bases
VRL1 69597 67988073
VRL10 55226 73759817
VRL11 4179 5063852
VRL12 62617 71400140
VRL13 58385 72462844
VRL14 62822 65485288
VRL15 59572 73053270
VRL16 57413 71980516
VRL17 61164 69846358
VRL2 73438 64152240
VRL3 69896 60951713
VRL4 67799 70887231
VRL5 42441 44393369
VRL6 48350 77608209
VRL7 58303 71387958
VRL8 61757 72807020
VRL9 67995 67886914
...
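To get a figure for the whole VRL division, the per-file rows above can be added up. A small sketch, assuming the two numeric columns are entries and bases (the names VRL_ROWS, total_entries and total_bases are only for this example):

```python
# Rows copied from the per-file VRL statistics above: (file, entries, bases).
VRL_ROWS = [
    ("VRL1", 69597, 67988073), ("VRL10", 55226, 73759817),
    ("VRL11", 4179, 5063852), ("VRL12", 62617, 71400140),
    ("VRL13", 58385, 72462844), ("VRL14", 62822, 65485288),
    ("VRL15", 59572, 73053270), ("VRL16", 57413, 71980516),
    ("VRL17", 61164, 69846358), ("VRL2", 73438, 64152240),
    ("VRL3", 69896, 60951713), ("VRL4", 67799, 70887231),
    ("VRL5", 42441, 44393369), ("VRL6", 48350, 77608209),
    ("VRL7", 58303, 71387958), ("VRL8", 61757, 72807020),
    ("VRL9", 67995, 67886914),
]

# Sum each column over all 17 files of the division.
total_entries = sum(entries for _, entries, _ in VRL_ROWS)
total_bases = sum(bases for _, _, bases in VRL_ROWS)

print(f"VRL total: {total_entries} entries, {total_bases} bases")
print(f"mean length: {total_bases / total_entries:.0f} bases per entry")
```

Note that this covers all viral sequences in the division, not only influenza, so it is not directly comparable to the flu record count at the top.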
Most sequenced organisms in Release 183.0:
Entries Bases Species
16863696 15675014152 Homo sapiens
8679108 9023900439 Mus musculus
2181397 6498706602 Rattus norvegicus
2198402 5380180551 Bos taurus
3925343 5053645051 Zea mays
3221262 4803375305 Sus scrofa
1704939 3127417957 Danio rerio
228260 1352941097 Strongylocentrotus purpuratus
1341465 1249691996 Oryza sativa Japonica Group
1770024 1194632155 Nicotiana tabacum
1424247 1147209201 Xenopus (Silurana) tropicalis
1218742 1054660637 Drosophila melanogaster
2316656 1013211794 Arabidopsis thaliana
214025 1002427288 Pan troglodytes
1453428 943663921 Canis lupus familiaris
661291 914659269 Vitis vinifera
810916 894552342 Gallus gallus
1889663 893141046 Glycine max
82470 826741112 Macaca mulatta
1217094 748805204 Ciona intestinalis
2.2.8 Growth of GenBank
From 1982 to the present, the number of bases in GenBank has doubled
approximately every 18 months.
Release Date Base Pairs Entries
3 Dec 1982 680338 606
36 Sep 1985 5204420 5700
66 Dec 1990 51306092 41057
92 Dec 1995 425860958 620765
121 Dec 2000 11101066288 10106023
151 Dec 2005 56037734462 52016762
181 Dec 2010 122082812719 129902276
182 Feb 2011 124277818310 132015054
183 Apr 2011 126551501141 135440924
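The doubling claim can be checked roughly against the table above. A sketch, assuming steady exponential growth between any two rows and using only the month and year given for each release (GROWTH and doubling_time_months are names invented for this example):

```python
import math

# (release, year, month, base pairs) copied from the growth table above.
GROWTH = [
    (3, 1982, 12, 680338),
    (36, 1985, 9, 5204420),
    (66, 1990, 12, 51306092),
    (92, 1995, 12, 425860958),
    (121, 2000, 12, 11101066288),
    (151, 2005, 12, 56037734462),
    (181, 2010, 12, 122082812719),
    (182, 2011, 2, 124277818310),
    (183, 2011, 4, 126551501141),
]

def doubling_time_months(start, end):
    """Months per doubling of base pairs between two table rows."""
    months = (end[1] - start[1]) * 12 + (end[2] - start[2])
    doublings = math.log2(end[3] / start[3])
    return months / doublings

overall = doubling_time_months(GROWTH[0], GROWTH[-1])
recent = doubling_time_months(GROWTH[5], GROWTH[-1])
print(f"1982-2011: one doubling per {overall:.1f} months")
print(f"2005-2011: one doubling per {recent:.1f} months")
```

Comparing the whole period with the last few years shows whether the long-run average still describes recent releases.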
1. PRI - primate sequences
2. ROD - rodent sequences
3. MAM - other mammalian sequences
4. VRT - other vertebrate sequences
5. INV - invertebrate sequences
6. PLN - plant, fungal, and algal sequences
7. BCT - bacterial sequences
8. VRL - viral sequences
9. PHG - bacteriophage sequences
10. SYN - synthetic sequences
11. UNA - unannotated sequences
12. EST - EST sequences (expressed sequence tags)
13. PAT - patent sequences
14. STS - STS sequences (sequence tagged sites)
15. GSS - GSS sequences (genome survey sequences)
16. HTG - HTGS sequences (high throughput genomic sequences)
17. HTC - HTC sequences (high throughput cDNA sequences)
18. ENV - Environmental sampling sequences
19. CON - Constructed sequences