Thursday, March 24, 2011

How Big Is the Human Genome?

The earliest direct estimates of the size of the human genome clustered around 3,000 Mb (megabase pairs) or 3.0 × 10^9 bp (base pairs). The textbooks settled on about 3,200 Mb, based mostly on reassociation kinetics. According to those results from the 1970s, roughly 10% of the genome consists of highly repetitive DNA, 25-30% is moderately repetitive, and the rest is unique sequence DNA.

A study by Morton (1991) looked at all of the estimates of genome size that had been published up to that time and concluded that the average size of the haploid genome in females is 3,227 Mb. This includes a complete set of autosomes and one X chromosome. The sum of autosomes plus a Y chromosome comes to 3,122 Mb. The average is about 3,200 Mb, which corresponds to 3.5 pg (picograms), and that's the value on Ryan Gregory's Animal Genome Size Database.
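The averaging step is easy to check by hand, but here is a minimal sketch in Python using only the two figures quoted above from Morton (1991). The function and variable names are my own, chosen for illustration.

```python
# Haploid genome sizes in Mb, as quoted above from Morton (1991).
FEMALE_HAPLOID_MB = 3227  # autosomes + X chromosome
MALE_HAPLOID_MB = 3122    # autosomes + Y chromosome


def mean_haploid_size_mb(x_set_mb, y_set_mb):
    """Return the simple mean of the X-bearing and Y-bearing haploid sizes."""
    return (x_set_mb + y_set_mb) / 2


mean_mb = mean_haploid_size_mb(FEMALE_HAPLOID_MB, MALE_HAPLOID_MB)

# The two haploid sets differ only by the X versus Y chromosome.
x_minus_y_mb = FEMALE_HAPLOID_MB - MALE_HAPLOID_MB

print(f"Mean haploid size: {mean_mb:.1f} Mb")
print(f"Rounded to the nearest 100 Mb: {round(mean_mb, -2):.0f} Mb")
print(f"X-bearing minus Y-bearing set: {x_minus_y_mb} Mb")
```

The unrounded mean is a little under 3,200 Mb, so "about 3,200 Mb" is a rounding of the two published values rather than an independent measurement.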

In the past decade or so the common assumption about the size of the human genome has dropped to about 3,000 Mb. This is because the draft sequence of the human genome came in at 2,800 Mb and the so-called "finished" sequence was still considerably less than 3,200 Mb. Most people didn't realize that there were significant gaps in the draft sequence and in the "finished" sequence.

The latest figure for the size of the human genome from the human genome consortium is 3,156,105,057 bp (3,156 Mb) (Build 37 version 2, patch 2=GRCh37.p3 (November 2010)). I believe this build still has gaps around the centromeres of the chromosomes. Those regions consist of highly repetitive sequences that are almost impossible to clone and sequence. These regions, also known as heterochromatin, were not targets of the original sequencing project. Their total size was estimated at 198 Mb (International Human Genome Sequencing Consortium, 2004), corresponding to about 6% of the genome.
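As a rough check on the "about 6%" figure, the sketch below divides the 198 Mb heterochromatin estimate by two candidate totals: the Build 37 size quoted above and the 3,200 Mb textbook value. It assumes nothing beyond the numbers in this post, and the helper name is hypothetical.

```python
# Sizes in base pairs, taken from the figures quoted in this post.
HETEROCHROMATIN_BP = 198_000_000   # estimate from the 2004 consortium paper
BUILD_37_TOTAL_BP = 3_156_105_057  # GRCh37.p3 total length
TEXTBOOK_TOTAL_BP = 3_200_000_000  # the ~3,200 Mb textbook value


def percent_of_total(part_bp, total_bp):
    """Return part_bp as a percentage of total_bp."""
    return 100 * part_bp / total_bp


pct_of_build = percent_of_total(HETEROCHROMATIN_BP, BUILD_37_TOTAL_BP)
pct_of_textbook = percent_of_total(HETEROCHROMATIN_BP, TEXTBOOK_TOTAL_BP)

print(f"Share of Build 37 total: {pct_of_build:.2f}%")
print(f"Share of a 3,200 Mb genome: {pct_of_textbook:.2f}%")
```

Either denominator gives a value that rounds to 6%, so the percentage does not depend much on which total is used.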

That estimate may have been too large to begin with and, in addition, I'm pretty sure that some of these heterochromatic regions are included in the total size of Build 37 v2. That means that the total size of the human genome is very likely to be ~3,200 Mb or 3.2 × 10^9 bp.


[Image Credit: Wikipedia: Creative Commons Attribution 2.0 Generic license]

Morton, N.E. (1991) Parameters of the Human Genome. Proc. Natl. Acad. Sci. (USA) 88:7474-7476 [free article on PubMed Central]

International Human Genome Sequencing Consortium (2004) Finishing the euchromatic sequence of the human genome. Nature 431:931-945 [doi:10.1038/nature03001]
