|
In mid-February two competing groups, one a publicly funded international
consortium and the other from the genomics company Celera, announced
the publication of the first draft of the nearly completed sequence
and the resulting map of all human genes.
Most astonishing was that both groups found only about 30,000 genes.
Previous estimates over the past decade had ranged from 70,000 to 130,000.
But those 30,000 genes tend to be more complex, and more multi-functional,
than many people anticipated.
This is the equivalent of expecting to find 100,000 different single-use
tools, but then finding 30,000 Swiss Army Knives with several gadgets
on each.
According to a statement released by the American Association for the
Advancement of Science (AAAS), the human genome sequence published in
the journal Science is estimated to have an average sequencing accuracy
of 99.96 percent. The researchers color-coded the map as a way to propose
functions for two-thirds of all identified genes.
The statement claimed that the map accurately covers 95 percent of
the genome, and that the total number of human genes is somewhere between
26,383 and 39,114. If the final tally lies somewhere in between, say,
around 30,000, then people have only about 13,000 more genes than the
fruit fly.
Francis Collins, leader of the public consortium, and Craig Venter,
CEO of Celera, had jointly announced at a White House ceremony seven
months ago the approaching completion of the sequencing. Since then
both groups have been working to make sense of the sequence, counting
the genes, finding their locations on the 23 chromosomes, and compiling
a map.
The Human Genome Project (HGP) began in the late 1980s, and researchers
have completed the human genome map ahead of schedule and under budget.
At least forty other model organisms have been sequenced as part of
the HGP, including bacteria, yeast, a plant, the fruit fly, and a nematode
worm. Model organisms are chosen to speed research in genomics because
they are relatively simple and small, grow fast, and are cheap to keep.
Major genomics work at the University of Wisconsin-Madison includes
the sequencing of the harmless K12 strain of the E. coli bacterium in
1997 and of the dangerous food-borne strain called O157:H7 earlier this
year. UW-Madison researchers also provide an international facility
dedicated to systematically figuring out the functions of all the genes
in the model plant called Arabidopsis.
Insights from the Human Genome Project
Speaking in San Francisco to the annual meeting of the American Association
for the Advancement of Science, Francis Collins counted down a list of
ten surprising findings.
Ten. The genome is "lumpy". Some stretches of DNA have more
genes than others.
Nine. The human gene count is much lower than expected. Not
100,000 genes, but rather 30,000 or so. In comparison, yeast has about
6,000; the fruit fly, 15,000; the nematode worm C. elegans, 20,000;
the mustard plant Arabidopsis, 25,000.
Eight. Human genes can make more proteins than genes from microbes.
On average, each human gene can make about 3 proteins.
Seven. Humans make around 95,000 proteins and the proteins can
be decorated and are more elaborately constructed than proteins tend
to be in microbes. Furthermore, the proteins are more intricately regulated
as to when they are made or not made. Collins compared the microbial
protein pool to a knife and the human protein pool to a Cuisinart.
Six. More than 200 human genes are the result of transfer from
bacteria, and no similar genes among these 200 are found in fruit flies
or yeast. It seems that bacterial genes have breached the traditional
boundaries and are now part of our stuff. "This sort of puts a new face
on recombinant DNA, doesn't it?" he added.
Five. The human genome provides a 'fossil record' that looks
back 800 million years, especially the genes called transposable elements
that have changed over time, and may represent the echo of genes that
we shared with evolutionary ancestors.
Four. A major component of the repetitive DNA, called "junk
DNA" by some, is somehow an advantage especially in areas where genes
are tightly packed.
Three. The male mutation rate is about twice the female mutation
rate. Males account for the majority of the disease-causing mutations.
Males also account for the majority of evolutionary progress.
Two. Different human individuals are 99.9% identical at the
DNA level, and most of our genetic differences are shared among all
ethnicities and races. There is no scientific basis for precise racial
categories. Race and ethnicity are social or cultural constructs, and
their definitions are something that biological science cannot support.
One. The genome tells us even more about human biology, health
and disease than we expected. Collins noted the mouse sequence is now
80% complete. Researchers anticipate new insights from comparing the
mouse sequence to the human.
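
The counts in this list lend themselves to a quick back-of-the-envelope check. The sketch below uses only figures quoted in this article (the gene counts, the 95,000 proteins, the 99.9% identity, and the roughly 3 billion letter genome). The variable and function names are illustrative, and because the inputs are rounded the results are rough estimates, not published values.

```python
# Figures as quoted in the article; all are rounded estimates.
GENE_COUNTS = {
    "yeast": 6_000,
    "fruit fly": 15_000,
    "nematode worm": 20_000,
    "Arabidopsis": 25_000,
    "human": 30_000,
}
HUMAN_PROTEINS = 95_000
GENOME_LETTERS = 3_000_000_000


def human_gene_ratio(organism):
    """How many times more genes humans have than `organism`."""
    return GENE_COUNTS["human"] / GENE_COUNTS[organism]


# Average number of proteins made per human gene.
proteins_per_gene = HUMAN_PROTEINS / GENE_COUNTS["human"]

# 99.9% identical means about 1 letter in 1,000 differs between two people.
differing_letters = GENOME_LETTERS // 1000

for organism in GENE_COUNTS:
    if organism != "human":
        print(f"human / {organism}: {human_gene_ratio(organism):.1f}x")
print(f"proteins per human gene: {proteins_per_gene:.1f}")
print(f"letters differing between two people: {differing_letters:,}")
```

On these rounded numbers, humans have only about twice as many genes as the fruit fly, and the protein total works out to a little over 3 per gene, in line with the "about 3" that Collins cites.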
"Speed Matters": Venter Takes a Different Tack
In his speech to AAAS, Venter told a broader story of developing
new techniques and strategies for faster sequencing. While the public
group has been a model of international cooperation and planning, Celera's
group developed a faster strategy (called shotgun cloning) based on
inventing new DNA sequencing robots, writing innovative computer software
for analyzing the pieces and putting them together into a picture, and
building faster supercomputers for crunching all the data.
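
To give a feel for what "putting the pieces together" means, here is a toy sketch of the idea behind shotgun assembly: read many short overlapping fragments, then repeatedly join the two that overlap the most. It is only an illustration, assuming error-free fragments from a single strand with no repeats. It is not Celera's software, and the function names and the example sequence are made up for this sketch.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Overlaps shorter than `min_len` are ignored and reported as 0.
    """
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def greedy_assemble(fragments, min_len=3):
    """Greedily merge the pair of fragments with the largest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates
    while len(frags) > 1:
        best_size, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # nothing overlaps any more; leave the pieces separate
        merged = frags[best_i] + frags[best_j][best_size:]
        frags = [f for k, f in enumerate(frags) if k not in (best_i, best_j)]
        frags.append(merged)
    return frags


# Three overlapping reads taken from a made-up 15-letter "genome".
reads = ["ACGTTAGC", "ATGGCGTA", "GCGTACGTT"]
print(greedy_assemble(reads))
```

Real genomes are far harder: reads contain errors, come from both DNA strands, and include long repeated stretches, which is why the assembly software and the supercomputers mattered as much as the sequencing robots.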
The human sequence is about 3 billion letters, or 'base pairs', of DNA
long. Figuring out the sequence at a rate of one letter per second would
take 3 billion seconds, or roughly 95 years. The genome project's timetable
of October 1993 called for finishing the sequence by 2005.
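
The one-letter-per-second figure is easy to check. The short sketch below converts seconds to years, assuming a year of 365.25 days; the function name is illustrative.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # about 31.6 million seconds


def years_to_sequence(base_pairs, letters_per_second=1.0):
    """Years needed to read `base_pairs` letters at a constant rate."""
    return base_pairs / letters_per_second / SECONDS_PER_YEAR


GENOME_LETTERS = 3_000_000_000
# 3 billion seconds is about 95 years with a 365.25-day year.
print(f"{years_to_sequence(GENOME_LETTERS):.1f} years")
```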
Things have sped up considerably. Celera's capacity is now at least
2 billion letters of DNA per month. In other words, another genome the
size of the human genome could be sequenced every six weeks. Venter claims
a bacterial genome the size of E. coli's can be sequenced in a morning's
work. While these figures are only for the sequencing portion of the overall
mapping work, they show how quickly the capacity for mapping genomes is growing.
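
The same kind of arithmetic shows where the six-week and morning's-work figures come from. In the sketch below, the 2 billion letters per month and the 3 billion letter human genome come from this article. The E. coli genome size of about 4.6 million letters is an assumption added here, not a figure from the article, and the function name is illustrative.

```python
DAYS_PER_MONTH = 365.25 / 12  # average month length, about 30.4 days


def days_to_sequence(base_pairs, letters_per_month=2_000_000_000):
    """Days needed to sequence `base_pairs` letters at a monthly capacity."""
    return base_pairs / letters_per_month * DAYS_PER_MONTH


HUMAN_LETTERS = 3_000_000_000
ECOLI_LETTERS = 4_600_000  # assumed size of the E. coli K12 genome

human_weeks = days_to_sequence(HUMAN_LETTERS) / 7
ecoli_hours = days_to_sequence(ECOLI_LETTERS) * 24

print(f"human-sized genome: {human_weeks:.1f} weeks")
print(f"E. coli-sized genome: {ecoli_hours:.1f} hours")
```

At this capacity a human-sized genome takes between six and seven weeks, and an E. coli-sized one takes under two hours of sequencing.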
It's kind of like the round trip from St. Louis, Missouri, to
Astoria, Oregon. It took Lewis and Clark almost 2 1/2 years. Now it
can take as few as 10 hours, airport to airport and back.
One also gets different types of information depending on one's vantage
point. You learn different things from the canoe seat compared to the
cockpit, or compared to photos from a satellite.
As genomic information becomes faster and cheaper to get, it becomes more
likely that genomics will be used in personal medical and health decisions
in the coming decade. If the TV series "ER" runs as long as "Saturday
Night Live" has lasted (now in its 26th season), I won't be too surprised
in 20 years to see "ER's" docs ordering a genomic work-up for each patient
as routinely as a blood chemistry work-up. But the knottiest issues will
likely lie outside the ER: public health, individual privacy, and the
ethics of reproduction.
www.nhgri.nih.gov/educationkit
www.aaas.org/news/human.html
www.celera.com/
www.nature.com/genomics/
|