DNA as heredity’s molecule is one of biology’s greatest discoveries. It seemed to make humanity’s blueprint – our “genome” – discoverable too. Merely identify all the genes in human DNA and the job would be done. The genes were identified in 2001, but instead of man’s blueprint, science got two shocking surprises. Now, one of them has revealed genome complexity so vast that the 21st century may end without our comprehending it.
This surprise was that only 1% of human DNA seemed to contain “genes.” “Gene” means DNA that codes for a protein. So when the other 99% didn’t code for proteins, it was called “junk” – useless.
“You’re kidding?!” thought many scientists. Three billion steps in DNA’s helix and 99% of them are useless baggage passed from parent to child, generation after generation?! If the so-called junk wasn’t genes, they proposed, maybe it was something else. Maybe it contained “elements” – a vague catch-all – of DNA that did something.
This turned out to be true. Worldwide research since 2003 has now found biological functions in over 80% of the former junk. What functions? Regulation of the protein-coding genes. A human’s 20,000 genes seem to be like tools in a toolbox, which the ex-junk adjusts with colossal versatility, using 470,000 “elements” now identified (and others still undiscovered). (ENCODE, the work’s cumbersome acronym, stands for ENCyclopedia Of DNA Elements.)
This helps explain the second surprise from 2001’s human genome map: that our 20,000 genes are fewer than those possessed by a tomato (30,000) or an apple (50,000). How, everyone wondered, could all of human complexity – our minds, our intricate bodies – be manufactured with fewer tools than nature uses on things we eat? The answer, it now seems, is that colossal versatility.
What sorts of adjustments – gene regulation – do these “elements” in ex-junk perform? Turning genes on or off is one. In 70,000 locations, the ex-junk “promotes” expression of nearby genes; this means on/off control and perhaps (so much remains unknown) adjusting how long the “on” signal lasts.
And how, one might wonder, would an “element” in DNA turn a gene on? One answer: by being a perch for proteins that, like eagles ripping prey, wrench the helix into sharp angles and rip it open for other molecules to read the genetic code and deliver it to protein factories in the cell. (“Transcription factor” is the catch-all name for these molecular raptors that start “transcribing” genetic code.)
Other “elements” in ex-junk code for various RNA molecules which then latch elsewhere on the DNA to affect protein production. Still others affect genes by altering the proteins that DNA wraps around (like yarn on spools).
In 400,000 other locations, “elements” of ex-junk affect expression of genes that lie far away on the DNA helix. How can they act far away? Because “far away” has a peculiar meaning here. DNA in each cell would be about 6 feet long if laid in a straight line, which is how diagrams of genes are sometimes drawn. “Far” means far on that straight line. But DNA in cells isn’t laid out flat. The 6 feet are a dense tangle inside the microscopic cell nucleus. In the tangle, DNA “elements” may be kissing, despite lying far from each other on a straight-line diagram. Maybe, but no one knows. 3-D mapping of the tangle is just getting underway. With 400,000 possible kisses to map, it’s a big job.
What phenomenon in nature is involved?
99% of human DNA doesn’t contain genes that code for proteins.
What did this discovery show?
The 99% isn’t inert junk. Instead, at least 80% of it contains lengths of DNA (“elements” is still the catch-all label) that regulate the gene-containing 1%.
What was known before?
That the 99% was passed on from generation to generation, but not whether it did anything useful.
What remains unknown?
A century’s worth of work, according to some researchers. An obvious question: is the other 20% really still junk? More profoundly, how do these “elements” manage the protein genes? And even if that were known for, say, liver cells, the answer’s probably different for cells of brain, kidney, lung…or any other cells. All cells have the same genome, but these “elements” operate differently in each type. The complexity makes obvious how a century could be occupied in exploring it.
The ENCODE research on DNA elements, conducted over nine years, was reported in 30 scientific papers. 489 Nature 45 ff. (6 September 2012) contains 6 of them, along with extensive analysis and description.