
DNA: Difference between revisions

From Wikipedia, the free encyclopedia
revert Lir's change to introductory paragraph.
168... (talk | contribs)
No edit summary
Line 3: Line 3:
<div style="float:right; text-align: center">[[image:dna-split.png|DNA replication]]<br>''DNA replication''</div>


'''Deoxyribonucleic acid''' ('''DNA''') is the primary [[biochemistry|chemical]] component of [[chromosome]]s and is the material of which [[gene]]s are made. It is sometimes called the "molecule of heredity," because parents transmit copied portions of their own DNA to offspring during [[reproduction]], and because they propagate their traits by doing so.
<!-- Please DO NOT edit the first two paragraphs, below. These paragraphs caused significant debate. Instead, suggest any putative changes on the talk page. -->


In [[bacterium|bacteria]] and other [[prokaryote|simple]] [[biological cell|cell]] organisms, DNA is distributed more or less throughout the cell. In the [[eukaryote|complex]] cells that make up [[plant]]s, [[animal]]s and in other multi-celled [[organism]]s, most of the DNA resides in the [[cell nucleus]]. The energy generating [[organelle]]s known as [[chloroplast]]s and [[mitochondria]] also carry DNA, as do many [[virus]]es.
'''Deoxyribonucleic acid''' ('''DNA''') is a [[nucleic acid]] which carries [[genetics|genetic]] [[instruction]]s for the [[developmental biology|biological development]] of all cellular forms of [[life]] and many [[virus]]es. DNA is sometimes referred to as the [[molecule]] of [[heredity]] as it is [[biological inheritance|inherited]] and used to propagate [[trait]]s. During [[reproduction]], it is [[DNA replication|replicated]] and transmitted to offspring.

In [[bacterium|bacteria]] and other [[prokaryote|simple]] [[biological cell|cell]] organisms, DNA is distributed more or less throughout the cell. In the [[eukaryote|complex]] cells that make up [[plant]]s, [[animal]]s and in other multi-celled [[organism]]s, most of the DNA is found in the [[chromosome]]s, which are located in the [[cell nucleus]]. The energy generating [[organelle]]s known as [[chloroplast]]s and [[mitochondria]] also carry DNA, as do many [[virus]]es.

<!-- Please DO NOT edit the first two paragraphs, above. These paragraphs caused significant debate. Instead, suggest any putative changes on the talk page. -->


== Overview of molecular structure ==

Revision as of 00:35, 14 February 2004

For alternative meanings see DNA (disambiguation)
DNA replication

Deoxyribonucleic acid (DNA) is the primary chemical component of chromosomes and is the material of which genes are made. It is sometimes called the "molecule of heredity," because parents transmit copied portions of their own DNA to offspring during reproduction, and because they propagate their traits by doing so.

In bacteria and other simple cell organisms, DNA is distributed more or less throughout the cell. In the complex cells that make up plants, animals and other multi-celled organisms, most of the DNA resides in the cell nucleus. The energy-generating organelles known as chloroplasts and mitochondria also carry DNA, as do many viruses.

Overview of molecular structure

Although sometimes called "the molecule of heredity," pieces of DNA as people typically think of them are not single molecules. Rather, they are pairs of molecules, which entwine like vines to form a double helix (top half of the illustration at the right).

Each vine-like molecule is a strand of DNA: a chemically linked chain of nucleotides, each of which consists of a sugar, a phosphate and one of four kinds of aromatic "bases." Because DNA strands are composed of these nucleotide subunits, they are polymers.

The diversity of the bases means that there are four kinds of nucleotides, which are commonly referred to by the identity of their bases. These are adenine (A), thymine (T), cytosine (C), and guanine (G).

In a DNA double helix, two polynucleotide strands come together through complementary pairing of the bases, which occurs by hydrogen bonding. Each base forms hydrogen bonds readily to only one other -- A to T and C to G -- so that the identity of the base on one strand dictates what base must face it on the opposing strand. Thus the entire nucleotide sequence of each strand is complementary to that of the other, and when separated, each may act as a template with which to replicate the other (middle and lower half of the illustration at the right).
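The pairing rule lends itself to a short illustration. The following is a minimal Python sketch, not a model of any real enzyme: the names `PAIRING` and `complement` are invented for this example, and strand orientation is ignored so that position i of one strand simply faces position i of the other.

```python
# Watson-Crick pairing as described above: A pairs with T, C pairs with G.
PAIRING = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return, position by position, the base that must face each base of `strand`."""
    return "".join(PAIRING[base] for base in strand.upper())


template = "ATGCCGTA"
partner = complement(template)

print(template)  # ATGCCGTA
print(partner)   # TACGGCAT

# Each strand can serve as a template for rebuilding the other.
assert complement(partner) == template
```

Because the mapping is its own inverse, applying it twice returns the original sequence, which is the property that lets either separated strand act as a template for the other.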

Because pairing causes the nucleotide bases to face the helical axis, the sugar and phosphate groups of the nucleotides run along the outside, and the two chains they form are sometimes called the "backbones" of the helix. In fact, it is chemical bonds between the phosphates and the sugars that link one nucleotide to the next in the DNA strand.

Mechanical properties relevant to biology

Because hydrogen bonds are weak compared to covalent bonds, the strands of the double helix can be easily separated by enzymes or even, as in PCR, by gentle heating. On the other hand, gentle heating works only on pieces of DNA that are less than about 10,000 base pairs (10 kilobase pairs, or 10 kbp) long. The intertwining of the DNA strands makes long segments difficult to separate. Enzymes known as helicases unwind the strands to facilitate the advance of sequence-reading enzymes such as DNA polymerase. Helicases do this by breaking the hydrogen bonds between the bases, not by cutting the backbone. The twisting strain that builds up ahead of them is relieved by other enzymes, the topoisomerases, some of which transiently cleave the phosphate backbone of one of the strands so that it can swivel around the other.

When the ends of a piece of double-helical DNA are joined so that it forms a circle, as in plasmid DNA, the strands are topologically linked (informally, knotted). This means they cannot be separated by gentle heating or by any process that does not involve breaking a strand. The task of unknotting topologically linked strands of DNA falls to enzymes known as topoisomerases. Some of these enzymes unknot circular DNA by cleaving both strands of one double-stranded segment so that another double-stranded segment can pass through. Unknotting is required for the replication of circular DNA as well as for various types of recombination in linear DNA.

Space-filling model of a section of DNA molecule

The DNA helix can assume one of three slightly different geometries, of which the "B" form described by James Watson and Francis Crick is believed to predominate in cells. It is 2 nanometers wide and extends 3.4 nanometers per 10 bp of sequence. This is also the approximate length of sequence in which the helix makes one complete turn about its axis. This frequency of twist (known as the helical pitch) depends largely on stacking forces that each base exerts on its neighbors in the chain.

The narrow breadth of the double helix makes it impossible to detect by conventional electron microscopy, except by heavy staining. At the same time, the DNA found in many cells can be macroscopic in length -- approximately 5 centimeters long for strands in a human chromosome. Consequently, cells must compact or "package" DNA to carry it within them. This is one of the functions of the chromosomes, which contain spool-like proteins known as histones, around which DNA winds.
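The dimensions quoted above support a quick back-of-the-envelope calculation. The sketch below is illustrative only: it assumes the B-form figures given in the text (3.4 nanometers and roughly one turn per 10 bp), and the function names are invented for this example.

```python
NM_PER_BP = 3.4 / 10   # rise per base pair, from "3.4 nanometers per 10 bp"
BP_PER_TURN = 10       # approximate base pairs per complete helical turn
NM_PER_CM = 1e7        # 1 centimeter = 10,000,000 nanometers


def contour_length_nm(base_pairs: float) -> float:
    """End-to-end length of a fully extended B-form helix, in nanometers."""
    return base_pairs * NM_PER_BP


def helical_turns(base_pairs: float) -> float:
    """Approximate number of complete turns the helix makes."""
    return base_pairs / BP_PER_TURN


def base_pairs_for_length_cm(length_cm: float) -> float:
    """Number of base pairs needed to span a given extended length."""
    return length_cm * NM_PER_CM / NM_PER_BP


bp = base_pairs_for_length_cm(5)  # the 5 cm strand mentioned in the text
print(f"{bp / 1e6:.0f} million bp, about {helical_turns(bp) / 1e6:.1f} million turns")
```

Under these assumptions a 5-centimeter strand corresponds to roughly 147 million base pairs, while remaining only 2 nanometers wide, which gives a sense of why packaging is necessary.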

The B form of the DNA helix twists 360° per 10.6 bp in the absence of strain. But many molecular biological processes can induce strain. A DNA segment with excess or insufficient helical twisting is referred to, respectively, as positively or negatively "supercoiled". DNA in vivo is typically negatively supercoiled, which facilitates the unwinding of the double-helix required for RNA transcription.

The two other known double-helical forms of DNA, called A and Z, differ modestly in their geometry and dimensions. The A form appears likely to occur only in dehydrated samples of DNA, such as those used in crystallography experiments, and possibly in hybrid pairings of DNA and RNA strands. Segments of DNA that cells have methylated for regulatory purposes may adopt the Z geometry, in which the strands turn about the helical axis like a mirror image of the B form.

The role of the sequence

Within a gene, the sequence of nucleotides along a DNA strand defines a protein, which an organism is liable to manufacture or "express" at one or several points in its life using the information of the sequence. The relationship between the nucleotide sequence and the amino-acid sequence of the protein is determined by simple cellular rules of translation, known collectively as the genetic code. Reading along the "protein-coding" sequence of a gene, each successive sequence of three nucleotides (called a codon) specifies or "encodes" one amino acid.
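Reading a sequence codon by codon can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: it uses the standard genetic code, reads the DNA sense strand directly (cells actually translate an RNA copy), starts at the first nucleotide given, and stops at the first stop codon. The names `CODON_TABLE` and `translate` are invented for this example.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order.
# One-letter amino-acid codes; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_sequence: str) -> str:
    """Translate a protein-coding DNA sequence, three nucleotides at a time."""
    protein = []
    for i in range(0, len(coding_sequence) - 2, 3):
        amino_acid = CODON_TABLE[coding_sequence[i:i + 3].upper()]
        if amino_acid == "*":  # a stop codon ends the protein
            break
        protein.append(amino_acid)
    return "".join(protein)


# Codons: ATG (Met), GCC (Ala), AAG (Lys), TAA (stop)
print(translate("ATGGCCAAGTAA"))  # MAK
```

The table has 64 entries (four bases in each of three positions) but encodes only 20 amino acids plus stop, so most amino acids are specified by more than one codon.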

In many species of organism, only a small fraction of the total sequence of the genome appears to encode protein. The function of the rest is a matter of speculation. It is known that certain nucleotide sequences specify affinity for DNA binding proteins, which play a wide variety of vital roles, in particular through control of replication and transcription. These sequences are frequently called regulatory sequences, and researchers assume that so far they have identified only a tiny fraction of the total that exist. "Junk DNA" represents sequences that do not yet appear to contain genes or to have a function.

Sequence also determines a DNA segment's susceptibility to cleavage by restriction enzymes, the quintessential tools of genetic engineering. The positions of cleavage sites throughout an individual's genome determine one kind of "DNA fingerprint" for that individual.
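How cleavage sites give rise to a fingerprint can be sketched as follows. The example assumes the enzyme EcoRI, which recognizes GAATTC and cuts after the G; the two short "individual" sequences are made up for illustration, the DNA is treated as linear, and the function names are hypothetical.

```python
def cut_positions(sequence: str, site: str = "GAATTC", cut_offset: int = 1) -> list:
    """Return the positions at which the enzyme would cut `sequence`."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    return positions


def fragment_lengths(sequence: str, site: str = "GAATTC", cut_offset: int = 1) -> list:
    """Return the lengths of the pieces left after complete digestion."""
    edges = [0] + cut_positions(sequence, site, cut_offset) + [len(sequence)]
    return [end - begin for begin, end in zip(edges, edges[1:])]


# Two invented sequences that differ by one base inside the second site.
individual_a = "AA" + "GAATTC" + "GGG" + "GAATTC" + "TT"
individual_b = "AA" + "GAATTC" + "GGG" + "GAGTTC" + "TT"

print(fragment_lengths(individual_a))  # [3, 9, 7]
print(fragment_lengths(individual_b))  # [3, 16]
```

A single-base difference removes one site and changes the set of fragment lengths, which is the kind of difference between individuals that such a fingerprint records.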

DNA sequence reading

The asymmetric shape and linkage of nucleotides mean that a DNA strand always has a discernible orientation or directionality. Because of this directionality, close inspection of a double helix reveals that, although the nucleotides along one strand are heading one way (e.g. the "ascending strand"), those along the other strand are heading the opposite way (e.g. the "descending strand"). This arrangement of the strands is called antiparallel.

For reasons of chemical nomenclature, people who work with DNA refer to the asymmetric termini of each strand as the 5' and 3' ends (pronounced "five prime" and "three prime"). DNA workers and enzymes alike always read nucleotide sequences in the "5' to 3' direction". In a vertically oriented double helix, the 3' strand is said to be ascending while the 5' strand is said to be descending.

As a result of their antiparallel arrangement and the sequence-reading preferences of enzymes, even if both strands carried identical instead of complementary sequences, cells could properly translate only one of them. The other strand a cell can only read backwards. Molecular biologists call a sequence "sense" if it is translated or translatable, and they call its complement "antisense". It follows then, somewhat paradoxically, that the template for transcription is the antisense strand. The resulting transcript is an RNA replica of the sense strand and is itself sense.
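The relationship between sense strand, antisense template and transcript can be checked with a small Python sketch. It assumes that every sequence is written 5' to 3' and that RNA uses uracil (U) where DNA uses thymine (T), a detail not covered above; the function names are invented for this example.

```python
DNA_PAIRING = {"A": "T", "T": "A", "C": "G", "G": "C"}
RNA_PAIRING = {"A": "U", "T": "A", "C": "G", "G": "C"}  # RNA base built opposite each DNA base


def reverse_complement(strand: str) -> str:
    """Return the opposing DNA strand, written 5' to 3'."""
    return "".join(DNA_PAIRING[base] for base in reversed(strand))


def transcribe(template: str) -> str:
    """Return the RNA built on a DNA template, written 5' to 3'."""
    return "".join(RNA_PAIRING[base] for base in reversed(template))


sense = "ATGGCCAAG"
antisense = reverse_complement(sense)  # the strand used as the template
transcript = transcribe(antisense)

print(antisense)   # CTTGGCCAT
print(transcript)  # AUGGCCAAG

# The transcript replicates the sense strand, with U in place of T.
assert transcript == sense.replace("T", "U")
```

The reversal in both functions reflects the antiparallel arrangement: the 5' end of one strand faces the 3' end of the other, so a partner strand read 5' to 3' runs in the opposite direction.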

Some viruses blur the distinction between sense and antisense, because certain sequences of their genomes do double duty, encoding one protein when read 5' to 3' along one strand, and a second protein when read in the opposite direction along the other strand. As a result, the genomes of these viruses are unusually compact for the number of genes they contain, which biologists view as an adaptation.

Topologists like to note that the juxtaposition of the 3' end of one DNA strand beside the 5' end of the other at both termini of a double-helical segment makes the arrangement a "crab canon".

More on DNA replication

...DNA replication...origin of replication...chromosome...plasmid...DNA polymerase...mutation...[a paragraph including these ideas would be useful and go well here]

Single-stranded DNA and repair of mutations

In some viruses DNA appears in a non-helical, single-stranded form. Because many of the DNA repair mechanisms of cells work only on paired bases, viruses that carry single-stranded DNA genomes mutate more frequently than they would otherwise. As a result, such species may adapt more rapidly to avoid extinction. The result would not be so favorable in more complicated and more slowly replicating organisms, however, which may explain why only viruses carry single-stranded DNA. These viruses presumably also benefit from the lower cost of replicating one strand versus two.

The discovery of DNA and the double helix

Working in the 19th century, biochemists initially isolated DNA and RNA together from cell nuclei. They were relatively quick to appreciate the polymeric nature of their "nucleic acid" isolates, but realized only later that nucleotides were of two types--one containing ribose and the other deoxyribose. It was this subsequent discovery that led to the identification and naming of DNA as a substance distinct from RNA. Not until 1944 did Oswald Theodore Avery and his coworkers publish the first compelling evidence that DNA could carry genetic information.

How it could do so was unimaginable at the time. Because chemical dissection of DNA samples always yielded the same four nucleotides, the chemical composition of DNA appeared simple, perhaps even uniform. Organisms, on the other hand, are fantastically complex individually and widely diverse collectively. Geneticists did not speak of genes as conveyors of "information" in such words, but if they had, they would not have hesitated to quantify the amount of information that genes need to convey as vast. The idea that information might reside in a chemical in the same way that it exists in text--as a finite alphabet of letters arranged in a sequence of unlimited length--had not yet been conceived. It would emerge upon the discovery of DNA's structure, but few researchers imagined that DNA's structure had much to say about genetics.

In the 1950s, only a few groups made it their goal to determine the structure of DNA. These included an American group led by Linus Pauling, and two in Britain. At Cambridge University, Crick and Watson were building physical models using metal rods and balls, in which they incorporated the known chemical structures of the nucleotides, as well as the known position of the linkages joining one nucleotide to the next along the polymer. At King's College, London, Maurice Wilkins and Rosalind Franklin were examining X-ray diffraction patterns of DNA fibers.

A key inspiration in the work of all of these teams was the discovery in 1948 by Pauling that many proteins include helical shapes (see alpha helix). Pauling had deduced this structure from X-ray patterns. Even in the initial crude diffraction data from DNA, it was evident that the structure involved helices. But this insight was only a beginning. There remained the questions of how many strands came together, whether this number was the same for every helix, whether the bases pointed toward the helical axis or away, and ultimately what the explicit angles and coordinates of all the bonds and atoms were. Such questions motivated the modeling efforts of Watson and Crick.

In their modeling, Watson and Crick restricted themselves to what they saw as chemically and biologically reasonable. Still, the breadth of possibilities was very wide. A breakthrough occurred in 1952, when Erwin Chargaff visited Cambridge and inspired Crick with a description of experiments Chargaff had published in 1947. Chargaff had observed that the proportions of the four nucleotides vary between one DNA sample and the next, but that for particular pairs of nucleotides -- adenine and thymine, guanine and cytosine -- the two nucleotides are always present in equal proportions.

Watson and Crick had begun to contemplate double helical arrangements, and they saw that by reversing the directionality of one strand with respect to the other, they could provide an explanation for Chargaff's puzzling finding. This explanation was the complementary pairing of the bases, which also had the effect of ensuring that the distance between the phosphate chains did not vary along a sequence. Watson and Crick were able to discern that this distance was constant and to measure its exact value of 2 nanometers from an X-ray pattern obtained by Franklin. The same pattern also gave them the 3.4 nanometer-per-10 bp "pitch" of the helix. The pair quickly converged upon a model, which they announced before Franklin herself published any of her work.
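Why complementary pairing accounts for Chargaff's proportions can be seen by counting bases in a small example. The sketch below is illustrative only: the sequence is invented, and `duplex_composition` is a hypothetical helper that pairs a strand with its complement and tallies the bases of both.

```python
from collections import Counter

PAIRING = {"A": "T", "T": "A", "C": "G", "G": "C"}


def duplex_composition(strand: str) -> Counter:
    """Count the bases in a double helix built from `strand` and its partner."""
    partner = "".join(PAIRING[base] for base in strand)
    return Counter(strand + partner)


strand = "AAAGGCTAAC"          # deliberately lopsided: five A but only one T
single = Counter(strand)
duplex = duplex_composition(strand)

print(single["A"], single["T"])  # 5 1
print(duplex["A"], duplex["T"])  # 6 6
print(duplex["G"], duplex["C"])  # 4 4
```

A single strand may hold any mix of bases, but every A on one strand adds a T on the other and every G adds a C, so the double helix as a whole always contains equal amounts of each member of a pair, while the ratio of A and T to G and C remains free to vary between samples.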

The great assistance Watson and Crick derived from Franklin's data has become a subject of controversy, and it has angered people who believe Franklin has not received the credit due to her. The most controversial aspect is that Franklin's critical X-ray pattern was shown to Watson and Crick without Franklin's knowledge or permission. Wilkins showed it to them at his lab while Franklin was away.

Watson and Crick's model attracted great interest immediately upon its presentation. Arriving at their conclusion on February 21, 1953, Watson and Crick made their first announcement on February 28. Their paper 'A Structure for Deoxyribose Nucleic Acid' was published on April 25. In an influential presentation in 1957, Crick laid out the "Central Dogma", which foretold the relationship between DNA, RNA, and proteins, and articulated the "sequence hypothesis." A critical confirmation of the replication mechanism that was implied by the double-helical structure followed in 1958 in the form of the Meselson-Stahl experiment. Work by Crick and coworkers deciphered the genetic code not long afterward. These findings represent the birth of molecular biology.

Watson, Crick, and Wilkins were awarded a Nobel Prize in 1962, by which time Franklin had died.