What Is DNA? Structure, Function, and Why It Matters

Updated May 2026
DNA (deoxyribonucleic acid) is the molecule that stores genetic instructions in the cells of every living organism. It consists of two strands twisted into a double helix, with the sequence of chemical bases along those strands encoding the information needed to build proteins, regulate cellular processes, and pass traits from one generation to the next. In eukaryotic cells, most DNA is held in the nucleus; in prokaryotes, which lack a nucleus, it sits in a region of the cytoplasm called the nucleoid.

The Chemical Structure of DNA

DNA is a polymer made of repeating subunits called nucleotides. Each nucleotide has three parts: a five-carbon sugar called deoxyribose, a phosphate group, and a nitrogenous base. The sugar and phosphate groups form the structural backbone of each DNA strand, connected by covalent phosphodiester bonds. The nitrogenous bases extend inward from the backbone and pair with bases on the opposite strand through hydrogen bonding.

There are four nitrogenous bases in DNA: adenine (A), thymine (T), guanine (G), and cytosine (C). These bases fall into two chemical categories. Adenine and guanine are purines, which have a two-ring structure. Thymine and cytosine are pyrimidines, with a single-ring structure. Base pairing always occurs between a purine and a pyrimidine: adenine pairs with thymine through two hydrogen bonds, and guanine pairs with cytosine through three hydrogen bonds.
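The pairing rules above can be written down as a small lookup table. The following is a minimal Python sketch for illustration only; the names PAIR, H_BONDS, and hydrogen_bonds are made up for this example and do not come from any library.

```python
# Watson-Crick pairing rules as described in the text.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
PURINES = {"A", "G"}       # two-ring bases
PYRIMIDINES = {"T", "C"}   # single-ring bases

# Hydrogen bonds formed by each base with its partner:
# A-T pairs form two, G-C pairs form three.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def hydrogen_bonds(strand: str) -> int:
    """Count the hydrogen bonds holding a strand to its complement."""
    return sum(H_BONDS[base] for base in strand.upper())


# Every valid pair joins one purine with one pyrimidine.
for base, partner in PAIR.items():
    assert (base in PURINES) != (partner in PURINES)

print(hydrogen_bonds("ATGC"))  # 10
print(hydrogen_bonds("GGCC"))  # 12
```

Two stretches of the same length can therefore be held together by different numbers of hydrogen bonds: the more G-C pairs, the higher the count.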

The two strands of a DNA molecule run in opposite directions, described as antiparallel. One strand runs in the 5-prime to 3-prime direction while the complementary strand runs 3-prime to 5-prime. This directionality is determined by the orientation of the sugar molecules in the backbone and has important consequences for how DNA is replicated and read by cellular machinery.
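Because the strands are antiparallel, the partner of a sequence written 5-prime to 3-prime is not just its base-by-base complement: it must also be reversed to be read in its own 5-prime to 3-prime direction. A short Python sketch of this idea follows; the function name reverse_complement is an illustrative choice.

```python
def reverse_complement(strand_5_to_3: str) -> str:
    """Return the opposite strand, also written 5-prime to 3-prime."""
    pair = {"A": "T", "T": "A", "G": "C", "C": "G"}
    # Walk the input backwards so the result reads in its own 5'->3' direction.
    return "".join(pair[base] for base in reversed(strand_5_to_3.upper()))


top = "ATGCCG"
bottom = reverse_complement(top)
print(bottom)                      # CGGCAT
print(reverse_complement(bottom))  # ATGCCG
```

Applying the operation twice returns the original sequence, which reflects the fact that each strand fully determines the other.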

The double helix structure gives DNA remarkable stability. The hydrogen bonds between base pairs hold the two strands together, while stacking interactions between adjacent bases provide additional stability. The hydrophobic bases are tucked inside the helix away from water, with the hydrophilic sugar-phosphate backbones facing outward. This arrangement protects the genetic information stored in the base sequence from chemical damage.

How DNA Stores Genetic Information

The genetic information in DNA is encoded in the sequence of bases along the strand. Just as letters of an alphabet can be arranged to spell different words, the four DNA bases can be arranged in countless combinations to encode different instructions. The human genome contains approximately 3.2 billion base pairs, providing an enormous capacity for information storage.
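To put a rough number on that capacity, each position can hold one of four bases, which corresponds to two bits of information. The Python sketch below works through the arithmetic. It is an idealized upper bound that treats every base as independent and ignores repeats and other redundancy, so it should be read as an estimate rather than a measurement.

```python
import math

BASES = 4                   # A, T, G, C
GENOME_BP = 3_200_000_000   # approximate human genome size from the text

bits_per_base = math.log2(BASES)           # 2.0 bits per position
total_bits = GENOME_BP * bits_per_base
total_megabytes = total_bits / 8 / 1_000_000


def possible_sequences(length: int) -> int:
    """Number of distinct base sequences of a given length."""
    return BASES ** length


print(bits_per_base)           # 2.0
print(total_megabytes)         # 800.0
print(possible_sequences(10))  # 1048576
```

Even a stretch of only ten bases has more than a million possible spellings, which is why the number of combinations grows so quickly with length.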

Genes are specific segments of DNA that contain instructions for building proteins or functional RNA molecules. Human genes vary widely in length, from a few hundred base pairs to over two million. The coding sequence within a gene is read in groups of three bases called codons, with each codon specifying a particular amino acid or a stop signal. The sequence of codons determines the sequence of amino acids in the resulting protein.
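Reading a coding sequence codon by codon can be sketched in a few lines of Python. The example below builds the standard genetic code from its usual compact form and translates a short, made-up sequence. As a simplification, it reads the DNA coding strand directly; in the cell the sequence is first transcribed into RNA, a step this sketch skips.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering.
# Each letter is a one-letter amino acid code; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_sequence: str) -> str:
    """Translate a DNA coding sequence into one-letter amino acid codes."""
    sequence = coding_sequence.upper()
    protein = []
    # Step through the sequence three bases at a time.
    for start in range(0, len(sequence) - 2, 3):
        amino_acid = CODON_TABLE[sequence[start:start + 3]]
        if amino_acid == "*":  # stop codon ends the protein
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical toy sequence: ATG GCC AAG TAA
print(translate("ATGGCCAAGTAA"))  # MAK
```

The four codons here encode methionine (M), alanine (A), lysine (K), and a stop signal, so the output is a three-residue peptide.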

Not all DNA encodes proteins. In humans, only about 1.5 percent of the genome consists of protein-coding sequences. The rest includes regulatory elements that control when and where genes are expressed, structural sequences that help organize chromosomes, repetitive sequences of various types, and regions whose functions are still being studied. This non-coding DNA was once dismissively called junk DNA, but research continues to reveal functional roles for parts of it, and how much of it is functional remains an open question.

DNA also stores information through chemical modifications that do not change the base sequence itself. Methylation, the addition of a methyl group to cytosine bases, is the most common such modification in mammals. Methylation patterns can silence genes without altering their sequence, providing an additional layer of information storage called epigenetic marking. These marks can be inherited through cell division and sometimes across generations.

DNA in Cells: Packaging and Organization

In eukaryotic cells, DNA is contained within the nucleus and organized into structures called chromosomes. Most human cells contain 46 chromosomes (23 pairs), with each chromosome consisting of a single continuous DNA molecule wrapped around proteins called histones. The DNA-histone complex is called chromatin, and its level of compaction varies depending on whether genes in that region are active or silent.

The packaging of DNA into chromosomes solves a fundamental physical problem. The total DNA in a single human cell measures about two meters when stretched out, yet it must fit inside a nucleus roughly six micrometers in diameter. Histones act as spools around which DNA winds, compacting it by a factor of about seven. Further levels of folding and looping compact the chromatin by additional orders of magnitude, ultimately producing the dense chromosome structures visible during cell division.
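The two-meter figure can be checked with simple arithmetic. The Python sketch below assumes a spacing of about 0.34 nanometers between adjacent base pairs, the standard value for the common B form of the double helix (this number is not stated in the text), and uses the roughly 6.4 billion base pairs of a diploid human cell.

```python
RISE_PER_BP_M = 0.34e-9        # assumed spacing between base pairs, in meters
DIPLOID_BP = 6_400_000_000     # base pairs in a diploid human cell
NUCLEUS_DIAMETER_M = 6e-6      # nucleus diameter from the text
NUCLEOSOME_COMPACTION = 7      # compaction from winding around histones

total_length_m = DIPLOID_BP * RISE_PER_BP_M
length_to_diameter = total_length_m / NUCLEUS_DIAMETER_M
length_after_histones_m = total_length_m / NUCLEOSOME_COMPACTION

print(f"Stretched length: {total_length_m:.2f} m")                 # 2.18 m
print(f"Length / nucleus diameter: {length_to_diameter:,.0f}")
print(f"After histone winding: {length_after_histones_m:.2f} m")   # 0.31 m
```

Under these assumptions, winding around histones alone still leaves about 0.3 meters of fiber, which is why the further levels of folding are needed.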

Mitochondria and chloroplasts also contain their own DNA molecules, separate from the nuclear genome. Mitochondrial DNA in humans is a circular molecule of about 16,500 base pairs encoding 37 genes essential for energy production. Unlike nuclear DNA, mitochondrial DNA is inherited almost exclusively from the mother, making it a useful tool for tracing maternal lineages in genetics research and forensic science.

DNA Replication: Copying the Genetic Instructions

Before a cell divides, it must copy its entire DNA so that each daughter cell receives a complete genome. This process, called DNA replication, begins when the enzyme helicase unwinds the double helix, separating the two strands. Each strand then serves as a template for synthesizing a new complementary strand, resulting in two identical DNA molecules from one original.
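The template idea can be illustrated with a toy model in Python. This sketch represents a double helix as a pair of strings, each written 5-prime to 3-prime, and leaves out everything about the real machinery (helicase, primers, and so on). The function names synthesize and replicate are illustrative only.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesize(template: str) -> str:
    """Build the strand complementary to a template (both written 5'->3')."""
    return "".join(PAIR[base] for base in reversed(template))


def replicate(duplex: tuple[str, str]) -> tuple[tuple[str, str], tuple[str, str]]:
    """Separate the two strands and use each as a template for a new partner."""
    top, bottom = duplex
    daughter_1 = (top, synthesize(top))        # old top strand, new bottom strand
    daughter_2 = (synthesize(bottom), bottom)  # new top strand, old bottom strand
    return daughter_1, daughter_2


parent = ("ATGCCG", synthesize("ATGCCG"))
daughter_1, daughter_2 = replicate(parent)
print(daughter_1 == parent and daughter_2 == parent)  # True
```

Each daughter molecule keeps one original strand and gains one newly built strand, yet both carry the same sequence as the parent.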

The enzyme DNA polymerase carries out the actual synthesis of new DNA strands. It reads the template strand and adds complementary nucleotides one at a time to the growing new strand. DNA polymerase also has proofreading ability, detecting and correcting most errors immediately after they occur. Additional repair enzymes scan newly replicated DNA for any remaining mistakes, achieving an overall error rate of roughly one mistake per billion nucleotides copied.
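To see what this error rate means in practice, the following Python sketch multiplies it by the roughly 6.4 billion base pairs in a diploid human cell. The last step treats errors as independent, rare events (a Poisson model), which is a simplifying assumption made for this example and not a claim from the text.

```python
import math

ERROR_RATE = 1e-9              # about one mistake per billion nucleotides
DIPLOID_BP = 6_400_000_000     # base pairs copied per cell division

# Expected number of uncorrected errors each time the genome is copied.
expected_errors = DIPLOID_BP * ERROR_RATE

# Chance of a completely error-free copy, assuming independent errors.
p_error_free = math.exp(-expected_errors)

print(f"Expected errors per division: {expected_errors:.1f}")  # 6.4
print(f"Chance of a perfect copy: {p_error_free:.4f}")         # 0.0017
```

Even with such a low error rate, a handful of new changes per division is the expected outcome under these assumptions, simply because the genome is so large.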

Replication in human cells proceeds simultaneously from thousands of starting points (origins of replication) along each chromosome. This parallel approach allows the entire 6.4 billion base pairs of the diploid human genome to be copied in about eight hours. Without multiple origins, replication from a single starting point would require weeks to complete.
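The benefit of multiple origins can be estimated with back-of-the-envelope arithmetic. The Python sketch below assumes a replication fork moves about 50 base pairs per second and that the longest human chromosome has about 249 million base pairs. Both values are round-number assumptions for illustration and are not taken from the text. Each origin is assumed to launch two forks moving in opposite directions.

```python
import math

FORK_SPEED_BP_PER_S = 50       # assumed fork speed (illustrative)
CHROMOSOME_BP = 249_000_000    # assumed length of the longest chromosome
DIPLOID_BP = 6_400_000_000     # diploid genome size from the text
S_PHASE_SECONDS = 8 * 3600     # about eight hours, from the text


def single_origin_days(length_bp: int, fork_speed: float = FORK_SPEED_BP_PER_S) -> float:
    """Days needed to copy a molecule from one origin with two forks."""
    seconds = length_bp / (2 * fork_speed)
    return seconds / 86_400


def min_origins(total_bp: int, duration_s: float, fork_speed: float = FORK_SPEED_BP_PER_S) -> int:
    """Fewest origins needed if all start at once and run the whole time."""
    bp_per_origin = 2 * fork_speed * duration_s
    return math.ceil(total_bp / bp_per_origin)


print(f"{single_origin_days(CHROMOSOME_BP):.1f} days")  # 28.8 days
print(min_origins(DIPLOID_BP, S_PHASE_SECONDS))         # 2223
```

Under these assumptions, a single origin would need about four weeks for the longest chromosome alone, while a few thousand origins firing together is the bare minimum for an eight-hour copy. Real cells use more, since origins do not all fire at the same moment.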

Why DNA Matters: From Medicine to Evolution

Understanding DNA has transformed medicine. Genetic testing can identify disease-causing mutations before symptoms appear, enabling early intervention. Pharmacogenomics uses DNA information to predict how patients will respond to medications, allowing personalized treatment plans. Gene therapy introduces functional DNA into cells to treat diseases caused by faulty genes, with several approved treatments now available for conditions like spinal muscular atrophy and inherited blindness.

DNA analysis has revolutionized forensic science. Because every person (except identical twins) has a unique DNA sequence, genetic profiles can identify individuals with near-certainty from biological samples found at crime scenes. DNA evidence has both convicted criminals and exonerated wrongly accused individuals, fundamentally changing the justice system.

In evolutionary biology, DNA provides a molecular record of evolutionary relationships. By comparing DNA sequences between species, scientists can reconstruct the tree of life and estimate when different lineages diverged. Ancient DNA extracted from fossils allows direct comparison with modern genomes, revealing details about extinct species and past human populations that would be impossible to determine from physical remains alone.
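The simplest form of sequence comparison is counting how many positions match between two aligned sequences. The Python sketch below uses short, made-up sequences for three hypothetical species; real analyses work with far longer sequences, alignment algorithms, and statistical models of mutation.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of matching positions between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100 * matches / len(seq_a)


# Hypothetical toy sequences, invented for this example.
species_a = "ATGCCGTA"
species_b = "ATGCTGTA"   # one difference from species_a
species_c = "ATACTGGA"   # three differences from species_a

print(percent_identity(species_a, species_b))  # 87.5
print(percent_identity(species_a, species_c))  # 62.5
```

In this toy case, species A and B share more of their sequence than A and C, which would suggest that A and B diverged more recently.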

Key Takeaway

DNA is the universal information storage molecule of life, using a simple four-letter chemical alphabet to encode all the instructions needed to build and maintain an organism. Its double helix structure enables both stable information storage and accurate copying during cell division.