1.1: DNA, the Instruction Manual
Your body is made of roughly thirty-seven trillion cells, and nearly every one that has a nucleus contains a complete copy of your genome, the full set of genetic instructions that make you who you are. (Mature red blood cells, which discard their nucleus, are the main exception.) That genome is written in a molecule called DNA. Stretched out, the DNA from a single human cell would measure about two meters. The entire genome contains approximately three billion individual letters of genetic code. Printed out, it would fill roughly two hundred phone books. All of it is packed into the cell's nucleus, a compartment too small to see without a microscope.
The language of DNA uses just four chemical letters: adenine (A), thymine (T), guanine (G), and cytosine (C). These letters pair in a strict way: A always with T, G always with C, forming the rungs of the famous double helix. This structure was described in 1953 by James Watson and Francis Crick, building critically on the X-ray crystallography work of Rosalind Franklin, whose contribution was not fully recognized during her lifetime. Franklin's "Photo 51" provided the key evidence for the helical structure, and her meticulous experimental work was essential to the discovery that launched modern molecular biology.
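The pairing rule is simple enough to write down as a lookup table. The sketch below is illustrative only: the function names are made up for this example, and it assumes the input contains nothing but the four letters A, T, G, and C.

```python
# Pairing rules from the text: A pairs with T, G pairs with C.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(strand: str) -> str:
    """Return the letter that pairs with each base, position by position."""
    return "".join(PAIR[base] for base in strand.upper())

def reverse_complement(strand: str) -> str:
    """Return the partner strand written in its own reading direction.

    The two strands of the double helix run in opposite directions,
    so the partner strand is conventionally written reversed.
    """
    return complement(strand)[::-1]

print(complement("ATGC"))          # TACG
print(reverse_complement("ATGC"))  # GCAT
```

Because each letter has exactly one partner, either strand is enough to rebuild the other, which is what lets a cell copy its DNA.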
Within those three billion letters, specific stretches serve as instructions for building the molecular machines that keep you alive. These stretches are called genes, and humans have between twenty thousand and twenty-five thousand of them. Genes are organized on structures called chromosomes. Humans have twenty-three pairs, forty-six total. Genes make up only about one and a half percent of your total DNA. The rest was once dismissed as "junk DNA," but we now know much of it plays regulatory roles, controlling when, where, and how much of each gene is activated. It is the operating system that decides which programs run and when.
If you have a computing background, the analogy is direct: your DNA is a hard drive storing all the information. Your genes are individual programs. But a program on a hard drive does nothing until you run it. A gene does nothing until the cell reads it and translates it into a functional product. The rule describing how that program gets run is called the Central Dogma of molecular biology, and it is the key to understanding everything that follows, from how cancer works to how vaccines fight it.
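To put a rough number on the hard-drive analogy, here is a back-of-the-envelope calculation. It uses the approximate figures quoted above and assumes the simplest possible encoding, in which each of the four letters is stored in two bits. Real genome file formats store more than this, so treat the result as a lower bound, not a file size you would see on disk.

```python
import math

GENOME_LETTERS = 3_000_000_000  # approximate figure from the text
ALPHABET_SIZE = 4               # A, T, G, C
GENE_FRACTION = 0.015           # about 1.5% of DNA is genes (from the text)

# Four possible letters fit in log2(4) = 2 bits each.
bits_per_letter = math.log2(ALPHABET_SIZE)
total_bytes = GENOME_LETTERS * bits_per_letter / 8
gene_letters = GENOME_LETTERS * GENE_FRACTION

print(f"bits per letter: {bits_per_letter:.0f}")                   # 2
print(f"raw size: {total_bytes / 1e6:.0f} MB")                     # 750 MB
print(f"letters inside genes: {gene_letters / 1e6:.0f} million")   # 45 million
```

Under these assumptions the whole instruction manual fits in about 750 megabytes, and the genes themselves account for roughly 45 million letters of it.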
DNA: The master blueprint, stored in the nucleus. DNA holds all the instructions but never leaves the vault.
mRNA: RNA polymerase copies the gene into mRNA. This temporary copy carries the instructions out of the nucleus to the ribosomes.
Protein: Ribosomes read the mRNA three letters at a time (codons), assembling amino acids into a protein chain. Shape determines function.
1.2: The Central Dogma — DNA to RNA to Protein
In 1958, Francis Crick articulated what he called the Central Dogma of molecular biology: information flows from DNA to mRNA to protein. Three molecules, two steps. Once you understand it, the logic of mRNA vaccines becomes almost obvious.
DNA is the master blueprint, locked in a vault (the cell's nucleus) for safekeeping. You never take the master blueprint to the construction site. Instead, you make a copy and send that to the workers. The copy is RNA. The finished product the workers build is protein.
The first step is called transcription. A molecular machine called RNA polymerase reads a gene on the DNA and produces a copy in a closely related molecule called messenger RNA, or mRNA. The mRNA is a single-stranded, temporary copy of the gene's instructions. This is the same type of molecule used in mRNA vaccines. Injecting mRNA gives cells a copy of instructions for building a specific protein. The cell's own machinery does the rest.
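At the level of letters, transcription is close to a straight copy. The sketch below is a simplification: it takes the gene's coding strand as input and swaps T for U, because RNA spells thymine as uracil. In the cell, RNA polymerase reads the opposite (template) strand and builds the mRNA by base pairing, which gives the same result. The function name is an assumption made for this example.

```python
def transcribe(coding_strand: str) -> str:
    """Return the mRNA letters for a gene, given its coding DNA strand.

    Simplified model: the mRNA matches the coding strand with U in
    place of T. Start signals, splicing, and other processing that
    real mRNA undergoes are ignored here.
    """
    dna = coding_strand.upper()
    unknown = set(dna) - set("ATGC")
    if unknown:
        raise ValueError(f"not DNA letters: {sorted(unknown)}")
    return dna.replace("T", "U")

print(transcribe("ATGAAATTTGAATAA"))  # AUGAAAUUUGAAUAA
```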
The mRNA copy is written in three-letter "words" called codons. There are sixty-four possible three-letter combinations using the four RNA letters (A, U, G, and C; RNA uses uracil instead of thymine). Sixty-one of these codons map to just twenty different amino acids, and the remaining three are stop signals. The genetic code is redundant: multiple codons can specify the same amino acid. Leucine, for example, has six. This redundancy looks like sloppy engineering, but it is useful. In vaccine design, scientists exploit it through codon optimization, choosing codons that the host's cellular machinery translates most efficiently, without changing the protein produced.
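The counting in this paragraph can be checked directly. The sketch below builds the standard genetic code from a compact 64-character string, one amino acid letter per codon, with `*` marking a stop. The variable names are assumptions made for this example.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"  # RNA letters, in the conventional codon-table order
# Standard genetic code: one letter per codon, in the order
# UUU, UUC, UUA, UUG, UCU, ... GGG. "*" marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
CODON_TABLE = dict(zip(codons, AMINO_ACIDS))

counts = Counter(CODON_TABLE.values())
stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
leucine_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "L")

print(len(CODON_TABLE))  # 64 possible codons (4 * 4 * 4)
print(len(counts) - 1)   # 20 amino acids, not counting "*"
print(stop_codons)       # ['UAA', 'UAG', 'UGA']
print(leucine_codons)    # the six codons for leucine (L)
```

Codon optimization works inside exactly this redundancy: any of the six leucine codons can be swapped for another without changing the protein.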
The second step is translation, happening on molecular machines called ribosomes. Ribosomes read the mRNA three letters at a time, and for each codon, a small adapter molecule called transfer RNA (tRNA) delivers the corresponding amino acid. The amino acids are linked one by one into a chain called a polypeptide. A typical protein is three hundred to five hundred amino acids long. When the ribosome reaches a stop codon, the chain is released and folds into the three-dimensional shape that gives the protein its function.
The connection to cancer and Rosie's story: if a somatic mutation in the DNA changes a gene's code, the mRNA copy carries that change, and the ribosome builds a slightly different protein. Sometimes the difference is harmless. Sometimes it produces a protein that does not fold correctly, does not function properly, or looks foreign to the immune system. That is the molecular basis of cancer, and it is the opening personalized cancer vaccines exploit.
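A toy example makes the chain of consequences concrete. The sketch below changes a single letter in a short, made-up gene (not a real human sequence) and translates the result. It skips the mRNA step by using a codon table written in DNA letters, and all names are assumptions made for this example.

```python
from itertools import product

# Standard genetic code with DNA letters; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna: str) -> str:
    """Translate codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

def mutate(dna: str, position: int, new_base: str) -> str:
    """Return a copy of dna with the letter at a 0-based position replaced."""
    return dna[:position] + new_base + dna[position + 1:]

normal = "ATGAAATTTGAATAA"          # toy gene, not a real human sequence
silent = mutate(normal, 11, "G")    # fourth codon GAA -> GAG
missense = mutate(normal, 10, "T")  # fourth codon GAA -> GTA

print(translate(normal))    # MKFE
print(translate(silent))    # MKFE (same protein, thanks to redundancy)
print(translate(missense))  # MKFV (one amino acid changed)
```

The first mutation is invisible at the protein level. The second swaps one amino acid for another, which is the kind of change that can alter how a protein folds or make it look foreign to the immune system.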
Try It: Codon Translator
Take a DNA sequence whose length is divisible by 3 and translate it codon by codon. A short one to start with: ATGAAATTTGAATAA
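If you would rather try this in code, here is a minimal sketch of such a translator in Python. It is not the implementation behind any particular tool: it assumes the standard genetic code, reads from the first letter, skips the mRNA step by using a table written in DNA letters, and stops at the first stop codon.

```python
from itertools import product

# Standard genetic code written with DNA letters (T instead of U),
# so sequences can be typed as DNA. "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna: str) -> str:
    """Translate a DNA sequence into one-letter amino acid codes.

    Reads three letters at a time and stops at the first stop codon,
    the point where a ribosome would release the chain.
    """
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be divisible by 3")
    protein = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        if codon not in CODON_TABLE:
            raise ValueError(f"unrecognized codon: {codon}")
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

print(translate("ATGAAATTTGAATAA"))  # MKFE
```

The output reads M (methionine), K (lysine), F (phenylalanine), E (glutamate). The final codon, TAA, is a stop signal, so it ends the chain without adding a letter.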
1.3: Cancer — When the Code Breaks
Cancer is a disease of the genome. It occurs when mutations accumulate in a cell's DNA in a way that overrides normal growth controls. Every cell has two types of growth-control genes: proto-oncogenes, which promote cell growth (the accelerator pedal), and tumor suppressor genes, which restrain it (the brakes). Cancer typically requires mutations in both. A proto-oncogene mutates into an oncogene that is stuck "on," and a tumor suppressor like TP53 (the "guardian of the genome") gets knocked out. The result is a cell that grows relentlessly, ignoring signals to stop.
Not all mutations in a cancer cell are equal. A typical tumor might harbor hundreds or thousands of mutations, but only a small handful, usually two to eight, actually drive the cancer. These are driver mutations. The rest are passenger mutations, accumulated randomly as the cancer cell divides. The distinction matters for treatment: you want to target the drivers, or at least mutations that produce proteins the immune system can recognize.
Different cancer types carry different numbers of mutations, a measure called tumor mutational burden, or TMB. Melanoma tends to have a high burden (ten to fifty mutations per megabase of DNA) because ultraviolet radiation is a potent mutagen. Pancreatic cancer typically has a low burden of one to five per megabase. More mutations generally mean more potential immune targets. Cancers with high TMB tend to respond better to immune-based treatments because they produce more abnormal proteins the immune system can learn to recognize.
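The "per megabase" unit is just a division: the number of mutations found, divided by how many million letters of DNA were examined. The sketch below shows the arithmetic. The two samples and the 30-megabase sequenced region are invented for illustration, not taken from real patients, and the function name is an assumption.

```python
def tumor_mutational_burden(mutation_count: int, megabases_sequenced: float) -> float:
    """Return mutations per megabase (one megabase = one million letters)."""
    if megabases_sequenced <= 0:
        raise ValueError("megabases_sequenced must be positive")
    return mutation_count / megabases_sequenced

# Hypothetical samples: (mutations found, megabases of DNA sequenced).
samples = {
    "melanoma-like sample": (900, 30.0),
    "pancreatic-like sample": (60, 30.0),
}

for name, (mutations, megabases) in samples.items():
    tmb = tumor_mutational_burden(mutations, megabases)
    print(f"{name}: {tmb:.1f} mutations per megabase")
```

With these made-up counts, the first sample comes out at 30 mutations per megabase and the second at 2, which places them in the high and low ranges described above.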
Those abnormal proteins are called neoantigens ("new antigens"), and they are the foundation of personalized cancer vaccines. A somatic mutation changes a gene, which changes a protein, and the cell chops that abnormal protein into fragments and displays them on its surface like a flag. To the immune system, that flag reads: something is wrong in here. A personalized cancer vaccine teaches the immune system to recognize those flags and destroy any cell displaying them. That is what Conyngham set out to do for Rosie: identify the flags her cancer was flying, and train her immune system to attack them.
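The "chopping into fragments" step can be sketched as a sliding window. Given a protein and the position of the amino acid changed by a mutation, the code below lists every fragment of a fixed length that contains the change. These are the candidate flags. The fragment length of nine is an assumption chosen for illustration, the protein sequence is made up, and real vaccine pipelines go on to predict which fragments a patient's cells would actually display. That step is not modeled here.

```python
def peptides_covering(protein: str, mutated_index: int, length: int = 9) -> list[str]:
    """Return every fragment of `length` letters that includes the mutated position.

    `mutated_index` is 0-based. Fragments that would run past either
    end of the protein are skipped.
    """
    first_start = max(0, mutated_index - length + 1)
    last_start = min(mutated_index, len(protein) - length)
    return [protein[s:s + length] for s in range(first_start, last_start + 1)]

# Made-up protein with one changed amino acid at index 10 (R -> W).
normal_protein = "MKTAYIAKQRRISFVKSHFSRQ"
mutant_protein = "MKTAYIAKQRWISFVKSHFSRQ"

candidates = peptides_covering(mutant_protein, mutated_index=10)
for fragment in candidates:
    print(fragment)
print(len(candidates), "candidate fragments contain the change")
```

Every fragment in the list differs from the normal protein by one letter, which is what makes it a possible target that healthy cells do not carry.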
Key Takeaways
- DNA is a 4-letter code (A, T, G, C); 3 billion letters per human cell, packed into every nucleus.
- Information flows one way: DNA → mRNA → protein (the Central Dogma). mRNA vaccines skip step one, transcription, by delivering the mRNA instructions directly; the cell's ribosomes then carry out step two, translation.
- Cancer is a disease of accumulated DNA mutations that override growth controls (accelerator stuck on, brakes knocked out).
- Neoantigens are the abnormal protein fragments a cancer cell displays — the targets personalized vaccines are designed to attack.