Genetic code expansion: how recoded genomes unlock new biology
Genetic code expansion uses whole-genome recoding to free up codons and assign them to amino acids that biology has never used, opening new routes to drug discovery and sustainable manufacturing.
Key takeaways
18,214 codons reassigned in Syn61, the first fully synthetic recoded E. coli genome
3 codons freed for reassignment to non-canonical amino acids
Up to 13,000 kg of waste generated per kg of peptide by traditional solid-phase synthesis
4 million base pairs in the Syn61 genome, the largest fully synthetic recoded genome published at the time of creation
Genetic code expansion is the process of reprogramming how a living cell reads its DNA, so that it can build proteins using amino acids beyond the 20 that the natural genetic code specifies. The result is a biological manufacturing platform with capabilities not found in natural organisms.
What is genetic code expansion?
Almost every living cell translates DNA into protein using the same rulebook: 64 three-letter sequences called codons, of which 61 map to just 20 amino acids and three signal the end of a protein. There is significant redundancy in this system: multiple codons encode the same amino acid. Genetic code expansion exploits that redundancy. By removing surplus codons and reassigning them, it becomes possible to direct the cell's own ribosome to incorporate entirely new building blocks at precise positions in a protein chain.
Why biology uses 64 codons for 20 amino acids
The genetic code is degenerate: most amino acids are specified by two to six different codons (only methionine and tryptophan have a single codon each). This redundancy buffers against point mutations, but it also means that several codons are available for reassignment if the organism can be engineered to no longer need them.
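The degeneracy described here can be checked directly. The following Python sketch builds the standard codon table and counts how many codons specify each amino acid. The table is textbook knowledge rather than something this article spells out, and the names CODON_TABLE, degeneracy and sense_codons are illustrative choices.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each of the 64 codons (e.g. "TCG") to its one-letter amino acid code.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of codons per amino acid (or per stop signal).
degeneracy = Counter(CODON_TABLE.values())

# Codons that encode an amino acid rather than a stop.
sense_codons = sum(n for aa, n in degeneracy.items() if aa != "*")

# Print the most redundant amino acids first.
for aa, n in sorted(degeneracy.items(), key=lambda item: -item[1]):
    print(aa, n)
print("sense codons:", sense_codons)
```

Serine (S), leucine (L) and arginine (R) each have six codons, which is why serine codons such as TCG and TCA are natural candidates for removal: the amino acid can still be encoded by the remaining four.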
Three ways to expand the genetic code
The first approach, amber suppression, repurposes the TAG stop codon by adding a new transfer RNA that recognises it. This works, but it competes with the cell's own termination machinery and is limited in practice to a single codon and a single new amino acid.
The second approach introduces entirely new codon types: quadruplet codons (four letters instead of three) or codons built from unnatural DNA bases. These methods face challenges around efficiency, crosstalk with the existing translation system, and scalability.
The third approach, code compression, is fundamentally different. Instead of competing with the native system, it rewrites the genome so that the native system no longer uses certain codons at all. Those codons are then free to be assigned to new amino acids with dedicated, orthogonal translation machinery. There is no competition with native processes, and the number of available codons is limited only by how many can be freed.
Why code compression changes the game
In plain terms: code compression is like clearing lanes on a motorway. Instead of squeezing new traffic into existing lanes (amber suppression) or building an entirely separate road (quadruplet codons), you remove vehicles that are duplicating routes and open up dedicated lanes for new cargo. The existing traffic flows exactly as before, and the new lanes are exclusively available for whatever you choose to put in them.
This is why code compression is the most scalable route to genetic code expansion. It works with the cell's own translation machinery rather than against it, and it can in principle free multiple codons simultaneously for multiple new amino acids.
What is Syn61?
In 2019, the Chin laboratory at the MRC Laboratory of Molecular Biology created Syn61: a fully synthetic E. coli genome spanning 4 million base pairs. Across the entire genome, 18,214 instances of three codons (TAG, TCG, TCA) were replaced with synonymous alternatives. The result is a living bacterium that reads the genetic code differently from every other species on Earth, with three codons now free for reassignment.
Syn61 was the first organism to demonstrate whole-genome codon compression. It is the technological foundation of Constructive Bio's platform.
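The core idea of synonymous recoding can be sketched in a few lines of Python. This is a toy model, not the published design workflow: the specific replacements in RECODING (TCG to AGC, TCA to AGT, TAG to TAA) are an assumption chosen because they are synonymous under the standard code, the function names are hypothetical, and the sketch ignores real-genome complications such as overlapping genes and regulatory sequences.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# ASSUMPTION: illustrative synonymous replacements, not taken from this article.
RECODING = {"TCG": "AGC", "TCA": "AGT", "TAG": "TAA"}

# Every replacement must encode the same amino acid (or stop) as the original.
assert all(CODON_TABLE[old] == CODON_TABLE[new] for old, new in RECODING.items())


def split_codons(cds):
    """Split an in-frame coding sequence into three-letter codons."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def recode(cds, mapping=RECODING):
    """Replace each target codon with its synonymous alternative."""
    return "".join(mapping.get(codon, codon) for codon in split_codons(cds))


def translate(cds):
    """Translate codon by codon with the standard table ('*' = stop)."""
    return "".join(CODON_TABLE[codon] for codon in split_codons(cds))


original = "ATGTCGAAATCATAG"  # toy gene: Met-Ser-Lys-Ser-stop
recoded = recode(original)

# The protein is unchanged, and none of the target codons remain.
assert translate(recoded) == translate(original)
assert not set(split_codons(recoded)) & set(RECODING)
print(original, "->", recoded)
```

Applied across every gene in a genome, the same substitution leaves each protein unchanged while removing all in-frame uses of the target codons, which is what makes them available for reassignment.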
Why recoded genomes matter for manufacturing and biosecurity
Syn61 provides three practical advantages for biomanufacturing.
Phage resistance: because Syn61's genetic code is incompatible with natural viruses, it is resistant to bacteriophage infection. Phage contamination is a persistent and costly problem in industrial fermentation, capable of destroying entire production runs. Syn61 tackles this risk at the genetic level.
Genetic isolation: horizontal gene transfer between Syn61 and wild organisms is blocked in both directions. Genes entering Syn61 from the environment will be misread, and Syn61's own genetic material is non-functional in natural organisms. This creates a built-in biocontainment layer without external safeguards.
New chemistry: the freed codons are available for encoding non-canonical amino acids (ncAAs), enabling the production of peptides and proteins with chemical properties not found in nature. This is directly relevant to drug discovery (site-specific antibody-drug conjugates, protease-resistant peptides), industrial enzymes (enhanced catalytic properties), and sustainable manufacturing.
Why Syn61 matters commercially
Traditional peptide manufacturing relies on solid-phase peptide synthesis (SPPS), a chemical process estimated to generate up to 13,000 kg of waste per kg of peptide produced, much of it organic solvents. Fermentation-based production using recoded cells offers a route to replacing this process for many peptide classes, with lower cost, higher fidelity, and dramatically less waste.
The technology rests on published research in Nature, Science, and Nature Chemistry spanning 2019 to 2024, including whole-genome synthesis, sense codon reassignment, and the incorporation of multiple distinct ncAAs into single proteins.
Frequently asked questions
What is a recoded genome? A recoded genome is one in which redundant codons have been systematically replaced with synonymous alternatives across the entire DNA sequence. This frees specific codons for reassignment to new functions, such as encoding non-canonical amino acids.
Why does recoding prevent phage infection? Bacteriophages rely on the host cell's translation machinery to reproduce. When the host reads the genetic code differently, viral genes are mistranslated, producing non-functional proteins. The virus cannot replicate.
Why does recoding help ncAA incorporation? In a standard organism, all 64 codons are in use. There is no "spare" codon to assign to a new amino acid without competing with an existing one. Recoding removes specific codons from the genome entirely, creating dedicated channels for new amino acids with no competition from native translation.
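The last two answers can be illustrated with a toy translation model in Python. It is a sketch under stated assumptions: the three freed codons are all read as a single placeholder residue "X" standing in for a non-canonical amino acid, whereas how a real recoded host handles these codons depends on which tRNAs and release factors have been removed or added, which this article does not detail. The gene sequences and names such as COMPRESSED and translate are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

FREED = ("TCG", "TCA", "TAG")
# ASSUMPTION: in the recoded host, every freed codon is read as placeholder 'X'.
COMPRESSED = {**STANDARD, **{codon: "X" for codon in FREED}}


def translate(cds, code):
    """Read in-frame codons with the given code, stopping at the first stop."""
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        residue = code[cds[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


host_gene = "ATGAGCAAAAGTTAA"   # recoded gene: contains no freed codons
phage_gene = "ATGTCGAAATCATAG"  # same protein, written with the natural code

# The recoded host gene reads identically under both codes.
print(translate(host_gene, STANDARD), translate(host_gene, COMPRESSED))

# The phage gene is misread: serines become 'X' and the TAG stop is read through.
print(translate(phage_gene, STANDARD), translate(phage_gene, COMPRESSED))
```

Because the host's own genes never use the freed codons, changing what those codons mean leaves host proteins untouched, while any incoming gene written in the natural code is misread wherever it uses them.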
Related peer-reviewed research
Syn61: Total Synthesis of E. coli with a Fully Recoded Genome - The Foundation of Constructive Bio's Platform
Fredens, J., Wang et al. — Nature 569(7757), 514–518 (2019)
Syn57: E. coli Engineered with the Most Compressed Genetic Code Ever Created, Freeing Seven Codons for ncAA Incorporation
Robertson, W.E., Rehm et al. — Science 390, eady4368 (2025)
Sense Codon Reassignment Enables Virus-Resistant Production Strains and Encoded Non-Natural Polymer Synthesis
Robertson, W.E., Funke et al. — Science 372(6546), 1057–1062 (2021)
Genetic Code-Locking Gives Recoded Organisms Stable Virus Resistance, Eliminating a Key Biomanufacturing Risk
Zürcher, J.F., Dickson et al. — Biochemistry 64, 3093 (2025)
Refactored Genetic Codes Create Bidirectional Genetic Isolation for Biocontained Industrial Production Organisms
Zürcher, J.F., Robertson et al. — Science 378, 516 (2022)
