Nature creates diversity either by recombining pieces of DNA into new sequences (homologous recombination, exon shuffling, V(D)J recombination in B cells) or by point mutations (somatic hypermutation, mutagens, replication errors, etc.). Directed evolution has likewise developed many protocols to generate diversity at the gene level. As in nature, mutagenesis methods in directed evolution fall into two classes: recombination methods that combine properties encoded in similar genes (DNA shuffling methods, StEP, RACHITT, SCOPE, etc.) and point mutation methods that exchange 1-3 amino acids at a time in a given DNA sequence (SeSaM, error-prone PCR (epPCR) methods, mutator strain methods, chemical mutagenesis methods, etc.).
SeSaM (Sequence Saturation Mutagenesis) is a patented four-step method (see figure on top left) that represents a breakthrough in directed evolution. The four steps are:
- Step 1: A PCR to incorporate phosphorothioate nucleotides into the gene of interest followed by cleavage of the phosphorothioate bonds to create evenly distributed fragments of the gene.
- Step 2: Elongation of the fragments from Step 1 with universal bases at the 3'-OH terminus
- Step 3: Fragments with the universal bases are elongated to their full length using the full-length gene as template
- Step 4: Mutations are generated in a final PCR to create mutant libraries that are ready for cloning
At present, we use dPTP as the universal base, targeting all the G's and A's in both the upper and lower strands to form four libraries: G-forward, A-forward, G-reverse and A-reverse. Because a G or A on the lower strand pairs with a C or T on the upper strand, all the G's, A's, C's and T's in the coding strand are equally targeted across these libraries. After the last step, the four libraries are pooled to give a complete SeSaM library.
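The strand logic behind the four libraries can be checked with a small Python sketch. It assumes that a "forward" library addresses the coding (upper) strand and a "reverse" library the complementary strand; the function name `coding_strand_target` is hypothetical and only serves the illustration.

```python
# Watson-Crick base pairing.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# The four SeSaM libraries named in the text.
LIBRARIES = ["G-forward", "A-forward", "G-reverse", "A-reverse"]


def coding_strand_target(library):
    """Return the coding-strand base addressed by a library.

    Assumption (not stated in the source): 'forward' targets the coding
    strand directly, 'reverse' targets the complementary strand, so the
    affected coding-strand base is the complement of the targeted base.
    """
    base, direction = library.split("-")
    return base if direction == "forward" else COMPLEMENT[base]


covered = {coding_strand_target(lib) for lib in LIBRARIES}
print(sorted(covered))  # ['A', 'C', 'G', 'T']
```

Under this assumption, targeting only purines on both strands is enough to address every base of the coding strand.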
The quality of a mutant library is crucial for the success of a directed evolution experiment and decisive for the timeline of a directed evolution project. Commonly used mutagenesis methodologies suffer from three fundamental problems:
- Bias of the DNA polymerase or mutagenic agent, which results in mutagenic 'hot spots'
- Bias of DNA polymerases towards transitions, which leads to an increased share of silent mutations
- Lack of subsequent (consecutive) mutations, which reduces the diversity generated
SeSaM: libraries with no 'hot spots'
'Hot spots' result from the preference of DNA polymerases for certain nucleotide exchanges. Moreover, because epPCR methods rely on PCR amplification, nucleotide exchanges introduced during the first cycles are amplified in all subsequent cycles and become overrepresented in the resulting library. In turn, overrepresentation of some gene variants means that more clones must be screened to ensure that the less represented variants are also sampled.
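The amplification effect can be illustrated with a deliberately idealized model, sketched here in Python. It assumes perfect doubling in every cycle and a mutation that arises in exactly one newly synthesized molecule; real epPCR deviates from both assumptions, and the function name `final_fraction` is hypothetical.

```python
from fractions import Fraction


def final_fraction(cycle_arisen, start_templates=1):
    """Share of the final pool that carries a given mutation.

    Idealized assumptions: every molecule is copied once per cycle, and the
    mutation appears in a single new molecule during `cycle_arisen`. After
    that cycle the pool holds start_templates * 2**cycle_arisen molecules,
    one of which is the mutant. Mutant and pool both double from then on,
    so the share no longer changes with further cycles.
    """
    return Fraction(1, start_templates * 2 ** cycle_arisen)


for cycle in (1, 5, 10, 20):
    print(cycle, float(final_fraction(cycle)))
```

In this model an exchange introduced in cycle 1 ends up in half of all molecules, whereas one introduced in cycle 10 is found in only 1 of 1024, which is the kind of imbalance that inflates the screening effort.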
Attempts have been made to avoid the 'hot spots' problem by combining enzymatic methods with chemical approaches, but these also face limitations. For example, cassette mutagenesis is a powerful tool for rational design since it can randomize up to 8 amino acids. However, this method is limited by the low number of amino acids that can be randomized. RID (random insertion deletion) is one of the few methods available that avoids the mutagenic 'hot spots'. However, it involves 5 ligation steps and is therefore very laborious. Other methods that overcome the 'hot spots' problem are RAISE (Random Insertional-Deletional Strand Exchange); epPCR with unbalanced dNTP and Mn2+ concentrations followed by PCR amplification in the presence of the base analogue 8-hydroxy-dGTP; mutagenesis by the RNA-dependent RNA polymerase 'Q-beta replicase', coupled with ribosome display technology to carry out mutagenesis and screening in one reaction; and finally, PCR amplification employing error-prone human X and Y family polymerases that show complementing biases, at the expense of an increased insertion and deletion frequency. Although they reduce the problems of classical methods, these newer methods suffer from either an uncontrollable mutation frequency or the inability to introduce subsequent mutations.
SeSaM was first developed as a mutagenesis method that can overcome the limitations caused by the mutational spectra bias of epPCR methods. This goal was achieved by targeting each nucleotide species and then exchanging it in a controlled manner. The first milestone in developing SeSaM was a diversity generation method with reduced 'hot spots', as shown by the uniform distribution of mutations in the SeSaM library of the EGFP (enhanced green fluorescent protein) gene (right figure).
SeSaM libraries have no 'hot spots', since all bases are equally exchanged for universal bases and the nucleotide exchanges are not amplified during the protocol.
SeSaM: transversion-enriched libraries
A theoretical paper showed that a transition (Ts)-biased library generates about half the diversity of a transversion (Tv)-biased library: a Ts-biased library covers only 11% of the effective protein space, whereas a Tv-biased library reaches 21.5%. Moreover, a Ts-biased library contains 34.9% silent mutations, compared with only 15.3% in a Tv-biased library. In addition, the average number of unique amino acid substitutions per protein position is 2.2 in a Ts library but as high as 4.7 in a Tv library.
Driven by this theoretical assessment and the desire for a diversity generation method complementary to the current transition (Ts)-biased epPCR methods, the next milestone in the development of SeSaM was reached. Because SeSaM regulates the mutational spectrum through a universal base, a transversion (Tv)-biased mutational spectrum could be introduced, as reported for SeSaM-Tv+.
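For readers less familiar with the terminology, the following Python sketch classifies every possible nucleotide exchange as a transition (purine to purine, or pyrimidine to pyrimidine) or a transversion (purine to pyrimidine, or the reverse). It only illustrates the definitions; it does not reproduce the statistics quoted above, and the function name `exchange_type` is hypothetical.

```python
from itertools import permutations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def exchange_type(original, mutated):
    """Classify a single nucleotide exchange as transition or transversion."""
    if original == mutated:
        raise ValueError("original and mutated base must differ")
    pair = {original, mutated}
    # A transition stays within one chemical class of bases.
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


counts = {"transition": 0, "transversion": 0}
for original, mutated in permutations("ACGT", 2):
    counts[exchange_type(original, mutated)] += 1

print(counts)  # {'transition': 4, 'transversion': 8}
```

Of the 12 possible exchanges, 8 are transversions, so a method biased towards transitions draws on only a third of the available exchange types.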
SeSaM: unseen diversity through consecutive nucleotide exchanges
The number of subsequent mutations is an aspect often disregarded when the quality of a mutagenesis method is judged. However, when only one nucleotide exchange per codon occurs, a maximum of 9 amino acid substitutions is possible. With two nucleotide exchanges in a codon, on the other hand, there are 45 possibilities for exchanging the amino acid for another. This is summarized for the TTA (Leu) codon in the left figure for one nucleotide exchange per codon and in the figure below for two nucleotide exchanges in the same codon.
With one nucleotide exchange per codon (figure above), four amino acid exchanges (Phe, Ser, Val and Ile) are accessible out of 19 possible ones. If two nucleotide exchanges per codon are achieved (left figure), 16 of the 19 amino acid substitutions become possible.
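The TTA example can be reproduced with a short Python sketch that enumerates codon variants using the standard genetic code. The helper names `variants` and `reachable_amino_acids` are hypothetical, and "reachable with two exchanges" is taken here to mean one or two exchanges in the codon, excluding stop codons and the original amino acid.

```python
from itertools import combinations, product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def variants(codon, n_exchanges):
    """All codons that differ from `codon` at exactly n_exchanges positions."""
    result = set()
    for positions in combinations(range(3), n_exchanges):
        choices = [[b for b in BASES if b != codon[p]] for p in positions]
        for new_bases in product(*choices):
            mutated = list(codon)
            for p, b in zip(positions, new_bases):
                mutated[p] = b
            result.add("".join(mutated))
    return result


def reachable_amino_acids(codon, max_exchanges):
    """Amino acids reachable with 1..max_exchanges nucleotide exchanges."""
    original = CODON_TABLE[codon]
    reached = set()
    for n in range(1, max_exchanges + 1):
        for variant in variants(codon, n):
            aa = CODON_TABLE[variant]
            if aa != "*" and aa != original:  # skip stops and silent changes
                reached.add(aa)
    return reached


print(sorted(reachable_amino_acids("TTA", 1)))  # ['F', 'I', 'S', 'V']
print(len(reachable_amino_acids("TTA", 2)))     # 16
```

For TTA, the three amino acids that remain out of reach (Asp, Asn and His) are those whose codons all differ from TTA at all three positions.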
In the figure below, the same analysis done for the TTA codon is done for all the codons and averaged for each amino acid in the form of a substitution pattern. The figure was drawn according to the calculations done in  using the layout and concept from .
The left panel shows the substitution pattern generated by a non-biased mutagenesis method that exchanges one nucleotide per codon; the right panel shows the pattern for a mutagenesis method that generates two nucleotide exchanges per codon. The Y-axis shows the original amino acid species and the X-axis shows the substitution pattern. The substitution pattern for the 20 amino acid species is indicated on a gradient scale from light grey (lowest probability) to dark green (highest probability). Amino acid substitutions that do not occur are colored white. The figure shows that one nucleotide exchange per codon covers far less of the protein space (39.2%) than two mutations per codon (92.6%). In addition, to visualize what happens for a real enzyme, the figure below shows the number of amino acid substitutions for each amino acid position in IgG B12.
A mutagenesis method that produces more than one nucleotide exchange per codon improves on the classically available diversity generation methods. For this reason, new methods were recently developed to overcome the absence of subsequent mutations. These include the TriNEx (Trinucleotide exchange) method and TIM (Transposon Integration mediated Mutagenesis). However, these methods do not have a controllable mutation frequency and are not technically simple and robust. Moreover, TIM shows sequence-dependent integration preferences, resulting in an unequal distribution of mutations within the targeted sequence.
Taking into consideration the theoretical assessment of protein space expansion due to two nucleotide exchanges per codon, the SeSaM method was further optimized to encompass consecutive nucleotide exchanges. During the development of SeSaM-Tv+, consecutive nucleotide exchanges in a codon were observed. Next, the SeSaM protocol was optimized towards an increased number of consecutive nucleotide exchanges (unpublished data).
To conclude, SeSaM's main advantages are:
- No 'hot spots'
- Unique control over mutational bias
- Increased consecutive mutations leading to unexplored protein variant diversity
- Increased transversions leading to complementary mutations to the conventional epPCR methods
- Simple and fast library generation
Patent file: US7790374, patent protection established in over 13 countries
1 - Wong TS, Zhurina D, Schwaneberg U (2005) The diversity challenge in directed protein evolution. Comb Chem High Throughput Screen, 9, p271-88.
2 - Kegler-Ebo DM, Docktor CM, DiMaio D (1994) Codon cassette mutagenesis: a general method to insert or replace individual codons by using universal mutagenic cassettes. Nucleic Acids Res, 22, p1593-9.
3 - Shivange AV, Marienhagen J, Mundhada H, Schenk A, Schwaneberg U (2009) Advances in generating functional diversity for directed protein evolution. Curr Opin Chem Biol, 13, p19-25.
4 - Wong TS, Roccatano D, Schwaneberg U (2007) Steering directed protein evolution: strategies to manage combinatorial complexity of mutant libraries. Environ Microbiol, 9, p2645-59.
5 - Wong TS, Tee KL, Hauer B, Schwaneberg U (2004) Sequence Saturation Mutagenesis (SeSaM): A novel method for Directed Evolution. Nucleic Acids Res, 32, p1-8.
6 - Wong TS, Roccatano D, Schwaneberg U (2007) Are transversion mutations better? A Mutagenesis Assistant Program analysis on P450 BM-3 heme domain. Biotechnol J, 2(1), p133-42.
7 - Wong TS, Roccatano D, Loakes D, Tee KL, Schenk A, Hauer B, Schwaneberg U (2008) Transversion-enriched sequence saturation mutagenesis (SeSaM-Tv+): a random mutagenesis method with consecutive nucleotide exchanges that complements the bias of error-prone PCR. Biotechnol J, 3, p74-82.
8 - Wong TS, Roccatano D, Schwaneberg U (2007) Challenges of the genetic code for exploring sequence space in directed protein evolution. Biocatalysis and Biotransformation, 25, p229-241.
9 - Wong TS, Roccatano D, Zacharias M, Schwaneberg U (2006) A Statistical Analysis of Random Mutagenesis Methods Used for Directed Protein Evolution. Journal of Molecular Biology, 355, p858-71.
10 - Baldwin AJ, Busse K, Simm AM, Jones DD (2008) Expanded molecular diversity generation during directed evolution by trinucleotide exchange (TriNEx). Nucleic Acids Res, 36, e77.
11 - Hoeller BM, Reiter B, Abad S, Graze I, Glieder A (2008) Random tag insertions by Transposon Integration mediated Mutagenesis (TIM). J Microbiol Methods, 75, p251-257.