Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

Index

Table of Contents
minLevel1
maxLevel

...

2
outlinefalse
styledefault
excludeIndex
typelist
printabletrue

General

For this exercise, you need either

...

PyMol or

...

YASARA installed that has the 3DM plugin. If you don't have Yasara or Pymol or you are missing the 3DM functionality, please consult the installation instructions. Before you start this exercise, make sure you have the latest version of Yasara or Pymol installed.

Login at 3DM with your 3DM account. If you don't have a 3DM account you can request one via the "get 3DM" tab.

After entering the login details

...

Image Removed

Panel
borderColor#C8E6C9
bgColor#C8E6C9

At the starting page of each 3DM database, you see the 3DM data cycle. The icons in the circle represent links to the most important 3DM options. These options are also available on the left.

Introduction

...

on the login page, you land on the 3DM dashboard page. Here, you can find an overview of all available systems. In the 3DM COURSE tab, click on the Phosphoenolpyruvate Mutase / Isocitrate Lyase system. This is the 3DM system we will be working with during this course.

Info

In case you have any questions about this course, please get in touch with our support team via support@bio-prodict.com.

Introduction

Fungi can be pathogenic to plants and animals. It is known that the secretion of oxalate by fungi is a commonly used strategy for their pathogenicity. Oxalate is toxic and can form crystals that demolish the cell wall of the host. The oxalate is produced from oxaloacetate, catalyzed by the enzyme oxaloacetate hydrolase (OAH). This is the reaction:

Image Removed

...

We have generated a 3DM for the corresponding protein family. OAH falls in the Phosphoenolpyruvate mutase/Isocitrate lyase superfamily.

...

The OAH of niger is the best characterized OAH protein. This is the sequence:

Code Block
>G3Y473
			MKVDTPDSASTISMTNTITITVEQDGIYEINGARQEPVVNLNMVTGASKLRKQLRETNEL
			LVCPGVYDGLSARIAINLGFKGMYMTGAGTTASRLGMADLGLAHIYDMKTNAEMIANLDP
			YGPPLIADMDTGYGGPLMVARSVQQYIQAGVAGFHIEDQIQNKRCGHLAGKRVVTMDEYL
			TRIRAAKLTKDRLRSDIVLIARTDALQQHGYDECIRRLKAARDLGADVGLLEGFTSKEMA
			RRCVQDLAPWPLLLNMVENGAGPVISVDEAREMGFRIMIFSFACITPAYMGITAALERLK
			KDGVVGLPEGMGPKKLFEVCGLMDSVRVDTEAGGDGFANGV

For each protein in the 3DM database, there is a

...

protein information

...

page that contains more detailed information.

Infopanel
iconfalse
titlepanelIconIdatlassian-question_mark
panelIcon:question_mark:
panelIconText:question_mark:
bgColor#E6FCFF

Question 1

: Find the protein information page of the sequence above using the search option of 3DM. What is the core identity of this protein?

Expand
titleAnswer

There are multiple ways to find this protein page. In the

quick

search

(you can find box this just above the green bar in 3DM) you can use "G3Y473” or you can simply search for G3Y473 in the keyword search tab

bar in the Analyze a protein window on the System’s start page you can search for an accession, identifier or keyword. In this case, you would enter "G3Y473”. Alternatively, you can go to Search > Proteins by keyword, or Proteins by BLAST and enter the protein identifier or sequence. G3Y473 (G3Y473_ASPNA) is the top hit and has a core identity of 77%.

On the protein information pages, you can find a couple of different tabs. Have a quick look at what you can find in each tab.

Subsets

3DM offers several ways to select a subset of sequences. Once a subset is selected a mini 3DM can be generated for this subset.

...

As we have demonstrated in Introduction: 3DM applied to the nuclear receptors, all 3DM functionalities, such as the correlated mutations analysis, are regenerated and can separately be

...

inspected. The data of a subset can also be compared to the data of the full set of sequences or with other previously defined subsets.

With the search option

...

you can create a subset

...

that contains the proteins that are available in this 3DM system for fungi of which it is known that they can produce oxalate. We have created such a subset and called it “oxalate producers”. You can find it under Subsets inside the 3DM System. The subset contains 33 proteins from the following species:

Code Block
				Aspergillus clavatus
				Neosartorya fischeri
				Penicillium chrysogenum
				Penicillium marneffei
				Talaromyces stipitatus
				Sclerotinia sclerotiorum
				Aspergillus niger
				Sclerotium cepivorum
				Aspergillus terreus
				Aspergillus fumigatus
				Botryotinia fuckeliana
Do you understand how we generated this subset (e.g. which search options we combined in the subset window → don't actually do this yourself, guest accounts cannot make subsets)
Infopanel
iconfalse
titleQuestion 2
panelIconIdatlassian-question_mark
panelIcon:question_mark:
panelIconText:question_mark:
bgColor#E6FCFF

Question 2: How would you create such a subset, using the 3DM System?

Expand
titleAnswer

In the keyword search tab of the search option, you can select species. Here you can type the species names. We have separately searched for each of the species and the resulting proteins were added to the subset window by clicking the + signs of the subset window. Try to search yourself for Aspergillus clavatus in the species search options. You will find two proteins (A1CFP3 and A1CMM8). These are the first two proteins of the subset. You can get a list of the proteins that are in a subset by clicking in the subset window on the number (in this case 33) that indicates how many proteins are in the subset.

Info
iconfalse
titleQuestion 3

Do you understand why we generated this subset? subset name which is displayed in green.

Panel
panelIconIdatlassian-question_mark
panelIcon:question_mark:
panelIconText:question_mark:
bgColor#E6FCFF

Question 3: What do you think we can learn from this subset?

Expand
titleAnswer

A subset is a mini 3DM made of only the sequences that are defined in the subset. All features and data types are re-calculated and can be compared to data in other subsets of or to the whole superfamily. Here, we want to find the proteins/residues responsible for oxalate production in oxalate-producing fungi. We therefore try to find things (e.g. residues) the residues that are specific to the subset.

...

Navigate to the Alignment statistics page. Then, change the subset at the top of the page from Full dataset to oxalate producers and notice how the graphs change (Figure 2).

...

3DM always generates an extra histogram for each subset that shows

...

which residues

...

are specifically conserved in the selected subset (the histogram called

...

Subset specific conserved residues

...

). The highest scoring residues in our subset are around 3D positions 157.

...

It is important to

...

realise that these are positions that are not just

...

conserved in this subset of oxalate producing fungi,

...

the corresponding residues are also absent from the rest of the sequences in the superfamily. In other words, these residues are specific for this subset.

You can see this by comparing this plot with the amino Amino acid conservation plot of the new subset. Use the "custom plot" tab and select your subset from In the CUSTOM PLOTS tab, select the oxalate producers subset in the left box and from in the right box "amino select Amino acid conservation " and "subset Subset specific conserved residues ".

...

iconfalse
titleQuestion 4

(Figure 3). Then, click Generate.

...

Panel
panelIconIdatlassian-question_mark
panelIcon:question_mark:
panelIconText:question_mark:
bgColor#E6FCFF

Question 4: How many positions are 100% conserved? And how How many of those are specific for the oxalate-producing fungi?

Expand
titleAnswer

You can put the slider bar of the amino acid conservation plot on 100%. You will see that there are

47

53 positions 100% conserved in the oxalate producers subset (

see figure

Figure 4).

Image Removed

This figure shows what you should have selected. Here the conservation cut-off was set at 100%. Clearly,

Scrolling down you can see in the Subset specific conserved residues plot that the subset-specific conserved residues are found mainly around position 157.

image-20241023-101329.pngImage Added

...

Navigate to the

...

Alignment page and click on the consensus sequence at position 157.

Infopanel
iconfalse
titlepanelIconIdatlassian-question_mark
panelIcon:question_mark:
panelIconText:question_mark:
bgColor#E6FCFF

Question 5

: What is the most conserved residue at this position in the "oxalate the oxalate producers " subset?

Expand
titleAnswer

The most conserved residue is S157 → 100% in the subset.

Info
iconfalse
titleQuestion 6

And what (12 sequences containing S, 1 sequence has a gap at this position).

Panel
panelIconIdatlassian-question_mark
panelIcon:question_mark:
panelIconText:question_mark:
bgColor#E6FCFF

Question 6: What is the percentage of this residue in the full alignment (hint: Use the subset menu in the header to change between your subset and the full dataset).?

Expand
titleAnswer

S157 → 1.06% in the full alignment.

Info
iconfalse
titleQuestion 7

So what To see this, first change the subset back to Full dataset at the top of the page, as in Figure 2. In the graph you can now see that only 0.91% of the sequences in the full alignment has S157.

Panel
panelIconIdatlassian-question_mark
panelIcon:question_mark:
panelIconText:question_mark:
bgColor#E6FCFF

Question 7: What is the difference? Do you understand the "subset Subset specific conserved" plot from the previous question?

Expand
titleAnswer

Difference = 9899.94%09%. In the "subset Subset specific conserved" plot the number is 99,38 .54% (you can see this by putting your mouse over the peak of the subset specific conserved plot at position 157). The difference between these numbers comes from the fact that in the full alignment the serine's serines of the oxalate producers subset are included. The "subset Subset specific conserved residues" plot calculates the difference between the conservation in the subset minus the conservation of the full set, but then without the sequences of the subset.

Take home message: the
Panel
borderColor#C8E6C9
bgColor#C8E6C9
Info

The data you are looking at is always depending on the subset tab that is selected.

...

Go the Correlated mutations

...

page via the menu on the left. Make sure you have

...

Full Dataset

...

selected as Subset at the top of your 3DM page.

Correlated mutations calculated for a superfamily alignment often reflect positions that are important for specificity, because superfamily alignments contain enzymes with different specificities

...

iconfalse
titleQuestion 8

.

Panel
panelIconIdatlassian-question_mark
panelIcon:question_mark:
panelIconText:question_mark:
bgColor#E6FCFF

Question 8: Explain this concept.

Expand
titleAnswer

The proteins in a superfamily usually form groups of different specificities. Within a group, the residues that are important for specificity are conserved, but between the groups , they mutate. Since they all mutate simultaneously between the groups they result in a correlated mutation network. Note: which protein feature is behind a correlated mutation network heavily depends on the input alignment. If you make a subset of enzymes that all have the same specificity the correlated mutation will, of course, not reflect changes in specificity and thus the network will not be composed of specificity hotspots. The concept of how to choose the input alignment is explained in more detail later in this practicalcourse.

...

Now open the CORRELATION HEATMAP tab. The heatmap shows the alignment positions of which the residues mutate simultaneously (

...

correlated mutation).

Infopanel
iconfalse
titleQuestion 9
panelIconIdatlassian-question_mark
panelIcon:question_mark:
panelIconText:question_mark:
bgColor#E6FCFF

Question 9: Which position is the highest correlating?

Expand
titleAnswer

Position 157 is the highest correlating position. Note that you can click on the heatmap. This Doing so will lead to open plots showing that show the amino acid distribution of the two corresponding positions. Those plots show the co-occurrence of amino acids. This data can be use used to see if it might be better to make certain double mutants instead of single mutants.

...

Go back to the CORRELATION NETWORKS tab. Enter the keyword “specificity” in the Literature & Mutations search box. This will select mutations from the literature that affect specificity, reported in any of the proteins of the superfamily.

Infopanel
iconfalse
titleQuestion 10

Do you see why it is so handy to have the 3D numbers in the network?

Expand
titleAnswer

It enables to the plot of any other data type of which the 3D number is known on the network. This is one of the strong features of 3DM. All data and all tools are connected via 3D numbers.

Which residue positions are reported to
Info
iconfalse
titleQuestion 11
panelIconIdatlassian-question_mark
panelIcon:question_mark:
panelIconText:question_mark:
bgColor#E6FCFF

Question 10: Which residue positions are reported to affect specificity and which one is the most published position in relation to specificity?

Expand
titleAnswer

Giving the keyword specificity in the search box results in 5 papers of position 157.

Info
iconfalse
titleQuestion 12

3DM calculates an enrichment score for a given keyword. What do you think this enrichment score meansPosition 157, with 2 mutations. You can extract this from the search box results.

Go back to the Alignment statistics page. Now compare the ligand contact plot (in this case these will be enzyme inhibitors) of the full dataset with the correlation plot of the full dataset.

Panel
panelIconIdatlassian-question_mark
panelIcon:question_mark:
panelIconText:question_mark:
bgColor#E6FCFF

Question 11: If you would add these plots, what would be the highest scoring position? What does this mean?

Expand
titleAnswer

The enrichment score is the factor that shows how many more times mutations related to the keyword are found in the network compared to positions outside the network. The enrichment score for specificity in the OAH network is 7.39. This means that there are 7.39 times more mutations affecting specificity published at positions in the network. Thus it is likely that specificity is the evolutionary pressure that caused the positions in the network to mutate simultaneously.

Note that an enrichment score of >4 or 5 normally is significant.

  • Go back to the alignment statistics page.
  • Now compare the ligand contact plot (in this case these will be enzyme inhibitors) of the full dataset with the correlation plot of the full dataset
Info
iconfalse
titleQuestion 13

If you would add these plots what is the highest scoring position? What does this mean?

Note: Protein residues can, sort of, be divided into three groups.

1. The conserved

Position 157 scores highest when both numbers are added. Positions that both make a contact with a ligand but also show correlated mutation behaviour are likely hotspots for specificity. Some positions, like 116, make a lot of contacts with ligands but do not show correlated mutation behaviour. This is because position 116 is a conserved position; it is important for the general function of the protein (the reaction).

Expand
titleAnswer

Position 157 scores highest when both numbers are added. Positions that both make a contact with a ligand but also show correlated mutation behavior are likely hotspots for specificity.

Panel
borderColor#C8E6C9
bgColor#C8E6C9

Note: This is how you have to think when you use 3DM. If two independent data types (here ligand contact data and correlated mutation data) show a correlation then you have probably found something meaningful. When you find a correlation between data types like the above your alarm bells should start ringing. When you use 3DM you should think like this: What is my biological question? How can I use 3DM to answer this question? To answer this you should try to answer this:

  • 1. What subset(s) should I generate (sometimes you can simply use the full database)?
  • 2. What data do I need to compare?

Some positions, like 116 make a lot of contacts with ligands but do not show correlated mutation behavior. Do you understand why this is? Position 116 is a conserved position. This position is important for the general function of the protein (the reaction) and not for the specific function.

Panel
borderColor#C8E6C9
bgColor#C8E6C9

When you find a correlation between data types like above, your alarm bells should start ringing. Always keep your biological question in mind when you are doing research, and then think about how 3DM can help you answer that question. For example:

  • Can I simply use the full database, or should I create any subsets?

  • What data do I need to compare?

In that process, try to make efficient use of you knowledge about:

  1. Conserved residues (they perform the general function of

...

  1. proteins).

...

  1. Correlated mutations (they perform the specific function of the proteins).

...

  1. Highly variable positions

...

  1. (these can often be mutated without loss of function

...

  1. and are the ones you should target if you want to change stability).

Homology modeling of OAH niger with its substrate oxaloacetate and the design of an inhibitor

Build a homology model

...

Navigate to the protein information page of G3Y473 and select the

...

MODELS tab.

...

3DM selected three structures as potential good templates.

...

Later in this course

...

you will learn how to select the best template, make the best alignments, etc., but for now

...

we will use

...

3LYEA which has the best resolution (e.g. quality).

Select 3LYEA as a template

...

, use Alignment as numbering (default) and select your desired format. You can open the

...

created file either with

...

YASARA or

...

PyMol.

...

Info

Note that generated models can always be retrieved from the Visualize pages.

In the YASARA or PyMol, select the residue with 3D number 157.

...

Panel
panelIconIdatlassian-question_mark
panelIcon:question_mark:
panelIconText:question_mark:
bgColor#E6FCFF

Question 14

: What is the residue type of 157? In yasara YASARA you can make a residue visible by right-clicking on the residue in the sequence at the bottom of Yasara YASARA and choose " Show Atoms residue" Residue

Expand
titleAnswer

SerineThe residue type of 157 is serine.

Load an inhibitor

Structures can be loaded directly in

...

YASARA from the 3DM database via

...

3DM → Structures →

...

Load structurefrom 3DM

...

. Loading structure files via the 3DM menu ensures that the structures are all superimposed

...

; co-crystallized compounds will

...

be positioned in the active site and proteins will have the 3D numbering.

Load the inhibitor of 1M1BA

...

Build oxaloacetate from this inhibitor.

Image Removed

Fig 2. Structure of oxaloacetate.

...

(select compound, unselect protein). The structure of oxaloacetate is visualised in Figure 4 and 6. We are very lucky since it is very similar to the structure of the 1M1BA inhibitor. Simply swapping the SO3 group with a CO2 group will do the job.

...

In

...

YASARA:

    To do this delete
  • Delete one oxygen

  • of
  • atom from SO3 → select it and press delete.

  • Then, right-click on the S and select

  • "swap → atom" and
  • Swap → Atom to replace it with carbon. The angles are not perfect (it needs energy minimization), but it gives a quick and dirty idea on how oxaloacetate fits in the active site.

In Pymol:

  • Load the 1M1B structure.

  • Zoom in on the ligand

  • ,
  • and find the SO3 group.

  • Ctrl + Middle click on one of the Oxygen atoms on the SO3 group. A number of extra objects appear in the object list on the right.

  • In the command line at the top, enter: remove pk1 and press enter. The oxygen atom will disappear.

  • Ctrl + Middle-click on the S atom in the group.

  • In the command line, enter: alter pk1,elem="C" , then press Enter.

  • In the command line, enter: alter pk1,name="C4" , then press Enter.

  • In the object list, click on the C that appears next to the 1M1B object. Select any of the

  • coloring
  • colouring schemes under Color... By element. The SO3 group will now be

  • colored
  • coloured the same as a CO2 group.

...

The reaction mechanism of isocitrate lyase (ICL) is known for quite a while (

...

Figure 5). In this reaction mechanism the H of the blue OH group donates an electron, makes a double bond, and splits of the COOH group.

...

iconfalse
titleQuestion 17

Panel
panelIconIdatlassian-question_mark
panelIcon:question_mark:
panelIconText:question_mark:
bgColor#E6FCFF

Question 15: Do you think OAH can use the same reaction mechanism to break down oxaloacetate?

Expand
titleAnswer

No, it has a =O instead of an OH.

Image Removed

...

Fig 3. Reaction mechanism of ICL (above) and the structure of oxaloacetate (below)

Actually, oxaloacetate in water is in equilibrium with its diol form (figure 4Figure 7).

Image Removed

Fig 4. Oxaloacetate is in equilibrium with its diol.

...

iconfalse
titleQuestion 18

...

Panel
panelIconIdatlassian-question_mark
panelIcon:question_mark:
panelIconText:question_mark:
bgColor#E6FCFF

Question 18: Do you think this diol of oxaloacetate can be converted with the same reaction mechanism as ICL?

Expand
titleAnswer

Yes, the diol form of OAH has the required OH group.

Until today OAH is the only known enzyme of this superfamily that has a substrate in a diol form.

...

The extra OH is unique to OAH.

title
Info
iconfalse
Panel
panelIconIdatlassian-question_mark
panelIcon:question_mark:
panelIconText:question_mark:
bgColor#E6FCFF

Question 19

: Where do you think the extra OH will be positioned?

title
Expand
titleAnswer

The OH unique to OAH is sticking right towards ser157.

Info
iconfalse

Panel
panelIconIdatlassian-question_mark
panelIcon:question_mark:
panelIconText:question_mark:
bgColor#E6FCFF

Question 20

: Can you think of a reason why the Ser157 is also unique to OAH?

Expand
titleAnswer

Because oxaloacetate is the only substrate that has this diol form, the Ser serine contacting this extra OH is unique to OAH.

...

Modelling the extra OH in the active site with the

...

swap

...

option does not work very well in

...

YASARA, because

...

YASARA can

...

not deal with changing the double bond of C=O to the single bond of C-OH without proper energy

...

minimisation (try to make the diol with the swap option if you like).

Image Removed

Fig 5. The result of energy minimization performed on the diol form of oxaloacetate in the OAH model.

...

In 2008, a model of OAH was generated similar to the way you did it today. With this model we were already in 2008 able to:

  1. Reveal the OAH specific serine 157

...

  1. . Figure 9 clearly shows the predicted Ser157 H-bridge with the diol of oxaloacetate.

  2. Reveal the reaction mechanism of OAH (via the diol substrate).

  3. Show the relation between oxalate production and pathogenicity of fungi.

  4. Make a very strong inhibitor of OAH (potential anti-fungal drug).

The inhibitor was designed by organic chemists that

...

realised they had to make a compound that is 100% in the diol form. This was the case with difluoro-oxaloaceate. This compound indeed proved to be a very strong inhibitor of OAH and was later

...

crystallised together with OAH of the fungus Cryphonectria Parasitica (

...

PDB file 3M0JA).

To see how well you modeled oxaloacetate

...

Image Removed

Fig 6. Picture of the model of OAH taken from the 2008 publication: Identification of fungal oxaloacetate hydrolyase within the isocitrate lyase/PEP mutase enzyme superfamily using a sequence marker-based method. This picture clearly shows the predicted Ser157 H-bridge with the diol of oxaloacetate.in the active site, load the drug of 3M0JA in your model with the 3DM option of YASARA.

...

Extra questions

Position 157 is the center of the correlated mutation network. P (proline) is the most common residue at position 157 (is that correct?). We have generated a subset of sequences that have a P at position 157 called "P157"

...

.

Panel
panelIconIdatlassian-question_mark
panelIcon:question_mark:
panelIconText:question_mark:
bgColor#E6FCFF

Question 21

: Do you think position 157 will show a high correlated mutation score in this subset?

Expand
titleAnswer

No, in . In a subset with only P on position 157 the P is, of course, 100% conserved and therefore can 't not mutate together with other positions. The correlated mutation data that 3DM calculates is a measure of how often mutations occur together between two positions. No mutations will result in a score of 0.

title
Info
iconfalse
Panel
panelIconIdatlassian-question_mark
panelIcon:question_mark:
panelIconText:question_mark:
bgColor#E6FCFF

Question 22

: Using Yasara YASARA or PymolPyMol, investigate the new P157 correlated mutation network. Can you find the role of the amino acids in this new network (what What is the function behind this network)?

#C8E6C9

domain.

image-20241024-115838.pngImage Added
Expand
titleAnswer

In

the figure,

Figure 10 you can see where the correlated mutation is found in the P157 alignment.

This plot was generated by clicking "visualize all notes" in CorNet when the P157 subset was selected.

Image Removed

From within Yasara you can also make this visualization You can generate such a plot by clicking ALL NODES on the Correlated mutations page with the P157 selected. Alternatively, from within YASARA you make this visualisation by the following functions:

  1. 3DM →

select
  1. Dataset → Select a subset → P157

  2. 3DM →

show
  1. Show superfamily data → Correlations

image-20241024-115307.pngImage Added

Now to answer the question, do the following:

  1. 3DM →

select dataset → full
  1. Dataset → Select a subset → Full dataset

  2. 3DM →

show
  1. Show superfamily data → Correlations

You can switch between the two scenes using the different tabs. Clearly, the two scenes overlap.

So the

The correlated mutations in the P157 alignment are formed during evolution in the dimerization

domain.
Panel
borderColor#C8E6C9
bgColor
Info

Note that the input alignment is a very important factor in what protein feature is behind the correlated mutation data. Many different protein features can be the evolutionary pressure resulting in correlated mutations (e.g. activity, specificity, binding to something else, enantioselectivity

, and many others

). Often

the literature can be used to find which feature this is. This is why the enrichment score was designed.
Panel
borderColor#C8E6C9
bgColor#C8E6C9

Take home message: The function underlying correlated mutations heavily depend on the input alignment. Always look for additional data (in this case protein-protein interaction data → did you find that?) that might explain a correlated mutation network, the literature can be used to find which feature this is. That is why the enrichment score was designed.

The correlated mutations in this superfamily seem to reflect positions that are important for specificity.

...

Imagine you want to change the specificity of OAH and you decide to rationally design a mutant library. Your screening method allows you to screen up to 1000 mutant clones.

...

Panel
panelIconIdatlassian-question_mark
panelIcon:question_mark:
panelIconText:question_mark:
bgColor#E6FCFF

Question 23

: How would you design your library? (Just give Give a general description of which residue positions you would choose, why you choose those and which residues you would try at those positions).

Expand
titleAnswer

This question can be answered thoroughly or just very simple. One thing is for sure, everything shows that the correlated mutations are important for specificity. Those are your first-choice hotspots. Then, pick as many positions as your screen allows, only using common residues at these hotspots (e.g. residues > 2% or so). This cut-off percentage is depending on how many hotspots you want to use. The more hotpots, the higher this percentage or your library size gets too big. So it is always a trade-off between the number of hotspots and the number of residues per hotspot. There are many things to consider when you choose hotspots and the residues at the hotspots. Each position should be considered carefully and all data at each position should be investigated.

These are some things to consider:

  • Read the articles that describe mutations at the hotspots. They might have a role that you

don't
  • do not want to target. The template sequence might also have things specific for the template that should not be touched.

  • For the same reasons, look at the structure.

Am I
  • Are you not destroying a salt-bridge unique to the template

.
  • ? If so, you probably need to mutate the other partner of the salt-bridge as well. Although

I would probably not
  • it might be better not to touch the bridge and choose another hotspot if this position is not

THE
  • the hotspot.

  • Look at which proteins have the residue that

I
  • you want to introduce. So for each position you look

for each
  • per residue what type of protein has this amino acid. When choosing the residues to which you want to mutate, you sometimes

might
  • want to exclude residues that are just above your cut-off. You might want to exclude hydrophilic residues, for instance, if those are only present in enzymes that do an exotic reaction. Try to downsize your library this way, by excluding residues per position.

  • Check all data types at each hotspot that you want to target to see if it

doesn't
  • does not have a role (e.g. dimerization) that might harm the activity.

  • Always use your brain to find the best combination of hotspots and amino acids. It is a tricky business and you will get better by practice.

  • Make the library bigger than you can screen. If correctly designed, your library will contain more than one hit and you

don't
  • do not need to find all hits. Even if it does contain just one hit, you also

don't
  • do not need to screen 98% of your library to find the hit. Screening 70% should give you enough confidence that if no improvement is detected you are better

of
  • off designing a new library.