Index

1 General
2 Introduction
3 Subsets
4 Homology modeling of OAH niger with its substrate oxaloacetate and the design of an inhibitor
- 4.1 Build a homology model
- 4.2 Load an inhibitor
5 Extra questions

General

For this exercise, you need either PyMol or YASARA installed with the 3DM plugin. If you don't have YASARA or PyMol, or you are missing the 3DM functionality, please consult the installation instructions. Before you start this exercise, make sure you have the latest version of YASARA or PyMol installed.

Log in to 3DM with your 3DM account. If you don't have a 3DM account, you can request one via the "get 3DM" tab.

After entering the login details on the login page, you land on the 3DM dashboard page. Here, you can find an overview of all available systems. In the 3DM COURSE tab, click on the Phosphoenolpyruvate Mutase / Isocitrate Lyase system. This is the 3DM system we will be working with during this course.

In case you have any questions about this course, please get in touch with our support team via support@bio-prodict.com.

Introduction

Fungi can be pathogenic to plants and animals. It is known that the secretion of oxalate by fungi is a commonly used strategy for their pathogenicity. Oxalate is toxic and can form crystals that demolish the cell wall of the host. The oxalate is produced from oxaloacetate, catalyzed by the enzyme oxaloacetate hydrolase (OAH). This is the reaction:

Figure 1. Reaction mechanism that produces oxalate.

We have generated a 3DM system for the corresponding protein family. OAH falls in the Phosphoenolpyruvate mutase/Isocitrate lyase superfamily. The OAH of Aspergillus niger is the best characterized OAH protein. This is the sequence:

>G3Y473
			MKVDTPDSASTISMTNTITITVEQDGIYEINGARQEPVVNLNMVTGASKLRKQLRETNEL
			LVCPGVYDGLSARIAINLGFKGMYMTGAGTTASRLGMADLGLAHIYDMKTNAEMIANLDP
			YGPPLIADMDTGYGGPLMVARSVQQYIQAGVAGFHIEDQIQNKRCGHLAGKRVVTMDEYL
			TRIRAAKLTKDRLRSDIVLIARTDALQQHGYDECIRRLKAARDLGADVGLLEGFTSKEMA
			RRCVQDLAPWPLLLNMVENGAGPVISVDEAREMGFRIMIFSFACITPAYMGITAALERLK
			KDGVVGLPEGMGPKKLFEVCGLMDSVRVDTEAGGDGFANGV

For each protein in the 3DM database, there is a protein information page that contains more detailed information.

Question 1: Find the protein information page of the sequence above using the search option of 3DM. What is the core identity of this protein?

On the protein information pages, you can find a couple of different tabs. Have a quick look at what you can find in each tab.

Subsets

3DM offers several ways to select a subset of sequences. Once a subset is selected, a mini 3DM can be generated for this subset. As we have demonstrated in Introduction: 3DM applied to the nuclear receptors, all 3DM functionalities, such as the correlated mutations analysis, are regenerated and can be inspected separately. The data of a subset can also be compared with the data of the full set of sequences or with other previously defined subsets.

With the search option you can create a subset that contains the proteins in this 3DM system that come from fungi known to produce oxalate. We have created such a subset and called it “oxalate producers”. You can find it under Subsets inside the 3DM System. The subset contains 33 proteins from the following species:

				Aspergillus clavatus
				Neosartorya fischeri
				Penicillium chrysogenum
				Penicillium marneffei
				Talaromyces stipitatus
				Sclerotinia sclerotiorum
				Aspergillus niger
				Sclerotium cepivorum
				Aspergillus terreus
				Aspergillus fumigatus
				Botryotinia fuckeliana

Question 2: How would you create such a subset, using the 3DM System?

Question 3: What do you think we can learn from this subset?

Navigate to the Alignment statistics page. Then, change the subset at the top of the page from Full dataset to oxalate producers and notice how the graphs change (Figure 2).

Figure 2. Change the subset to “oxalate producers”.

3DM always generates an extra histogram for each subset that shows which residues are specifically conserved in the selected subset (the histogram called Subset specific conserved residues). The highest scoring residues in our subset are around 3D position 157.

It is important to realise that these positions are not just conserved in this subset of oxalate-producing fungi; the corresponding residues are also absent from the rest of the sequences in the superfamily. In other words, these residues are specific for this subset.
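As a rough illustration of what "specific for this subset" means, the Python sketch below scores a single alignment column. It is a toy calculation under stated assumptions, not the formula 3DM uses for the Subset specific conserved residues histogram: the function names, the scoring rule (frequency in the subset minus frequency in the rest) and the example columns are all hypothetical.

```python
from collections import Counter


def residue_fraction(column, residue):
    """Fraction of sequences in an alignment column that carry `residue`."""
    return column.count(residue) / len(column) if column else 0.0


def subset_specific_score(subset_column, rest_column):
    """Toy score (an assumption, not 3DM's formula): how much more frequent
    the subset's most common residue is inside the subset than outside it."""
    top_residue, _ = Counter(subset_column).most_common(1)[0]
    score = (residue_fraction(subset_column, top_residue)
             - residue_fraction(rest_column, top_residue))
    return top_residue, score


# Hypothetical columns: one letter per sequence at a single 3D position.
subset_column = "KKKKKK"         # fully conserved inside the subset
rest_specific = "EEEEDQEE"       # K never occurs in the other sequences
rest_shared = "KKKKKKKK"         # K is conserved in the whole superfamily

print(subset_specific_score(subset_column, rest_specific))
print(subset_specific_score(subset_column, rest_shared))
```

Both toy positions are 100% conserved in the subset, but only the first one gets a high score, because its residue does not occur in the rest of the alignment.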

You can see this by comparing this plot with the Amino acid conservation plot of the new subset. In the CUSTOM PLOTS tab, select the oxalate producers subset in the left box and in the right box select Amino acid conservation and Subset specific conserved residues (Figure 3). Then, click Generate.

Figure 3. Custom plot generation.

Question 4: How many positions are 100% conserved? How many are specific for the oxalate-producing fungi?

Navigate to the Alignment page and click on the consensus sequence at position 157.

Question 5: What is the most conserved residue at this position in the oxalate producers subset?

Question 6: What is the percentage of this residue in the full alignment?

Question 7: What is the difference? Do you understand the Subset specific conserved plot from the previous question?

The data you are looking at always depends on the subset that is selected.

Go to the Correlated mutations page via the menu on the left. Make sure you have Full Dataset selected as Subset at the top of your 3DM page.

Correlated mutations calculated for a superfamily alignment often reflect positions that are important for specificity, because superfamily alignments contain enzymes with different specificities.

Question 8: Explain this concept.

Now open the CORRELATION HEATMAP tab. The heatmap shows the alignment positions whose residues mutate simultaneously (correlated mutations).

Question 9: Which position has the highest correlation?
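To get a feeling for what "mutating simultaneously" means, the following Python sketch computes the mutual information between two alignment columns. Mutual information is used here only as a simple stand-in; it is an assumption, not necessarily the method behind the 3DM heatmap, and the columns are made-up examples.

```python
from collections import Counter
from math import log2


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns.

    Each column is a string with one residue letter per sequence,
    in the same sequence order for both columns.
    """
    if len(col_a) != len(col_b):
        raise ValueError("columns must cover the same sequences")
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), count in count_ab.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((count_a[a] / n) * (count_b[b] / n)))
    return mi


# Hypothetical columns for eight sequences.
position_x = "AAAAGGGG"
position_y = "LLLLFFFF"   # changes whenever position_x changes
position_z = "LFLFLFLF"   # varies independently of position_x

print(mutual_information(position_x, position_y))  # 1.0 bit: fully correlated
print(mutual_information(position_x, position_z))  # 0.0 bits: uncorrelated
```

Note that a fully conserved column always scores zero in this sketch: a position has to vary before it can co-vary with another position.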

Go back to the CORRELATION NETWORKS tab. Enter the keyword “specificity” in the Literature & Mutations search box. This will select mutations from the literature that affect specificity, reported in any of the proteins of the superfamily.

Question 10: Which residue positions are reported to affect specificity and which one is the most published position in relation to specificity?

Go back to the Alignment statistics page. Now compare the ligand contact plot (in this case these will be enzyme inhibitors) of the full dataset with the correlation plot of the full dataset.

Question 11: If you were to add these plots together, what would be the highest scoring position? What does this mean?

When you find a correlation between data types like above, your alarm bells should start ringing. Always keep your biological question in mind when you are doing research, and then think about how 3DM can help you answer that question. For example:

Can I simply use the full database, or should I create any subsets?
What data do I need to compare?

In that process, try to make efficient use of your knowledge about:

Conserved residues (they perform the general function of proteins).
Correlated mutations (they perform the specific function of the proteins).
Highly variable positions (these can often be mutated without loss of function and are the ones you should target if you want to change stability).

Homology modeling of OAH niger with its substrate oxaloacetate and the design of an inhibitor

Build a homology model

Navigate to the protein information page of G3Y473 and select the MODELS tab. 3DM selected three structures as potentially good templates. Later in this course you will learn how to select the best template, make the best alignments, etc., but for now we will use 3LYEA, which has the best resolution (i.e. the highest quality).

Select 3LYEA as a template, use Alignment as numbering (default) and select your desired format. You can open the created file either with YASARA or PyMol.

Note that generated models can always be retrieved from the Visualize pages.

In YASARA or PyMol, select the residue with 3D number 157.

Question 14: What is the residue type of 157? In YASARA you can make a residue visible by right-clicking on the residue in the sequence at the bottom of YASARA and choosing Show → Atoms → Residue.

Load an inhibitor

Structures can be loaded directly in YASARA from the 3DM database via 3DM → Structures → Load structure from 3DM. Loading structure files via the 3DM menu ensures that the structures are all superimposed; co-crystallized compounds will be positioned in the active site and proteins will have the 3D numbering.

Load the inhibitor of 1M1BA (select compound, unselect protein). The structure of oxaloacetate is visualised in Figures 4 and 6. We are very lucky, since it is very similar to the structure of the 1M1BA inhibitor. Simply swapping the SO3 group with a CO2 group will do the job.

Figure 4. Structure of oxaloacetate.

In YASARA:

Delete one oxygen atom from SO3 → select it and press delete.
Then, right-click on the S and select Swap → Atom to replace it with carbon. The angles are not perfect (it needs energy minimization), but it gives a quick and dirty idea of how oxaloacetate fits in the active site.

In PyMol:

Load the 1M1B structure.
Zoom in on the ligand and find the SO3 group.
Ctrl + Middle-click on one of the oxygen atoms of the SO3 group. A number of extra objects appear in the object list on the right.
In the command line at the top, enter: remove pk1 and press Enter. The oxygen atom will disappear.
Ctrl + Middle-click on the S atom in the group.
In the command line, enter: alter pk1,elem="C" and press Enter.
In the command line, enter: alter pk1,name="C4" and press Enter.
In the object list, click on the C that appears next to the 1M1B object. Select any of the colouring schemes under Color... By element. The SO3 group will now be coloured the same as a CO2 group.
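The same quick-and-dirty edit can also be sketched outside a viewer, directly on PDB-format text. The Python example below is a different route from the YASARA and PyMol steps above and only a minimal sketch: the helper names, the toy ligand, its coordinates and the atom names S, O1, O2 and O3 are hypothetical, so check the real atom names of the 1M1B ligand before trying anything like this. Only the new atom name C4 is taken from the PyMol steps.

```python
def make_hetatm(serial, name, element, x, y, z):
    """Build one fixed-column PDB HETATM line for a toy ligand
    (hypothetical residue LIG, chain A, residue number 1)."""
    name_field = (" " + name).ljust(4)  # one-letter elements start in column 14
    return (
        f"HETATM{serial:>5} {name_field} LIG A   1    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{0.0:6.2f}"
        f"{'':10}{element:>2}"
    )


def swap_so3_to_co2(pdb_lines, sulfur="S", drop_oxygen="O3", new_name="C4"):
    """Delete one oxygen and relabel the sulfur as carbon.

    Works on HETATM records only and matches atoms by name, so it assumes
    that these atom names occur in a single ligand. Coordinates are kept
    as they are: bond lengths and angles are NOT corrected.
    """
    edited = []
    for line in pdb_lines:
        if line.startswith("HETATM"):
            atom_name = line[12:16].strip()      # columns 13-16: atom name
            if atom_name == drop_oxygen:
                continue                         # remove one oxygen of SO3
            if atom_name == sulfur:
                line = line.ljust(78)
                new_field = (" " + new_name).ljust(4)
                # columns 77-78 hold the element symbol, right-justified
                line = line[:12] + new_field + line[16:76] + " C" + line[78:]
        edited.append(line)
    return edited


# Toy ligand with made-up coordinates.
ligand = [
    make_hetatm(1, "C3", "C", 0.0, 0.0, 0.0),
    make_hetatm(2, "S", "S", 1.8, 0.0, 0.0),
    make_hetatm(3, "O1", "O", 2.3, 1.4, 0.0),
    make_hetatm(4, "O2", "O", 2.3, -0.7, 1.2),
    make_hetatm(5, "O3", "O", 2.3, -0.7, -1.2),
]

for line in swap_so3_to_co2(ligand):
    print(line)
```

As with the viewer-based edits, the geometry still reflects the original SO3 group, so the result needs energy minimization before it is used for anything beyond a first look.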

Figure 5. Reaction mechanism of ICL.

Figure 6. Structure of oxaloacetate.

The reaction mechanism of isocitrate lyase (ICL) has been known for quite a while (Figure 5). In this reaction mechanism the H of the blue OH group donates an electron, makes a double bond, and splits off the COOH group.

Question 15: Do you think OAH can use the same reaction mechanism to break down oxaloacetate?

Actually, oxaloacetate in water is in equilibrium with its diol form (Figure 7).

Figure 7. Oxaloacetate is in equilibrium with its diol.

Question 18: Do you think this diol of oxaloacetate can be converted with the same reaction mechanism as ICL?

To date, OAH is the only known enzyme of this superfamily that has a substrate in a diol form. The extra OH is unique to OAH.

Question 19: Where do you think the extra OH will be positioned?

Question 20: Can you think of a reason why the Ser157 is also unique to OAH?

Modelling the extra OH in the active site with the swap option does not work very well in YASARA, because YASARA cannot deal with changing the double bond of C=O to the single bond of C-OH without proper energy minimisation (try to make the diol with the swap option if you like).

Figure 8. The result of energy minimisation performed on the diol form of oxaloacetate in the OAH model.

In 2008, a model of OAH was generated in much the same way as you did today. With this model, we were already able in 2008 to:

Reveal the OAH-specific serine 157. Figure 9 clearly shows the predicted hydrogen bond between Ser157 and the diol of oxaloacetate.
Reveal the reaction mechanism of OAH (via the diol substrate).
Show the relation between oxalate production and pathogenicity of fungi.
Make a very strong inhibitor of OAH (potential anti-fungal drug).

The inhibitor was designed by organic chemists who realised they had to make a compound that is 100% in the diol form. This was the case with difluoro-oxaloacetate. This compound indeed proved to be a very strong inhibitor of OAH and was later crystallised together with the OAH of the fungus Cryphonectria parasitica (PDB file 3M0JA).

To see how well you modeled oxaloacetate in the active site, load the drug of 3M0JA in your model with the 3DM option of YASARA.

Figure 9. Picture of the model of OAH taken from the 2008 publication: Identification of fungal oxaloacetate hydrolyase within the isocitrate lyase/PEP mutase enzyme superfamily using a sequence marker-based method.

Extra questions

Position 157 is the center of the correlated mutation network. P (proline) is the most common residue at position 157. We have generated a subset of sequences that have a P at position 157 called "P157".

Question 21: Do you think position 157 will show a high correlated mutation score in this subset?

Question 22: Using YASARA or PyMol, investigate the P157 correlated mutation network. What is the function behind this network?

Note that the input alignment strongly determines which protein feature is behind the correlated mutation data. Many different protein features can be the evolutionary pressure resulting in correlated mutations (e.g. activity, specificity, binding to something else, enantioselectivity). Often, the literature can be used to find out which feature this is. That is why the enrichment score was designed.

The correlated mutations in this superfamily seem to reflect positions that are important for specificity. Imagine you want to change the specificity of OAH and you decide to rationally design a mutant library. Your screening method allows you to screen up to 1000 mutant clones.

Question 23: How would you design your library? Give a general description of which residue positions you would choose, why you choose those and which residues you would try at those positions.
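When weighing your options, it can help to check how the size of a combinatorial library compares with the screening capacity. The Python sketch below is a back-of-the-envelope calculation under the assumption that every variant is equally likely to appear in a clone; the function names and the example numbers of positions and residues are hypothetical and not a recommended design.

```python
from math import prod


def library_size(residues_per_position):
    """Number of distinct variants when each chosen position is allowed
    the given number of residue types (all combinations)."""
    return prod(residues_per_position)


def expected_coverage(n_variants, n_clones):
    """Expected fraction of distinct variants seen at least once when
    n_clones are screened, assuming all variants are equally likely."""
    return 1.0 - (1.0 - 1.0 / n_variants) ** n_clones


screening_capacity = 1000  # from the exercise text

# Hypothetical designs: (label, residue types allowed per position).
designs = [
    ("3 positions, all 20 residues", [20, 20, 20]),
    ("3 positions, 4 residues each", [4, 4, 4]),
    ("5 positions, 4 residues each", [4, 4, 4, 4, 4]),
]

for label, residues in designs:
    size = library_size(residues)
    coverage = expected_coverage(size, screening_capacity)
    print(f"{label}: {size} variants, expected coverage {coverage:.0%}")
```

The number of variants grows multiplicatively with every extra position, so restricting each position to a few residue types keeps the library within reach of the screen.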
