Index

1 Index
2 General
3 A: 3D numbers make life easy
4 B: Easy visualization of data in structure files
5 C: Search options and subset generation.
- 5.1 Creation of subsets
- 5.2 Analyzing subsets
6 D: Designing drugs with 3DM

General

The complete course consists of three parts:

To follow these courses, you need a Bio-Prodict account. If you do not have one yet, please start by filling out the form on the registration page. Are you eligible for an academic license? Make sure to register with your academic email address. Once you submit the registration form, you will receive a confirmation email with further instructions on how to activate your account.

For some exercises in this section, you need PyMOL or YASARA installed with the 3DM plugin. If you do not have YASARA or PyMOL, or you are missing the 3DM functionality, please consult the installation instructions: https://bioprodict.atlassian.net/wiki/spaces/DOC/pages/8323108. Make sure you have the latest version installed before you start this exercise.

If you would like to see what 3DM can do before you start this course, please have a look at https://bioprodict.atlassian.net/wiki/spaces/DOC/pages/275808265.

In case you have any questions about this course, please get in touch with our support team via support@bio-prodict.com.

A: 3D numbers make life easy

In this section, we explain the advantages of the 3D number system that is used in 3DM. In addition, we guide you through the layout of the systems and show you some handy features they provide. Some small exercises that are provided in this course will help you learn how to navigate through 3DM.

After entering the login details on the login page, you land on the 3DM dashboard page. Here, you can find an overview of all available systems. In the 3DM COURSE tab, click on the Nuclear Receptors Ligand Binding Domain 2023 system (Figure 1). This is the 3DM system we will be working with during this course.

Figure 1. Dashboard 3DM and available systems.

The icons in the 3DM data circle that is displayed on the start page of a 3DM system contain links to the most important 3DM tools: Alignment, Correlated mutations, Phylogeny, Search, Visualize, and Alignment statistics. These tools are also available on the left side panel.

Let’s say you found a paper reporting that the mutation T350M of the mouse constitutive androstane receptor (CAR) affects specificity. Your own research is about the human CAR. You are wondering whether the mutation described in the paper would have the same effect in your protein. The first step would of course be a literature search.

Question 1: How would you search the literature for mutations in the human androgen receptor at the same position as T350 in the mouse constitutive androstane receptor?

Different search options are available within 3DM to create a subset of sequences. With the Search tool (available in the menu on the left side of the page), you can search with different modules such as by keyword, BLAST, structures, or compounds. A subset of all the sequences selected by the search options can be saved for later use. Section C describes the search options and subset generation in more detail.

In Search proteins by keyword you can search on multiple keywords simultaneously by clicking on the green plus button. By selecting the accession number, you enter the protein detail page. Further information and numbering can be found under the PROTEIN ANALYSIS tab of this page.

Question 2: Use Search proteins by keyword to find the mouse CAR in 3DM. What is the 3D number of the T350 residue?

Note the overall structure of 3DM: The menu on the left is available on all 3DM pages and lists the 3DM tools. The highlighted option tells you what type of data is currently displayed.

Finding articles that describe structurally equivalent residues in homologous sequences is quite easy using 3DM. Open the alignment page via the menu on the left side. In the upper right corner you can find several options to customize the visualization of the alignment. The sequences displayed in the consensus alignment view give a quick overview of the trends that evolutionary pressures have left behind.
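The idea behind a shared numbering scheme can be illustrated with a small sketch. It is not how 3DM computes its 3D numbers (those are derived from structure-based alignments); it only shows, on an invented toy alignment, how one alignment column links residues that carry different residue numbers in different proteins. The sequence names and sequences are hypothetical.

```python
def residue_to_column(aligned_seq, first_residue_number=1):
    """Map each residue number of one aligned sequence to its 1-based alignment column."""
    mapping = {}
    residue_number = first_residue_number
    for column, char in enumerate(aligned_seq, start=1):
        if char != "-":  # gaps do not consume a residue number
            mapping[residue_number] = column
            residue_number += 1
    return mapping


def equivalent_residue(alignment, source, source_residue, target):
    """Return (amino acid, residue number) in `target` aligned with `source_residue`."""
    column = residue_to_column(alignment[source])[source_residue]
    for number, col in residue_to_column(alignment[target]).items():
        if col == column:
            return alignment[target][col - 1], number
    return None  # the target sequence has a gap at this column


# Toy alignment, invented for illustration only (not real 3DM data).
toy_alignment = {
    "protein_a": "--MKT-LLQ",
    "protein_b": "GSMRTALMQ",
}

# Residue 5 of protein_a and residue 8 of protein_b share one alignment column.
print(equivalent_residue(toy_alignment, "protein_a", 5, "protein_b"))
```

In this toy case, residue 5 of one protein lines up with residue 8 of the other; a common column number removes the need to track both numberings by hand.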

Click on alignment position 190 in the overall consensus, which is the sequence on top of the alignment (Figure 4). In the MUTATIONS tab you see a list of mutations that were retrieved from the literature.

Figure 4. Alignment overview and 3DM numbers.

Question 3: How many articles describe mutations in T350 of the mouse CAR? Tip: you can use the filter option in the MUTATIONS tab to search through all text fields in the tables.

Question 4: Still focussing on alignment position 190, can you find any papers that describe a mutation in human receptors?

Question 5: What residue type does human CAR isoform 4 have at this position, and what is its residue number? Would it have been easy to find this without 3DM?

Many papers describe mutations in the human CAR, but we still have not found a paper reporting a mutation that affects specificity. You might just read all these papers, but there is an easier way to do this using YASARA or PyMOL. How to approach this is explained in the next section.

B: Easy visualization of data in structure files

3DM offers many different ways to visualize different data types in any of the available structures using either YASARA or PyMOL. In this second section, we will look into the visualization options of different structures and their characteristics.

For this part, you need PyMOL or YASARA installed with the 3DM plugin. If you do not have either of these, or you are missing the 3DM functionality, please consult the installation instructions. Make sure you have the latest version installed before you start this exercise.

Go to the Visualize page via the menu on the left. In the first Structures section the template structure 2OCFA is already selected because it is the system’s default template. You can change this behavior by clicking the settings icon in the top right (Figure 7).
Since we will not use the template structure in this demonstration, click on the clear selection button to clear the selected structures (Figure 7).
Click on the Add structure button. A list of structures will appear below (Figure 7).

Figure 7. The visualize menu.
Select two templates from the structures list (1G2NA and 1HG4A) by clicking the icon on the left side of the table. These two structures are part of a list of templates that were used to build the sequence alignments of the subfamilies. 1G2NA and 1HG4A should now appear in the Structures section that you cleared in the previous step.
You can use the Quick filter menu to quickly change between the different filtering options. Select the Show all option to make all structures appear in the table (Figure 8).

Figure 8. Selecting structures to visualize.
The first structure in the list is now 1A28A. The Compounds column shows the compound that is associated with this structure. Use the Select compounds button to add the ligand compound Progesterone. Progesterone should now appear in the Compounds section at the top (Figure 9).

Figure 9. Selecting compound to visualize.
ADD POSITIONS in the Positions section provides you the option to select different data types to visualize in the selected structures such as correlated mutations or conservation. Click on ADD POSITIONS and select the CORRELATED MUTATIONS tab. Click the checkbox on top of the table to select all the correlated mutations. 20 correlated positions have now been added to the Positions section.
Go to the CONSERVATION tab and again click the top checkbox to select all positions in the table. 15 conserved positions are now also added to the Positions section.

If you followed the steps correctly your selection should now look like what is displayed in Figure 10.

Figure 10. Final selection to visualize.

You can choose between the visualization programs YASARA or PyMOL with the option above the visualize button. Your visualization program preference will be saved for future use.

Click the "Visualize" button to download the scene. It might take a few seconds to create your download. Save the file and open it with YASARA or PyMOL.

Note that the first time you open such a scene, you might need to manually select YASARA or PyMOL to open the file, since your computer does not know yet that .sce files or .pse files are scene files. You can tell your computer to always use YASARA to open .sce files or PyMOL to open .pse files.

In YASARA the 3DM module is usually active immediately. If not, you will need to install it. In that case, go to Help → Install → 3DM to install the module.
If you are a PyMOL user, you first need to start the plugin. To start the 3DM plugin, use Plugin → Initialize plugin system, then Plugin → Legacy plugins → 3DM. A new window should open. Use your 3DM credentials to log in. The 3DM menu should now appear at the top left of this window.

The scene will show two superimposed protein structures and one ligand. All protein structures and co-crystallized compounds are superimposed in 3DM. Therefore you can insert any of the co-crystallized compounds in any of the protein structures and easily compare any structural differences between them.

Question 6: The yellow residues are the conserved residues. Why would there be no yellow residues surrounding the ligand?

Question 7:
In YASARA you can right-click on conserved residues and select 3DM → Show amino acid distribution to find out its conservation percentage. In PyMOL you can select a residue and in the 3DM plugin you can select Show amino acid distribution. Which residue (3DM number 47, 54, or 55) is the most conserved?
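As a rough illustration of what an amino acid distribution at one alignment position means, the sketch below computes residue percentages for a single alignment column. The column is invented toy data, and ignoring gaps when computing percentages is an assumption; 3DM may treat gaps differently.

```python
from collections import Counter


def amino_acid_distribution(column):
    """Return {residue: percentage} for one alignment column, ignoring gaps."""
    residues = [aa for aa in column if aa != "-"]
    total = len(residues)
    if total == 0:
        return {}
    counts = Counter(residues)
    # most_common() orders residues from most to least frequent
    return {aa: 100.0 * n / total for aa, n in counts.most_common()}


# Toy column, invented for illustration: one character per aligned sequence.
toy_column = "RRRRRRRKK-"
distribution = amino_acid_distribution(toy_column)
most_common, percentage = next(iter(distribution.items()))
print(most_common, round(percentage, 1))
```

The percentage of the most frequent residue type is one simple way to express how conserved a position is.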

Question 8: Are the correlated mutations (purple), or the conserved residues (yellow) more likely to be involved in ligand specificity?

In large superfamily alignments, correlated mutations (co-evolution of residues) are almost always functionally related. For example, they can influence enzyme activity, enantioselectivity, or co-factor binding. However, correlated mutations are mostly related to changes in specificity. The residues at positions where these mutations occur are important for the functionality of the protein. This creates an evolutionary pressure which, in turn, results in restricted mutation rates.
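To make the notion of co-evolving positions more concrete, here is a minimal sketch that scores two alignment columns with mutual information. This is a generic, widely used measure chosen for illustration; this course does not describe the algorithm 3DM itself uses for correlated mutations, so the function below should not be read as 3DM's method. The columns are invented toy data.

```python
import math
from collections import Counter


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns, skipping gapped pairs."""
    pairs = [(a, b) for a, b in zip(col_a, col_b) if a != "-" and b != "-"]
    n = len(pairs)
    if n == 0:
        return 0.0
    count_a = Counter(a for a, _ in pairs)
    count_b = Counter(b for _, b in pairs)
    count_ab = Counter(pairs)
    mi = 0.0
    for (a, b), c in count_ab.items():
        p_ab = c / n
        mi += p_ab * math.log2(p_ab / ((count_a[a] / n) * (count_b[b] / n)))
    return mi


# Toy columns, invented for illustration: one character per aligned sequence.
col_x = "AAAAGGGG"
col_y = "LLLLFFFF"  # changes together with col_x
col_z = "LFLFLFLF"  # varies independently of col_x

print(mutual_information(col_x, col_y))
print(mutual_information(col_x, col_z))
```

Columns that always change together score high, while columns that vary independently score close to zero.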

The take-home message for protein engineers: correlated mutations can often be found surrounding the ligand/substrate pocket. When they are, they are often correlated with specificity changes and are therefore specificity hotspots. If you want to change specificity, you can make a mutant library at these positions. If your library becomes too large, select only the residues that are common in the alignment.
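The effect of restricting a library to common residues can be sketched with some simple arithmetic. The positions, residue percentages, and the 10% threshold below are all hypothetical values chosen for illustration, not data from a 3DM system.

```python
import math

ALL_AMINO_ACIDS = list("ACDEFGHIKLMNPQRSTVWY")


def library_size(residues_per_position):
    """Number of variants when every listed residue is allowed at every position."""
    return math.prod(len(residues) for residues in residues_per_position.values())


def common_residues(distribution, min_percentage):
    """Residues whose alignment frequency is at least `min_percentage`."""
    return sorted(aa for aa, pct in distribution.items() if pct >= min_percentage)


# Hypothetical amino acid distributions (percentages) at three hotspot positions.
toy_distributions = {
    "position_1": {"L": 60.0, "M": 25.0, "I": 10.0, "V": 5.0},
    "position_2": {"A": 70.0, "G": 20.0, "S": 10.0},
    "position_3": {"F": 50.0, "L": 30.0, "Y": 15.0, "W": 5.0},
}

full_library = {pos: ALL_AMINO_ACIDS for pos in toy_distributions}
focused_library = {
    pos: common_residues(dist, min_percentage=10.0)
    for pos, dist in toy_distributions.items()
}

print(library_size(full_library))     # 20 * 20 * 20 variants
print(library_size(focused_library))  # 3 * 3 * 3 variants
```

With three positions, allowing all 20 amino acids gives 8000 variants, whereas keeping only residues seen in at least 10% of the toy alignment leaves 27.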

The 3DM menu within YASARA and PyMOL contains options to visualize different data types in protein structures. The Literature Hotspots option is a really powerful tool that we would like to point out.

Select Literature Hotspots in the 3DM menu and click Specificity (Figure 11). At this point, you might need to log in with your 3DM login details. The top 20 residue positions that have been reported in the literature to affect specificity, including the number of mutations, are now highlighted in your scene. Note that a new tab is opened in YASARA for this. By clicking on the MAIN tab, you can go back to your default view.

Figure 11. Highlighting literature hotspots in YASARA.

Question 9: How many mutations have been reported in the literature to affect specificity at position 183? And how many mutations at position 190, looked at in section A?

One article that contains mutation data related to specificity changes at position 183 is: "Broadened ligand responsiveness of androgen receptor mutants obtained by random amino acid substitution of H874 and mutation hot spot T877 in prostate cancer". The abstract of this paper is available here. In the first sentence of the abstract you can read that the mutation indeed affects specificity. The title reveals that T877 is a prostate cancer hotspot.

Question 10: Imagine you wonder if there are mutations at other positions known to cause prostate cancer. Go again to the Literature hotspots option in the 3DM menu and select Custom Keywords. Use the keyword "prostate cancer". Is position 183 (877 in the sequence) indeed a hotspot for prostate cancer? Are there other positions that are linked to prostate cancer?

Note: In YASARA you can see the number of mutations that are found by 3DM directly in the heads-up display. In PyMOL you need to select 3DM → Show scene content to get the list of mutations. Make sure you have the correct scene selected in the structure window, called Literature hotspots prostate cancer.

Find position 183 in the scene to determine if it makes contact with the ligand. The most direct way to find out if position 183 is also a ligand-binding hotspot would be to open all 789 available structure files and count the number of contacts this position makes with co-crystallized ligands. Within YASARA and PyMOL we can do this much more easily.

In the 3DM menu, select Show super-family data and click on Ligand contacts. A small window will pop up where we can specify the positions and the minimum number of contacts. Keep the default settings for now, and click OK.
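Conceptually, counting ligand contacts per position over many structures could look like the sketch below. It assumes that a contact means any residue atom within 4.0 Å of any ligand atom; that cutoff, the position numbers, and the coordinates are hypothetical, and 3DM's own contact definition may differ.

```python
import math
from collections import Counter


def contacting_positions(residue_atoms, ligand_atoms, cutoff=4.0):
    """Positions with at least one atom within `cutoff` angstrom of any ligand atom."""
    hits = set()
    for position, atoms in residue_atoms.items():
        if any(math.dist(a, l) <= cutoff for a in atoms for l in ligand_atoms):
            hits.add(position)
    return hits


def contact_counts(structures, cutoff=4.0):
    """Count, per position, in how many structures it contacts the ligand."""
    counts = Counter()
    for residue_atoms, ligand_atoms in structures:
        counts.update(contacting_positions(residue_atoms, ligand_atoms, cutoff))
    return counts


def hotspots(counts, min_contacts):
    """Positions that contact a ligand in at least `min_contacts` structures."""
    return sorted(pos for pos, c in counts.items() if c >= min_contacts)


# Two toy structures: ({position: [atom coordinates]}, [ligand atom coordinates]).
toy_structures = [
    ({10: [(0.0, 0.0, 0.0)], 20: [(10.0, 0.0, 0.0)]}, [(3.0, 0.0, 0.0)]),
    ({10: [(0.0, 0.0, 0.0)], 20: [(5.0, 0.0, 0.0)]}, [(2.0, 0.0, 0.0)]),
]

counts = contact_counts(toy_structures)
print(dict(counts))
print(hotspots(counts, min_contacts=2))
```

Because all structures share one numbering, the per-position counts can simply be summed over the whole superfamily.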

Figure 12. Ligand contacts in YASARA.

In YASARA the HUD on the right displays a list that shows the selected data from 3DM. In PyMOL you first need to select the structure in which the program should show the ligand contacts. Select 1G2NA_prot, then again use the 3DM → Show scene content details option to get the list of ligand contacts.

Question 11: Is position 183 a ligand-binding hotspot?

Nuclear receptors can either be activated or inhibited by small molecules which are called ligands in 3DM. Activating compounds are called agonists and inhibiting compounds are called antagonists. Say you would like to know where activating compounds bind, where inhibiting compounds bind, and if there is a difference. Normally this would take up quite some time, but within 3DM we can easily create subsets to visualize the differences between subsets. This is described in section C of this course.

C: Search options and subset generation.

Creation of subsets

Different search options are available in 3DM to create subsets of sequences. In the Search menu you can search with different modules such as Proteins by keyword, Proteins by BLAST, Structures, or Compounds. Subsets of the sequences that are selected from the results can be created as a follow-up step. All the 3DM options and scene visualizations are available for the smaller subsets. It is even possible to create a small 3DM system of your selected subset. We will explain how this works.

To investigate the different binding mechanisms between inhibiting (antagonist) and activating (agonist) ligands of nuclear receptors, we will create two separate subsets: one subset containing proteins with agonists in the binding pocket, and one subset containing proteins with antagonists in the binding pocket. The differences between the binding modes can be revealed by comparing the ligand binding positions:

In the Search menu in 3DM select Structures.
Enter the keyword "antagonist" in the search bar and click SEARCH.
Click on the Subset button in the top-right corner of the page. A yellow Edit subset field will now appear next to the search results (Figure 13).

Figure 13. Create subset from structure search results.

Click on Create new subset.
Select all 235 proteins in the table by clicking on the upper checkbox that appeared in the table. It is possible to manually select some of the proteins by selecting the selection box in front of the PDB identifier.
Click on Add 235. The 235 selected protein structures will now be in the subset menu on the right side.
Some proteins bind both antagonists and agonists. For the purpose of this exercise, we want to remove these proteins from the subset. Search for “agonist” in the search box at the top of the page and select the Match whole word option on the right side of the search box. This will make sure that antagonists are excluded from the search results.
Again, select all proteins from the result.
This time, click the remove button to remove the proteins that can bind both agonists and antagonists from the subset.
The subset now has 181 proteins left. Give this subset the name "Course inhibitors" and click Create subset. A small 3DM system will now be generated for this subset. If you only want to save the subset without generating a 3DM system, deselect the Generate option.
Next, we want to create a subset with the activating ligands. Again, click on Create new subset in the top-right corner of the subset menu.
Note: deselect the Match whole word option.
Search for the keyword "agonist".
Select all 1,149 proteins in the table and add them to your subset.
Again we need to remove possible proteins that contain both agonists and antagonists. Select the Match whole word option again and search for the keyword "antagonist".
Select the proteins in the table and click on the remove button. This will leave 963 proteins in your new subset.
Give the new subset the name "Course activators" and click Create subset.

Please note that in this example we have many structures available and therefore this trick works. In real life, you would most likely manually create your subset by selecting proteins from search results by hand.
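The logic of the steps above (add everything matching one keyword, then remove everything matching the other keyword as a whole word) amounts to a set difference. The sketch below mimics it with regular expressions on invented descriptions; the identifiers and texts are hypothetical, and how 3DM matches keywords internally is an assumption.

```python
import re


def search(structures, keyword, whole_word=False):
    """Return the IDs whose description contains `keyword` (case-insensitive)."""
    pattern = re.escape(keyword)
    if whole_word:
        # \b prevents "agonist" from matching inside "antagonist"
        pattern = r"\b" + pattern + r"\b"
    regex = re.compile(pattern, re.IGNORECASE)
    return {sid for sid, text in structures.items() if regex.search(text)}


# Hypothetical structure descriptions, invented for illustration.
toy_structures = {
    "S1": "receptor bound to an antagonist",
    "S2": "receptor bound to an agonist",
    "S3": "agonist and antagonist complexes",
    "S4": "apo form",
}

inhibitors = search(toy_structures, "antagonist") - search(
    toy_structures, "agonist", whole_word=True
)
activators = search(toy_structures, "agonist") - search(
    toy_structures, "antagonist", whole_word=True
)

print(sorted(inhibitors))
print(sorted(activators))
```

Without whole-word matching, a search for "agonist" also hits every "antagonist" entry, which is why the option matters when removing the overlap.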

From the dropdown menu in the subset window on the top right of the page, choose the Course activators subset. Now, all data used for visualizations will only include the sequences that are part of this subset.

Analyzing subsets

To see the effects of subsets, first go to the Alignment statistics via the left side menu. Different data concerning the 3D numbers are visualized in separate histograms. Over 20 histograms are available, which you can find by scrolling down the page. Switch between subsets using the middle menu item at the top of 3DM, just underneath the name of the system (Figure 14). Using this selection menu you can select the Course activators and Course inhibitors subsets that you have just created. Note that when you switch between datasets, the data in the plots changes accordingly.

Figure 14. Select a subset to use for data analysis.

Select the CUSTOM PLOTS tab underneath the alignment statistics header. Here, you can compare your own subsets based on different data types, e.g. on correlated mutations. In the Subsets box on the left, select both "Course activators" and "Course inhibitors". In the Data types box on the right, select "ligand contacts". Then, click the "Generate" button. A histogram is now created that displays the ligand contacts for both subsets. This allows you to easily compare the ligand contact points of activators and inhibitors.

Question 12: Does this view always provide a fair comparison between both datasets?

You can use the visualization option of 3DM as explained in section B to get a detailed look at the contact positions of the protein structures:

Go to the "Visualize" option in the menu on the left.
Switch to “Full dataset” again in the Subset selection menu at the top of the page and make sure you have the Quick filter set to “Show all”.
Select structure "1BSXA" and its compound "Liothyronine".
Click VISUALIZE.
Open the scene in YASARA or PyMOL.

If something is unclear, please revisit section B.

Residues that are not part of a core region, and therefore located in a variable region, have a "V" behind their 3D number.

In the scene, locate the core 1BSXA residues with 3D numbers 31, 34, and 192. Have a look at their side chains and contacts. Make sure to select the residues of the core regions, not of the variable regions.

In YASARA you can right-click on residues and select show → residue either from the structure itself or from the residue bar at the bottom of YASARA. It is also possible to color the residues via the color option.
In PyMOL:
- Click on a residue inside the structure (or inside the residue bar at the top) to highlight it.
- Click on the command bar at the bottom (indicated by `PyMOL>`).
- To show the selected residue, type: "show sticks, sele" and press Enter.
- Or, to color the selected residue type: "color red, sele" and press Enter.
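If you prefer typing a selection over clicking residues, the same two PyMOL commands can be written with an explicit selection instead of "sele". The small helper below only builds the command strings to paste into the PyMOL command bar. It assumes that the residue identifiers in your scene match the numbers you pass in; check in your own scene whether these are 3D numbers or original PDB numbers before relying on it. The helper name is hypothetical.

```python
def pymol_commands(residue_numbers, chain=None, color="red"):
    """Build PyMOL command strings that show and color the given residues."""
    resi = "+".join(str(number) for number in residue_numbers)
    selection = f"resi {resi}"
    if chain:
        selection = f"chain {chain} and {selection}"
    return [
        f"show sticks, {selection}",
        f"color {color}, {selection}",
    ]


# Print the commands for three residues; paste them into the PyMOL command bar.
for line in pymol_commands([31, 34, 192]):
    print(line)
```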

Question 13: Are the residues in contact with each other? Do they also make contact with the ligand?

In YASARA or PyMOL you can load a structure and ligand from 3DM. Load the inhibited protein 1ERRA in the scene by selecting the 3DM option in the menubar on top of the window, followed by Structures and Load structure from 3DM. A second protein and ligand will now be placed in the scene.

Question 14: Is this second ligand contacting the colored residues 31 and 34?
How does this ligand relate to the residue in the helix at position 192?

Now, load the inhibited protein 1G2NA in the scene.

Question 15: Look at the whole helix of residue 192 in 1G2NA (this helix is called helix 12). With all the information provided until now, formulate an explanation of how inhibitors work in nuclear receptors.

D: Designing drugs with 3DM

Inside 3DM, navigate to Search → Structures from the side menu. Find an androgen receptor structure with its natural ligand dihydrotestosterone bound in the ligand-binding pocket.

Question 16: Which protein(s) did you find with these characteristics? Of which species is that result?

Try to visualize both this protein chain and the ligand. Go to the details page of the hit (1I37A) by clicking on the protein ID. This link will open the protein detail page of 1I37A in a new tab. You can use the visualize icon next to the PDB name and chain to navigate directly to the visualize module of 3DM where this PDB is already preselected. Click on VISUALIZE and the structure will download.

Note that on some computers you need to close YASARA before opening a new scene.

Once you have this structure loaded in YASARA or PyMOL, use the 3DM → Literature Hotspots → Specificity option to show the hotspots for specificity.

Question 17: Do you think it is likely that changes in specificity might be the underlying cause for the development of cancer in patients with the T877A mutation?

With 3DM you try to find correlations between different data types, and from these correlations you try to learn the underlying biological meaning.

It is likely that the T877A mutation of the androgen receptor, which influences specificity, is an underlying cause of prostate cancer. It is also likely that this mutation allows for activation by a different nuclear receptor ligand in humans that is similar to dihydrotestosterone (DHT). Progesterone is an example of such a similar ligand. With this information, we can formulate the following hypothesis:

“The mutation T877A of the human androgen receptor changes the specificity of the receptor, creating the possibility to be activated by progesterone, and resulting in the overactivity of the receptor causing cancer.”

Let's try to find further support for this hypothesis with 3DM.

Within YASARA or PyMOL, use the 3DM → Structures option to load the ligand of structure 1A28A (progesterone) and compare this ligand with DHT. It is possible that dihydrotestosterone (DHT) is already loaded in YASARA. If this is not the case, load the ligand of 1I37A. Change the ligands to stick visualization for an easier comparison of the two ligands.

Question 18: The superposition of the progesterone on the DHT ligand is not perfect, but can you see how these compounds chemically differ?

Question 19: Is this difference located close to position 183?

Question 20: Do you think the previously set hypothesis could be correct?

Question 21: If the two ligands are perfectly superimposed, you can see that the oxygen at the end of the progesterone is likely to clash with the threonine. Would it fit if we mutated the threonine to an alanine?

Finally, let's sum up our findings and formulate a conclusion on the hypothesis.

After this small experiment, it seems very likely that progesterone is capable of activating the T877A androgen receptor mutant. Most drugs that are used to treat prostate cancer are general androgen receptor inhibitors, but are not specific or effective enough for treating prostate cancer caused by the T877A mutation. It would therefore be helpful to have an alternative drug that specifically targets the T877A androgen receptor.

A compound should therefore be designed that can compete with the binding of progesterone in the T877A mutant, but does not activate this mutated androgen receptor. We would have to make a compound that is similar to progesterone (since we now know that progesterone can bind to this mutant), but that has a larger group around position 183. Luckily, using simple chemistry, it is easy to add groups at the ketone group (O=C-C-R) of progesterone that is near position 183.

If this idea succeeds, we have designed a drug for treating prostate cancer in patients with the T877A mutation. Would that not be great? Unfortunately, we are not the first to make this type of compound for treating prostate cancer. Several progesterone derivatives have been published as androgen receptor T877A inhibitors. Nonetheless, the above process clearly shows how combining data from a 3DM system can be used as guidance in the early phases of drug design.

Do you want to learn more about how to use 3DM? Please continue with the courses https://bioprodict.atlassian.net/wiki/spaces/DOC/pages/32974 and https://bioprodict.atlassian.net/wiki/spaces/DOC/pages/8257537!
