Introduction: 3DM applied to the nuclear receptors

General

The complete course consists of three parts: this introduction, OAH and 3DM, and the advanced course 3DM applied to GPCRs.

To follow this introduction you need a 3DM account with at least a course login. Use it to log in at app3dm.bio-prodict.nl. If you don't have a 3DM account, you can request one via the "Sign-up" button under the "get 3DM" section on the site. Once you have requested an account, you can request a course login by sending an email to joosten@bio-prodict.nl.

For some exercises in this section, you'll need PyMOL or YASARA installed with the 3DM plugin. If you don't have YASARA or PyMOL, or the 3DM functionality is missing, please consult the installation instructions. Make sure you have the latest version installed before you start this exercise.

A: 3D numbers make life easy

In this section, we will explain the advantages of the number system used in 3DM. In addition, we guide you through the layout of the 3DM website and show you some handy features the website provides. Between instructions and information, there will be small exercises to learn how to navigate 3DM. 

After entering the login details on the 3DM page you will see an overview page of all the systems available. Underneath "Custom" you'll see the “Nuclear Receptors Ligand Binding Domain 2023” system. Select this system, as this is the 3DM system we will be working with. 

Fig 1: Dashboard 3DM and available systems.

At the starting page of each 3DM database, you will see the 3DM data circle. The icons in the circle are links to the most important 3DM tools: alignment, correlated mutations, phylogeny, search, visualize, and alignment statistics. These tools are also available on the left side panel.

Let’s say you found a paper reporting that the mutation T350M in the mouse constitutive androstane receptor (CAR) affects specificity. Your own research is about the human receptor, and you wonder whether a mutation at the same position in your protein would have the same effect. The first step would of course be a literature search.

 

Question: Describe (no need to do it) how you would search the literature for mutations in the human androgen receptor at the same position as T350 in the mouse constitutive androstane receptor.

This is normally very difficult to do. Because of gaps and insertions, homologous proteins are numbered differently. In other homologous sequences, T350 (or better: the residue that is structurally equivalent to T350) therefore often has a different number and can also be a different residue type. As a result, it is very difficult to know that, for instance, E322 is the structural equivalent of T350 of the mouse constitutive androstane receptor.
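The numbering problem can be made concrete with a small sketch. The Python below uses two invented aligned sequences (not real receptor sequences) with hypothetical start numbers, and treats the alignment column as a stand-in for a shared position number. It illustrates the idea only and is not how 3DM derives its 3D numbers.

```python
def column_of(aligned_seq, first_residue_number, residue_number):
    """Return the 1-based alignment column that holds the given residue number."""
    number = first_residue_number - 1
    for column, symbol in enumerate(aligned_seq, start=1):
        if symbol != "-":          # gaps do not consume a residue number
            number += 1
            if number == residue_number:
                return column
    raise ValueError("residue number not present in this sequence")


def residue_at(aligned_seq, first_residue_number, column):
    """Return (residue type, residue number) at a column, or None for a gap."""
    symbol = aligned_seq[column - 1]
    if symbol == "-":
        return None
    residues_so_far = sum(1 for s in aligned_seq[:column] if s != "-")
    return symbol, first_residue_number + residues_so_far - 1


# Invented toy alignment: sequence A has a gap where sequence B has a residue.
seq_a = "MK-TLE"   # hypothetical numbering starts at 348, so T is residue 350
seq_b = "MKAEGE"   # hypothetical numbering starts at 100

shared_column = column_of(seq_a, 348, 350)
equivalent = residue_at(seq_b, 100, shared_column)
print(shared_column, equivalent)
```

In this toy case the residue aligned with T350 in sequence B has both a different type and a different number, which is exactly why a shared numbering scheme is needed to search the literature by position.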

There are different search options available within 3DM to create a subset of sequences. Under the "Search" menu in the left sidebar, we can search with different modules such as Search by keyword, BLAST, structures, or compounds. We can save a subset of all the sequences selected by the search options. We’ll dive further into the search options in section C: Different search options and subset generation.

Question: Use the “search by keyword” option to find the mouse constitutive androstane receptor in 3DM and find out the 3D number of residue T350.
Tip: you can search on multiple keywords by clicking on the green plus button. You can get more detailed information about a protein by selecting its accession number. Further information and numbering can be found under the “protein analysis” tab.

Searching with the “Search by keyword” option on “CAR” or "Constitutive androstane receptor" will give a list of proteins. Clicking the + and adding “mouse” as a second keyword will give the correct protein. The protein has accession number O35627 and identifier NR1I3_MOUSE. Click on the protein name to go to the protein detail page. See Fig 2 for the different points to click. The conversion between WT sequence numbering and 3D numbers can be found on this page under the "Protein analysis" tab. Residue T350 has 3D number 190, as shown in Fig 3.

Fig 2: Search by keyword layout and results.

Followed by:

With 3DM it is quite easy to find articles describing structurally equivalent residues in homologous sequences. Open the alignment page by selecting this item in the menu. In the upper right corner, you will find several options to customize the visualization of the alignment. The default "Consensus alignments" view gives a quick overview of the trends that evolutionary pressure has left behind in the alignment. Click on alignment position 190 in the overall consensus and consult the “Mutations” tab. Here you see a list of mutations that were retrieved from the literature.

Below the “Mutations” tab all the mutation data is shown in a table. The filter option at the top right of this table searches only the text in the table. Entering the keyword "mouse" or "T350" returns several proteins, one of which is the mouse CAR. If you click on “PubMed” you will be redirected to the abstract in PubMed. In this example, 10 PubMed articles are connected.

 

There are many articles describing a mutation at 3D position 190. A search for CAR will reveal that there are many articles describing a mutation in the human CAR. You can see that there are 3 articles describing, for example, mutation P452L in this 3DM system.

 

Human CAR has an M at 3D position 190, with residue number 272. Without the 3D numbers synchronizing the sequences in this protein family, it would be hard to find out that M272 is the structural equivalent of T350.

There are many papers describing mutations in the human CAR and we still haven’t found a paper reporting a mutation that affects specificity. You might just read all these papers, but there is an easier way to do this using YASARA or PyMOL, as explained in the next section. 

B: Easy visualization of data in structure files

In this second section, we look into visualization options of different structures and their characteristics. We will use YASARA or PyMOL to open the data we found in 3DM and use the 3DM plugin to visualize different data types. For this part, you'll need PyMOL or YASARA installed with the 3DM plugin. If you don't have either of these, or the 3DM functionality is missing, please consult the installation instructions. Make sure you have the latest version installed before you start this exercise.

Structural biologists can confirm that the visualization of data in structures is time-consuming. 3DM offers many different ways to visualize different data types in any of the available structures. 3DM uses YASARA or PyMOL for this purpose.

  1. Go to the "Visualize" page. In the first 'Structures' section you will see the template structure 2OCFA. This structure is the system’s default template and is therefore selected by default. (You can change this by clicking the settings icon in the top right.)

  2. We won't use this structure in this demonstration. Click on the clear selection button to clear the selected structures.

  3. Click on "Add structure". A list of structures will appear below.

  4. Select two templates from the structures list: 1G2NA and 1HG4A. They should now appear under the “Structures” section at the top. These two structures are part of a list of templates that were used to build the sequence alignments of the subfamilies.

  5. You can use the "Quick filter" to quickly select between different filter options. Select the "Show all" option in the quick filter.

  6. The first structure in the list should now be 1A28A. The last column (“Compounds”) shows the compound that is associated with this structure. Use the "Select compounds" button to add the ligand compound “Progesterone”. You should now see Progesterone in the "Compounds" section at the top.

  7. Selecting "Add positions" underneath the "Positions" section lets you choose the data types that can be visualized in the selected structures (e.g. “Correlated mutations” or “Conservation”).

  8. Select the "Correlated mutations" tab and click the top selection box to select all the correlated mutations. Underneath the "Positions" section, 20 correlated positions have been added.

  9. Now go to the conservation tab and select the top selection box to select all the conserved positions. 15 conserved positions should be added underneath the "Positions" section.

If you followed the steps correctly you should see two structures (1G2NA and 1HG4A) in the "Structures" section, one Progesterone ligand in the "Compounds" section, 20 correlated and 15 conserved positions in the "Positions" section.

You can choose between the visualization programs YASARA or PyMOL with the option above the "Visualize" button. Your visualization program preference will be saved for future use.
Click the "Visualize" button to download the scene. It might take a few seconds to create your download. Save the file and open it with YASARA or PyMOL.

 

The scene will show two superimposed protein structures and one ligand. All protein structures and co-crystallized compounds are superimposed in 3DM. Therefore you can insert any of the co-crystallized compounds in any of the protein structures.

In large superfamily alignments, correlated mutations (also called co-evolution of residues) are almost always functionally related. Residues that mutate simultaneously often share a function, although that function can differ from one group of correlated residues to another. Sometimes correlated mutations are related to enzyme activity, enantioselectivity, or co-factor binding, but mostly they are related to changes in specificity. Because these residues are important for a certain function, evolutionary pressure restricts their mutation rates. If the function changes during evolution (e.g. the specificity of the enzyme changes), the residues involved in this function need to mutate to facilitate this change (e.g. the binding of a new substrate).
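To make "residues that mutate simultaneously" more tangible, the sketch below scores pairs of alignment columns with mutual information, one common way to quantify co-variation. This is an assumption chosen for illustration: the source does not say which measure 3DM uses, and the six-sequence alignment is invented.

```python
from collections import Counter
from math import log2


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns."""
    n = len(col_a)
    freq_a = Counter(col_a)
    freq_b = Counter(col_b)
    freq_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), count in freq_ab.items():
        p_joint = count / n
        p_independent = (freq_a[a] / n) * (freq_b[b] / n)
        mi += p_joint * log2(p_joint / p_independent)
    return mi


# Invented toy alignment: columns 0 and 1 always change together,
# column 2 varies independently, column 3 is fully conserved.
alignment = ["ADKL", "ADRL", "GEKL", "GERL", "ADKL", "GEKL"]
columns = list(zip(*alignment))

mi_correlated = mutual_information(columns[0], columns[1])
mi_unrelated = mutual_information(columns[0], columns[2])
mi_conserved = mutual_information(columns[0], columns[3])
print(mi_correlated, mi_unrelated, mi_conserved)
```

Note that a fully conserved column scores zero: it carries no variation to correlate with, which is why conservation and correlated mutations are treated as separate data types.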

 

 

The 3DM menu within YASARA and PyMOL contains options to visualize different data types in protein structures. We would like to point out the "Literature Hotspots" option, which is a really powerful tool.

  • Select "Literature Hotspots" in the 3DM menu and click "Specificity". You might need to log in with your 3DM login details. This selects the positions for which mutations have been reported in the literature to affect specificity.

Within YASARA it looks like this.

This will open a panel where the Literature hotspots are shown. The top 20 residue positions and their number of mutations related to specificity are displayed.

 

 

One article that contains mutation data related to specificity changes at position 183 is: "Broadened ligand responsiveness of androgen receptor mutants obtained by random amino acid substitution of H874 and mutation hot spot T877 in prostate cancer". Open the paper here. If you read the first sentence of the abstract you can see that the mutation indeed affects specificity. The title reveals that T877 is a prostate cancer hotspot. You wonder if there are mutations at other positions known to cause prostate cancer. How would you normally solve this problem? (no need to do this)

 

 

Check in YASARA or PyMOL whether position 183 makes contact with the ligand. To find out whether position 183 is also a ligand-binding hotspot, you would normally have to open all 789 available structure files and count the number of contacts this position makes with co-crystallized ligands. Within YASARA and PyMOL we can do this much more easily.

  • In the 3DM menu, select "Show super-family data" and click on "Ligand contacts".

  • A small window will pop up where we can specify the positions and the minimum number of contacts. Keep the default settings for now, and click "OK".

  • In YASARA the HUD on the right displays a list that shows the selected data from 3DM.

  • In PyMOL you have to select the structure in which the program should show the ligand contacts. Just select 1G2NA_prot. Then use the "3DM" → "show scene content details" option again to get the list of ligand contacts.
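The counting that this menu option saves you can be sketched in a few lines of Python. The input below is hypothetical: structure names and contact positions are invented, and the `min_contacts` parameter only mirrors the idea of the "minimum number of contacts" setting rather than the plugin's actual implementation.

```python
from collections import Counter

# Hypothetical input: for each structure, the set of 3D positions
# that touch a co-crystallized ligand.
contacts_per_structure = {
    "struct_1": {31, 34, 183},
    "struct_2": {34, 183, 190},
    "struct_3": {183},
}


def contact_counts(contacts):
    """Count in how many structures each position contacts a ligand."""
    counts = Counter()
    for positions in contacts.values():
        counts.update(positions)
    return counts


def hotspots(contacts, min_contacts):
    """Return positions seen in at least min_contacts structures, sorted."""
    counts = contact_counts(contacts)
    return sorted(pos for pos, n in counts.items() if n >= min_contacts)


counts = contact_counts(contacts_per_structure)
print(counts[183], hotspots(contacts_per_structure, min_contacts=2))
```

Counting per shared position number is only possible because all structures use the same 3D numbering; with native residue numbers the counts from different proteins could not be added up.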

 

Nuclear receptors can either be activated or inhibited by small molecules (3DM calls these ligands). Activating compounds are called agonists and inhibiting compounds are called antagonists. Say you would like to know where activating compounds bind, where inhibiting compounds bind, and if there is a difference. Normally this would take up quite some time, but within 3DM we can easily create subsets to visualize differences between subsets. 

C: Different search options and subset generation

There are different search options available within 3DM to create a subset of sequences. In the "Search" menu we can search with different modules such as Search by keyword, BLAST, structures, or compounds. We can save a subset of all the sequences selected by the search options. The subsets are accessible via the "Subsets" button in the top right, which displays the subset window. In this side menu it is possible to create small 3DM systems of a selected subset. All the 3DM options and scene visualizations are available for these smaller subsets. Below we explain how to generate and use the subsets.

We will use the subset options to investigate the different binding mechanisms between inhibiting (antagonist) and activating (agonist) ligands of Nuclear receptors. Therefore, we create two subsets, one subset containing proteins with agonists in the binding pocket, and one subset with proteins with antagonists in the binding pocket. The differences between the binding modes can be revealed by comparing the ligand binding positions between these subsets.

  1. In the "Search" menu select "Structures"

  2. Enter the keyword "antagonist" 

  3. Click on the "Subset" button in the right corner as visualized in the image above. This will show a yellow "Edit subset" field above the search results.

  4. Select "All proteins (211)" 

  5. Click on "Add to subset"

  6. This will result in 211 protein structures in the subset menu on the right side. It is also possible to select individual proteins manually with the selection box in front of the PDB identifier.


    Some proteins bind both antagonists and agonists, and we want to remove these proteins from the subset. We therefore search for agonists as well and remove the hits from the subset. We do this by searching for “agonist” and selecting the "match whole word" option on the right side of the search box. This makes sure we only get results that completely match the search keyword, so that antagonists do not appear in the search results.

    Follow the steps below to remove agonists from the subset with the “match whole word” option.

  7. Search for the keyword "agonist"

  8. Select the "match whole word" option on the right side of the search box. This will exclude possible antagonists from appearing in the search results.

  9. Again select "All proteins (670)" in the yellow "Edit subset" box

  10. This time select "Remove from subset"
    This will remove possible proteins containing both agonists and antagonists from the subset

  11. The subset now has 159 proteins left. Remove the "agonists" hits from the subset as well by repeating these steps.

  12. Search for the keyword "agonists" with the "Match whole word" option

  13. Again select "All proteins (343)"

  14. And select "Remove from subset"

There are 157 proteins left in the subset. Give this subset the name "course inhibitors" and select "Save and generate new subset". This option will generate a small 3DM system for this subset. The option "Save new subset" will only save the proteins in the subset window for later use without making a mini 3DM for the subset. 
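The add and remove steps above are ordinary set operations, and the sketch below shows why the "match whole word" option matters. Everything here is an assumption for illustration: the annotation texts are invented, and the `search` helper is a hypothetical stand-in, not the 3DM search implementation.

```python
def search(annotations, keyword, whole_word=False):
    """Return the IDs whose annotation text matches the keyword."""
    keyword = keyword.lower()
    hits = set()
    for structure_id, text in annotations.items():
        lowered = text.lower()
        if whole_word:
            matched = keyword in lowered.split()   # exact word only
        else:
            matched = keyword in lowered           # plain substring match
        if matched:
            hits.add(structure_id)
    return hits


# Invented annotations for five hypothetical structures.
annotations = {
    "s1": "bound to an antagonist",
    "s2": "complex with agonist",
    "s3": "agonist and antagonist bound",
    "s4": "two antagonists",
    "s5": "partial agonists",
}

# A substring search for "agonist" also hits "antagonist": every entry matches.
substring_hits = search(annotations, "agonist")

# Build the inhibitor subset: add antagonist hits, then remove agonist hits.
inhibitors = search(annotations, "antagonist")
inhibitors -= search(annotations, "agonist", whole_word=True)
inhibitors -= search(annotations, "agonists", whole_word=True)
print(sorted(substring_hits), sorted(inhibitors))
```

Without whole-word matching, the removal step would empty the subset, because the word "antagonist" contains "agonist".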

Next, we want to create a subset with the activating ligands.

  1. Select "NEW" in the top right corner of the subset menu to create a new subset.

  2. Deselect "Match whole word" 

  3. Search for the keyword "agonist"

  4. Select "All proteins (995)" in the yellow "Edit subset" field

  5. Select "Add to subset". Now 995 proteins are in the new subset. Again, we need to remove proteins that may contain both agonists and antagonists.

  6. Important: select the "Match whole word" option again.

  7. Search for the keyword "antagonist"

  8. Select "All proteins (170)" from the yellow "Edit subset" field

  9. Select "Remove from subset"
    This will leave 825 proteins in the subsets

  10. Search for the keyword "antagonists"

  11. Select "All proteins (29)" from the yellow "Edit subset" field

  12. Select "Remove from subset"
    If correct there are 808 proteins left in the new subset.

  13. Give the new subset the name "course activators"

  14. Click "Save and generate new subset" 

Please note that this trick works in this example because we have many structures. In real life, you would manually check the chains we have just removed and add each one to the correct subset, depending on whether it contains an activator or an inhibitor. For this course that is too much work, so we take this simple shortcut for now.

From the dropdown menu in the subset window on the top right of the page, choose the "course activators" subset. From now on, all data used for visualizations will only include the sequences that are part of this subset.

 

To see the effect of subsets, first go to the "Alignment statistics" page, which can be selected from the left side menu. Different data types are plotted against the 3D numbers (x-axis) in separate histograms. There are over 20 histograms; scroll down to have a look at them all. You can switch between subsets using the middle menu item at the top of 3DM, just underneath the name of the system. In this selection menu you can see the "course activators" and "course inhibitors" subsets. The data in the plots differs depending on the selected subset.

Select the "Custom plots" option underneath the alignment statistics. Here you can compare your own subsets on different data types, e.g. on correlated mutations. In the Subsets box on the left, select "course activators" as well as "course inhibitors"; in the Datatype box on the right, select "ligand contacts" and click the "Generate" button.

A histogram is created that displays the ligand contacts for both subsets. This makes it possible to easily compare the ligand contact points of activators and inhibitors.
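When reading such a comparison, keep in mind that the two subsets differ considerably in size, so it helps to think in fractions rather than raw counts. The sketch below ranks positions by the difference in contact fraction between two subsets. All counts and position numbers are invented, and whether the 3DM plot itself is normalized is not stated in this course.

```python
def contact_fraction(counts, n_structures):
    """Convert raw contact counts per position into fractions of the subset."""
    return {pos: count / n_structures for pos, count in counts.items()}


def largest_differences(frac_a, frac_b, top=3):
    """Rank positions by absolute difference in contact fraction (a minus b)."""
    positions = set(frac_a) | set(frac_b)
    diffs = {p: frac_a.get(p, 0.0) - frac_b.get(p, 0.0) for p in positions}
    ranked = sorted(diffs.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked[:top]


# Invented counts for three hypothetical positions in two subsets of 10 structures.
activators = contact_fraction({10: 8, 20: 10, 30: 2}, n_structures=10)
inhibitors = contact_fraction({10: 4, 20: 10, 30: 9}, n_structures=10)

ranked = largest_differences(activators, inhibitors, top=2)
print(ranked)
```

A position contacted equally often in both subsets drops to the bottom of the ranking, while positions at the top are candidates for explaining the difference between activator and inhibitor binding.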

 

We can use the visualization option of 3DM as demonstrated before to get a detailed look at the contact positions within the protein structures. 

  1. Go to the "Visualize" option in the menu on the left.

  2. Select structure "1BSXA" and its compound "Liothyronine"

  3. Select "Visualize"

  4. Open the scene in YASARA or PyMOL.

You might need to switch back to the "Full dataset" in the subset menu at the top of the page to find this structure. Make sure the quick filter is set to "Show all". If something is unclear, please revisit part B: Easy visualization of data in structure files.

Locate the core 1BSXA residues with 3D numbers 31, 34, and 192, and have a look at their sidechains and contacts. Residues that are not part of a core region and are therefore located in a variable region have a "V" behind their 3D number. Make sure to select the residues of the core regions. 

  • In YASARA you can right-click on residues and select "show" → "residue" either from the structure itself or from the residue bar at the bottom of YASARA. It is also possible to color the residues via the "color" option.

  • In PyMOL:

    • Click on a residue inside the structure (or inside the residue bar at the top), it will be highlighted.

    • Click on the command bar at the bottom (indicated by `PyMOL>`)

    • To show the selected residue, type: "show sticks, sele" and press Enter

    • Or, to color the selected residue type: "color red, sele" and press Enter

 

In YASARA or PyMOL we can load a structure and ligand from 3DM. Load the inhibited protein "1ERRA" into the scene by selecting "3DM" in the menu, followed by "Structures" and "Load structure from 3DM". Next, select the correct structure from the list. A second protein and ligand will be placed in the scene, superimposed on the first.

 

Load the inhibited protein “1G2NA” in the scene to better understand how the inhibition of proteins works. 

D: Designing drugs with 3DM

Use "Search" → "Structures" from the side menu to find an androgen receptor structure with its natural ligand dihydrotestosterone bound in the ligand-binding pocket.

 

We would like to visualize both this protein chain and the ligand. Go to the details page of the hit of the previous search (1I37A) by clicking on the protein ID. This link opens the "protein detail page" of 1I37A in a new tab.

You can use the visualize icon next to the PDB name and chain to go directly to the "visualize" module of 3DM, where this PDB is already selected for you. Just click on "visualize" and this structure will download. (Note that on some computers you need to close YASARA first before opening a new one.) Once you have this structure loaded in YASARA or PyMOL, use the “3DM” → “Literature Hotspots” → “Specificity” option to show the hotspots for specificity.

 

 

 

It is likely that the androgen receptor mutation T877A, which influences specificity, is an underlying cause of prostate cancer. It is also possible that this mutation allows the receptor to be activated by a different human nuclear receptor ligand that is similar to dihydrotestosterone (DHT). Progesterone is an example of such a ligand. With all this information we can formulate the following hypothesis:
→ The mutation T877A of the human androgen receptor changes the specificity of the receptor, creating the possibility to be activated by progesterone, and resulting in the overactivity of the receptor causing cancer. Let's try to find further support for this hypothesis with 3DM.

 

Within YASARA or PyMOL, use the “3DM” → "structures" option to load the ligand of structure 1A28A (progesterone) and compare this ligand with DHT. It is possible that dihydrotestosterone (DHT) is already loaded in YASARA. If this is not the case load the ligand of 1I37A. Change the ligands to stick visualization for an easier comparison of the two ligands.

 

 

 

 

  • In YASARA you can make the mutation by right-clicking on the threonine and choosing “swap” → “residue” → “alanine”.

  • In PyMOL:

    • In the main PyMOL menu, click "Mutagenesis"

    • Select the residue you want to mutate.

    • On the right side of the screen, click on "No mutation" and select the residue you want to swap in (alanine).

    • Click apply.

 

Finally, let's sum up our findings and formulate a conclusion on the hypothesis.

After this experiment, it seems very likely that progesterone is capable of activating the T877A androgen receptor mutant. Most drugs that are used to treat prostate cancer are general androgen receptor inhibitors but are not specific or effective enough for treating prostate cancer caused by the T877A mutation. It would therefore be helpful to have an alternative drug that specifically targets the T877A androgen receptor.

We should therefore design a compound that competes with the binding of progesterone in the T877A mutant but does not activate this mutated androgen receptor. The compound would have to be similar to progesterone (since we now know progesterone can bind to this mutant) but carry a larger group around position 183. Luckily, using simple chemistry, it is easy to add groups at the ketone group (O=C-C-R) of progesterone that is near position 183.

If this idea succeeds we have designed a drug for treating prostate cancer in patients with the T877A mutation. Wouldn't that be great? Unfortunately, we are not the first to make this type of compound for treating prostate cancer. Several progesterone derivatives have been published as androgen receptor T877A inhibitors. Nevertheless, the above process clearly shows how combining data from a 3DM system can guide the early phases of drug design.