Atlas

What is the Center for Accelerated Drug Discovery?

The American Heart Association is committed to convening the best collaborators in science, technology, computational biology, engineering and other fields to accelerate the development of lifesaving treatments for patients. We’ll accomplish our goal by using the most advanced and powerful systems at the intersection of science and technology.

Together, the AHA and Lawrence Livermore National Laboratory are creating datasets and tools that take advantage of unique national resources. For example, the team is using one of the world's top-ranked supercomputers to rapidly discover potential protein-drug target interactions in an unbiased fashion and to predict adverse drug reactions for candidate compounds. The result will be new therapies that come to fruition more quickly. By leveraging world-class technology, we can create a comprehensive, open-access reference atlas of cell-protein targets to accelerate and hone drug discovery.

What is the Protein Binding Atlas?

It’s a robust database of protein models and their associated docking interactions with a library of small molecules. Since 2017, the Center for Accelerated Drug Discovery has engaged in research resulting in:

A significant database of nearly 12,000 human protein models, instrumented for molecular screening. These models represent biologically relevant protein assemblies, so about 9,000 of them represent homo- and hetero-dimers and larger constructs.
High-performance artificial intelligence/machine learning software optimized for high-performance computing platforms, enabling modeling of small-molecule binding to proteins.
A library of about 2 million small molecules available for binding calculations; currently docking/binding calculations against the entire protein database have been completed for approximately 500,000 molecules.
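To give a sense of scale, the rounded figures in the list above can be combined in a short calculation. This is only a back-of-the-envelope sketch: the constants are the approximate numbers quoted here, and the function name is made up for illustration.

```python
# Rounded figures quoted in the list above; all are approximate.
PROTEIN_MODELS = 12_000        # "nearly 12,000" human protein models
LIBRARY_MOLECULES = 2_000_000  # "about 2 million" small molecules
DOCKED_MOLECULES = 500_000     # "approximately 500,000" docked so far


def docking_scale(proteins, docked, library):
    """Return (protein-molecule pairs evaluated, fraction of library docked)."""
    pairs = proteins * docked
    fraction = docked / library
    return pairs, fraction


pairs, fraction = docking_scale(PROTEIN_MODELS, DOCKED_MOLECULES, LIBRARY_MOLECULES)
print(f"Protein-molecule pairs evaluated: about {pairs:,}")
print(f"Share of the library docked so far: about {fraction:.0%}")
```

With these rounded inputs, the completed calculations cover roughly 6 billion protein-molecule pairs, or about a quarter of the library.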

What data is currently in the Protein Binding Atlas?

As mentioned above, the database holds nearly 12,000 human protein models and 2 million small molecules. Data for each molecule includes a set of calculated features – “molecular descriptors,” from the simple (e.g., molecular weight) to the complex (e.g., polar surface area). The molecules were drawn from these compound sets:

ChEMBL 28: 1.9 million.
Broad approved and developmental drugs: 6,550.
Foodome: 24,144.
Marine: 867.
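As a quick consistency check, the four set sizes above can be summed and compared with the "2 million" figure. The sketch below assumes the sets do not overlap and treats the ChEMBL count as exactly 1.9 million, so the true number of unique molecules may differ.

```python
# Set sizes as listed above. The ChEMBL figure is quoted as "1.9 million",
# so it is a rounded value rather than an exact count.
COMPOUND_SETS = {
    "ChEMBL 28": 1_900_000,
    "Broad approved and developmental drugs": 6_550,
    "Foodome": 24_144,
    "Marine": 867,
}

# Assumes no molecule appears in more than one set.
total = sum(COMPOUND_SETS.values())
print(f"Combined size of the four sets: {total:,}")
```

The sum comes to a little over 1.93 million, which is consistent with a library of about 2 million molecules.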

Over time additional protein models can be added to the Protein Binding Atlas. The library of small molecules also can be augmented and docking/binding calculations can be performed against the proteins.

How does this Protein Binding Atlas differentiate itself from others?

The CADD Protein Binding Atlas is designed to pair with the computational and software tools that the Center for Accelerated Drug Discovery has created, which makes it particularly valuable. It is also curated specifically to model interactions between proteins and small-molecule compounds, which differentiates it from others in the space.

How do I access the Protein Binding Atlas?

The Protein Binding Atlas can be accessed through this link. Currently the site is open to all.

How do I search for the protein and its related binders?

General protein search:

From the protein query page, enter or select search criteria and click the Search button. Click on the protein result that meets your search criteria. See the Docked Compounds list.

UniProt ID protein search:

From the protein query page, enter the human reference UniProt ID in the Protein Search text box and click the Search button. Click on the protein result that meets your search criteria. See the Docked Compounds list.

UniProt IDs “P04035,Q9UHC9” example:

From the protein query page, enter the human reference UniProt IDs “P04035,Q9UHC9” in the Protein Search text box and click the Search button. Click on “NPC1-like intracellular cholesterol transporter 1”. See the Docked Compounds list. Note that ezetimibe is a known drug for this target.
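Before pasting a list of IDs into the Protein Search text box, it can help to check that each one looks like a UniProt accession. The sketch below uses the accession pattern published in the UniProt help pages. The function name is made up for illustration, and the check is independent of the portal, which may apply its own rules.

```python
import re

# Accession pattern as published in the UniProt help pages.
UNIPROT_ACCESSION = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def split_uniprot_ids(text):
    """Split a comma-separated string into (valid, invalid) accession lists."""
    valid, invalid = [], []
    for token in text.split(","):
        candidate = token.strip().upper()
        if not candidate:
            continue  # ignore empty entries such as a trailing comma
        if UNIPROT_ACCESSION.fullmatch(candidate):
            valid.append(candidate)
        else:
            invalid.append(candidate)
    return valid, invalid


valid, invalid = split_uniprot_ids(" P04035,Q9UHC9 ")
print("Search string:", ",".join(valid))
print("Rejected entries:", invalid)
```

Both IDs from the example pass the format check. A passing ID is only well formed; it may still be absent from the set of protein models.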

“Cholesterol Biosynthesis” Pathway example:

From the protein query page, select the “cholesterol biosynthesis” Pathway and click the Search button. Click on “7-dehydrocholesterol reductase”. See the Docked Compounds list. Note that coenzyme-I is a known ligand for this target.

What if I cannot find the protein that I am interested in?

First, try searching by the human reference UniProt ID as described in “UniProt ID protein search” under the previous question.

It’s possible your protein of interest may not be available in the set of 12,000 human protein models. Please contact us about your protein of interest.

Can the complete Protein Binding Atlas be downloaded? Can you put it on a USB drive?

The Protein Binding Atlas is a snapshot of knowledge at a point in time. It contains about 200 terabytes of data, so the complete data set will not fit on a USB drive. However, well-filtered or curated subsets of the data can be downloaded (enabled through better search tools in the data portal) and then loaded onto a USB drive or another server.

How can I download the data that I’m interested in?

From the compound, protein, or predict protein binding query and detail pages, the “Download All” button or links will provide a metadata CSV file for the search or listing results. From the protein and predict protein binding pages, the “Download Structure or PDB File” buttons will provide PDB-formatted files.

Please see Downloads on the info page.
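Once a metadata CSV file has been downloaded, it can be inspected with Python's standard csv module. The sketch below is a minimal illustration: the sample data, column names and function name are hypothetical stand-ins, so check the header row of the real file for the actual columns.

```python
import csv
import io

# Stand-in for a downloaded metadata file. The column names and values
# are hypothetical; the real file defines its own header row.
SAMPLE_CSV = """compound_id,name,molecular_weight
CMPD-0001,example compound A,250.3
CMPD-0002,example compound B,410.5
"""


def load_metadata(handle):
    """Read a metadata CSV into (column names, list of row dictionaries)."""
    reader = csv.DictReader(handle)
    rows = list(reader)
    return reader.fieldnames, rows


columns, rows = load_metadata(io.StringIO(SAMPLE_CSV))
print("Columns:", columns)
print("Number of rows:", len(rows))
```

For a real download, pass an open file instead of the in-memory sample, for example `open(path, newline="")`, where `path` is wherever you saved the file.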

Can I save the results of my search?

You can use the download options listed in the previous question to download and save the results. You can also copy the query URL to recreate the search results later.

How confident can I be about the predictions from the protein portal?

Multiple docking poses were assessed. Poses with scores that meet the cutoffs (docking: < -7.5 kcal/mol, CoherentFusion: > 7.5, GBSA: < -30) have higher confidence.
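The cutoffs above can be applied to downloaded scores with a simple filter. The sketch below assumes that a pose must meet all three cutoffs to count as higher confidence, which is one reading of the text. The Pose class, its field names and the sample values are hypothetical.

```python
from dataclasses import dataclass

# Cutoffs quoted in the answer above.
DOCKING_CUTOFF = -7.5  # kcal/mol; scores below this pass
FUSION_CUTOFF = 7.5    # CoherentFusion; scores above this pass
GBSA_CUTOFF = -30.0    # scores below this pass


@dataclass
class Pose:
    """Scores for one docking pose (hypothetical field names)."""
    label: str
    docking: float
    fusion: float
    gbsa: float


def is_higher_confidence(pose):
    """True when the pose meets all three cutoffs (an assumed reading)."""
    return (
        pose.docking < DOCKING_CUTOFF
        and pose.fusion > FUSION_CUTOFF
        and pose.gbsa < GBSA_CUTOFF
    )


# Made-up scores for illustration only.
poses = [
    Pose("pose-1", docking=-8.2, fusion=8.1, gbsa=-35.0),  # meets all three
    Pose("pose-2", docking=-8.2, fusion=6.9, gbsa=-35.0),  # fusion too low
    Pose("pose-3", docking=-7.5, fusion=8.1, gbsa=-35.0),  # docking equals cutoff
]
selected = [pose.label for pose in poses if is_higher_confidence(pose)]
print(selected)
```

The comparisons are strict, matching the "<" and ">" signs in the text, so a score exactly equal to a cutoff does not pass.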

How do I cite the source?

Please cite the AHA Protein Portal DOI: 10.11578/1969730.

Who can I contact if I have an inquiry?

Please contact us with your inquiry. We’re interested in discussing potential collaborations.

Who can I contact to share my experience using the Protein Binding Atlas portal?

Please contact us to share your experience. We’re interested in feedback that may help address your use case.