
Module containing a script that builds a chemical multiverse, integrating several molecular representations to generate multiple chemical spaces and provide a deeper analysis of structure-multiple activity relationships.


IsrC11/MAYA


MAYA


Multiple ActivitY Analyzer (MAYA) automatically constructs a chemical multiverse: multiple visualizations of chemical spaces described by structural descriptors such as MACCS keys (166 bits) and ECFP4 and ECFP6 fingerprints, molecular descriptors of pharmaceutical relevance, and biological descriptors. These representations are integrated with several visualization techniques for the automated analysis of structure-multiple activity/property relationships (SMARTs).

Process

MAYA has been developed as a user-friendly, open-source tool that automates the construction of chemical spaces. It integrates diverse molecular representations to give a more comprehensive description of the structural, chemical, and functional characteristics of a set of molecules, each described by its SMILES notation and an associated activity or property. The tool supports several file formats (CSV, TSV, XLSX, JSON and XML) and requires only a few parameters describing the database in use and the desired representations. MAYA also includes options for customizing the visualizations.

The generated visualizations are interactive, which makes the displayed data easier to explore. They show a 2D view of each molecular structure along with its SMILES notation and the variability values obtained from PCA. Users can adjust the size, shape, and transparency of the data points and modify the color palette.

The script consists of a function that automatically performs:

  1. Data curation
  2. Descriptor calculation
  3. Tanimoto similarity calculation
  4. Dimensionality reduction
  5. 2D interactive visualization
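
Step 3 can be illustrated without any chemistry toolkit. The following is a minimal sketch, not MAYA's own code: each fingerprint is represented as the set of its "on" bit positions, and the toy fingerprints and the function names `tanimoto` and `similarity_matrix` are assumptions made for this example. In practice the bit sets would come from MACCS keys or ECFP fingerprints.

```python
from typing import List, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient: shared on-bits divided by the union of on-bits."""
    union = len(a | b)
    if union == 0:
        # Two empty fingerprints are treated as identical here (an assumption).
        return 1.0
    return len(a & b) / union


def similarity_matrix(fps: List[Set[int]]) -> List[List[float]]:
    """Pairwise Tanimoto similarities as a symmetric square matrix."""
    n = len(fps)
    matrix = [[1.0] * n for _ in range(n)]  # each fingerprint equals itself
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = tanimoto(fps[i], fps[j])
    return matrix


# Toy fingerprints (hypothetical bit positions, not real MACCS or ECFP output).
toy_fps = [{1, 4, 7, 9}, {1, 4, 8}, {2, 3}]
sim = similarity_matrix(toy_fps)

for row in sim:
    print([round(value, 2) for value in row])
```

In the workflow listed above, a matrix of this kind is computed before the dimensionality reduction step.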

Here you can find more detailed information about how MAYA works

How to use MAYA?

Important

It is essential to ensure that your dataset contains the following information:

  1. SMILES notation
  2. Identifier
  3. Activity or property values
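
Before running the analysis, it can help to confirm that the input file has these columns. The following is a minimal sketch using only the standard library; the function name `missing_columns` and the column names (`ID`, `SMILES`, `Target_1`, ...) are placeholders chosen for this example, not names required by MAYA.

```python
import csv
import io
from typing import Iterable, List


def missing_columns(csv_text: str, required: Iterable[str]) -> List[str]:
    """Return the required column names that are absent from the CSV header."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]


# Hypothetical dataset: identifier, SMILES and two activity columns.
example_csv = (
    "ID,SMILES,Target_1,Target_2\n"
    "mol_1,CCO,6.2,5.1\n"
    "mol_2,c1ccccc1,4.8,7.3\n"
)

missing = missing_columns(
    example_csv, ["SMILES", "ID", "Target_1", "Target_2", "Target_3"]
)
print(missing)  # ['Target_3']
```

An empty list means every required column is present in the header.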

Depending on the user's interests, it is possible to select the specific descriptors and dimensionality reduction techniques to use. Setting each variable to True or False enables or disables the corresponding calculation.

Example of usage

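# Example call: MACCS keys and molecular descriptors (MD) are disabled; ECFP, PCA and t-SNE are enabled.
# NOTE: the keyword was originally written `t-SNE`, which is not valid Python; `t_SNE` is assumed here, so check the function signature.
chemical_multiverse(dataset='/content/example.csv', smiles_column_name='SMILES', target_activities=['Target_1', 'Target_2', 'Target_3'], MACCS=False, ECFP=True, MD=False, vPCA=True, t_SNE=True)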


See this notebook for more detailed usage

Why use MAYA?

To perform an automated analysis of your database annotated with any activity, property, or score by constructing a chemical multiverse focused on a deeper understanding of multiple structure-activity relationships.

You can customize the descriptors and techniques to match the required focus by selecting which descriptors to use. You can also supply a similarity matrix for any other descriptor so that it is integrated into the generated visualizations.
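
The format MAYA expects for a user-supplied similarity matrix is not described here, so the following is only a sanity check under stated assumptions: the matrix is given as nested lists, it should be square, and it should be symmetric. The function name `check_similarity_matrix` is hypothetical.

```python
from typing import List, Sequence


def check_similarity_matrix(
    matrix: Sequence[Sequence[float]], tol: float = 1e-9
) -> List[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        # Symmetry cannot be checked on a non-square matrix, so stop here.
        return ["matrix is not square"]
    problems = []
    for i in range(n):
        for j in range(i + 1, n):
            if abs(matrix[i][j] - matrix[j][i]) > tol:
                problems.append(f"entries ({i}, {j}) and ({j}, {i}) differ")
    return problems


good = [[1.0, 0.4], [0.4, 1.0]]
bad = [[1.0, 0.4], [0.5, 1.0]]
print(check_similarity_matrix(good))  # []
print(check_similarity_matrix(bad))   # ['entries (0, 1) and (1, 0) differ']
```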

Access to well-documented code is provided, covering database curation processes, similarity calculations, and dimensionality reduction techniques.

Usage

  1. Google Colaboratory
    The easiest way to use the script is to open it in Google Colaboratory. The only requirement is a Google account.
  2. Local installation
    You can also set up your own local environment if you do not want to run the script through a Google service.

Additional Information

MAYA currently supports Python 3.10

rdkit (2022.09.05)

matplotlib (3.7.1)

pandas (2.1.4)

seaborn (0.13.1)

scikit-learn (1.3.2)
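
For a local installation, the versions listed above can be compared against what is installed. The following is a minimal sketch using only the standard library; the function name `installed_versions` is hypothetical, and the pip distribution names (in particular `scikit-learn` for the `sklearn` import) are assumptions.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional

# Distribution names as installed with pip (assumed), with the versions listed above.
LISTED = {
    "rdkit": "2022.09.05",
    "matplotlib": "3.7.1",
    "pandas": "2.1.4",
    "seaborn": "0.13.1",
    "scikit-learn": "1.3.2",
}


def installed_versions(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    found: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


report = installed_versions(LISTED)
for name, listed in LISTED.items():
    current = report[name] or "not installed"
    print(f"{name}: listed {listed}, installed {current}")
```

The script only reports versions side by side; it does not compare them automatically, because installed version strings may be normalized differently from the ones listed.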

Funding

Research contained in this package was supported by the Consejo Nacional de Humanidades, Ciencia y Tecnología (CONAHCYT) through scholarship No. CVU 1340927.
