Multiple ActivitY Analyzer (MAYA) is designed to automatically construct a chemical multiverse: multiple visualizations of the chemical space of a compound set, each based on a different representation. Supported representations include structural descriptors such as MACCS keys (166 bits), ECFP4 and ECFP6, molecular descriptors of pharmaceutical relevance, and biological descriptors. These representations are combined with several visualization techniques for automated analysis, focusing on structure-multiple activity/property relationships (SMARTs).
MAYA is being developed as a user-friendly, open-source tool that automates the construction of chemical spaces. It integrates diverse molecular representations to give a more comprehensive description of the structural, chemical, and functional characteristics of a set of molecules, each described by its SMILES notation and one or more associated activities or properties. The tool supports several file formats (CSV, TSV, XLSX, JSON and XML) and requires only a few parameters describing the database in use and the desired representations. Additionally, MAYA includes options for customizing the visualizations.
The generated visualizations are interactive, which makes the displayed data easier to explore. They show a 2D depiction of each molecular structure together with its SMILES notation and the variability values obtained from PCA. Customization features let users adjust the size, shape, and transparency of the data points and change the color palette.
The script consists of a function that automatically performs:
- Data curation
- Descriptor calculation
- Tanimoto similarity calculation
- Dimensionality reduction
- 2D interactive visualization
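The Tanimoto similarity step in the list above can be illustrated with a minimal, standard-library-only sketch. It is not MAYA's implementation: fingerprints are represented here as plain sets of on-bit indices, and the names `tanimoto` and `similarity_matrix` are hypothetical helpers introduced only for this example.

```python
def tanimoto(fp_a: set[int], fp_b: set[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 1.0  # convention chosen for this sketch when both fingerprints are empty
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def similarity_matrix(fingerprints: list[set[int]]) -> list[list[float]]:
    """Symmetric pairwise Tanimoto matrix with 1.0 on the diagonal."""
    n = len(fingerprints)
    matrix = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = tanimoto(fingerprints[i], fingerprints[j])
    return matrix


# Toy fingerprints (made-up bit indices, not computed from real molecules)
toy_fingerprints = [{1, 2, 3, 4}, {3, 4, 5}, {7}]
for row in similarity_matrix(toy_fingerprints):
    print(row)
```

In a real workflow the bit sets would come from a fingerprint such as MACCS keys or ECFP, and the resulting matrix would be the input to the dimensionality reduction step.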
Here you can find more detailed information about how MAYA works
Important
It is essential to ensure that your dataset contains the following information:
- SMILES notation
- Identifier
- Activity or property values
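Before running the analysis, it can help to confirm that the input file really has these columns. The sketch below uses only the standard library and a small in-memory CSV; the helper `missing_columns` and the `ID` column name are assumptions made for illustration, while `SMILES` and `Target_1` follow the usage example in this README.

```python
import csv
import io


def missing_columns(csv_text: str, required: list[str]) -> list[str]:
    """Return the required column names that are absent from the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    return [name for name in required if name not in header]


# Toy dataset: an identifier, a SMILES column, and one activity column
example = "ID,SMILES,Target_1\nmol_1,CCO,5.2\nmol_2,c1ccccc1,6.8\n"

print(missing_columns(example, ["ID", "SMILES", "Target_1", "Target_2"]))
# prints ['Target_2']
```

An empty list means every required column is present; anything else names the columns to add or rename before calling the tool.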
Depending on the user's interests, it is possible to select the specific descriptors and dimensionality reduction techniques to use. Setting the corresponding variables to True or False enables or disables their calculation.
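# This is an example
# Note: 't-SNE' is not a valid Python keyword name; t_SNE is assumed here, check the notebook for the exact parameter name
chemical_multiverse(dataset='/content/example.csv', smiles_column_name='SMILES', target_activities=['Target_1', 'Target_2', 'Target_3'], MACCS=False, ECFP=True, MD=False, vPCA=True, t_SNE=True)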
See this notebook for more detailed usage
Use MAYA to perform an automated analysis of a database annotated with any activity, property, or score by constructing a chemical multiverse aimed at a deeper understanding of multiple structure-activity relationships.
You can choose which descriptors and techniques are used, depending on the required focus. You can also supply a similarity matrix computed from any descriptor of your choice, which is then integrated into the generated visualizations.
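When supplying your own similarity matrix, a quick sanity check can catch shape or symmetry problems early. The sketch below tests properties one would usually expect of a pairwise similarity matrix (square, unit diagonal, symmetric, values in [0, 1]); whether MAYA itself enforces any of them is not stated here, and `check_similarity_matrix` is a hypothetical helper.

```python
def check_similarity_matrix(matrix, n_molecules, tol=1e-9):
    """Return a list of problems found; an empty list means the matrix looks usable."""
    if len(matrix) != n_molecules or any(len(row) != n_molecules for row in matrix):
        return [f"expected a {n_molecules} x {n_molecules} matrix"]
    problems = []
    for i in range(n_molecules):
        if abs(matrix[i][i] - 1.0) > tol:
            problems.append(f"diagonal entry {i} is not 1")
        # Only the upper triangle is range-checked; symmetry covers the lower one
        for j in range(i + 1, n_molecules):
            if not 0.0 <= matrix[i][j] <= 1.0:
                problems.append(f"entry ({i}, {j}) is outside [0, 1]")
            if abs(matrix[i][j] - matrix[j][i]) > tol:
                problems.append(f"entries ({i}, {j}) and ({j}, {i}) differ")
    return problems


good = [[1.0, 0.4, 0.0], [0.4, 1.0, 0.25], [0.0, 0.25, 1.0]]
bad = [[1.0, 0.4], [0.3, 1.0]]

print(check_similarity_matrix(good, 3))  # prints []
print(check_similarity_matrix(bad, 2))   # prints ['entries (0, 1) and (1, 0) differ']
```

The row order of the matrix is assumed to match the row order of the molecules in the dataset.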
Access to well-documented code is provided, covering database curation processes, similarity calculations, and dimensionality reduction techniques.
- Google Colaboratory
The easiest way to use the script is to open it in Google Colaboratory. The only thing needed is a Google account.
- Local installation
You can also set up your own local environment if you do not want to run the script through a Google service.
MAYA currently supports Python 3.10
rdkit (2022.09.05)
matplotlib (3.7.1)
pandas (2.1.4)
seaborn (0.13.1)
scikit-learn (1.3.2)
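For a local installation, a short script can report which of the listed packages are installed and in which version. This is a sketch using only the standard library; the distribution names are assumptions (in particular, `scikit-learn` is the name under which `sklearn` is distributed), and packages installed through conda may not always be visible this way.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions as listed in this README; keys are assumed distribution names
LISTED = {
    "rdkit": "2022.09.05",
    "matplotlib": "3.7.1",
    "pandas": "2.1.4",
    "seaborn": "0.13.1",
    "scikit-learn": "1.3.2",
}


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if not found."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, installed in installed_versions(LISTED).items():
    print(f"{name}: listed {LISTED[name]}, installed {installed or 'not found'}")
```

Version strings may be normalized differently by the packaging tools, so compare them by eye instead of relying on exact string equality.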
Research contained in this package was supported by the Consejo Nacional de Humanidades, Ciencia y Tecnología (CONAHCYT) through scholarship No. CVU 1340927.