

Full Title
Mining the Molecular Metric Space for Drug Design

The central goal of this proposal is to improve the molecular similarity algorithm NAMS (Non-contiguous Atom Matching Structural) by adding topological enhancements to its main atom matching algorithm. The improved tool should serve two purposes: first, to give pharmacologists and molecular biologists a direct understanding of the relevant components in a family of active compounds; and second, to act as a standalone tool for the screening phases of drug development programs.

NAMS was designed to compare full molecules, thus touching, in a sense, on the graph isomorphism problem while providing an extremely reliable polynomial-time solution.
However, for a molecule to function as a drug, often only specific parts of its structure (not necessarily contiguous) are necessary. It is therefore envisaged that NAMS can be modified to allow the weighting of specific compositional and structural elements that are deemed essential for understanding a molecule's pharmacological properties.
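
As a rough illustration of what weighting structural elements could mean, the sketch below combines per-atom match scores into a single similarity using user-supplied weights. It is not the NAMS algorithm: the function name `weighted_similarity`, the assumption that each matched atom pair already has a score in [0, 1], and the use of a plain weighted mean are all hypothetical choices made for this example.

```python
def weighted_similarity(atom_match_scores, atom_weights):
    """Combine per-atom match scores into one similarity value.

    atom_match_scores: score in [0, 1] for each matched atom pair
                       (assumed to come from some atom matching step).
    atom_weights:      non-negative importance of each atom; larger values
                       amplify the parts of the molecule deemed essential.
    Both inputs are hypothetical and only serve this illustration.
    """
    if len(atom_match_scores) != len(atom_weights):
        raise ValueError("scores and weights must have the same length")
    if any(w < 0 for w in atom_weights):
        raise ValueError("weights must be non-negative")
    total_weight = sum(atom_weights)
    if total_weight <= 0:
        raise ValueError("at least one weight must be positive")
    weighted_sum = sum(s * w for s, w in zip(atom_match_scores, atom_weights))
    return weighted_sum / total_weight


# Three matched atom pairs: perfect, partial, and no match.
scores = [1.0, 0.5, 0.0]
uniform = weighted_similarity(scores, [1.0, 1.0, 1.0])
# Emphasise the first atom and ignore the third.
focused = weighted_similarity(scores, [2.0, 1.0, 0.0])
```

With uniform weights every atom counts equally; shifting weight towards atoms thought to matter for activity raises the similarity of molecules that agree on those atoms, even when the rest of the structure differs.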

In this proposal we aim to extend NAMS so that the algorithm can differentiate between parts of a molecule, allowing the discovery of the most relevant parts and their differential amplification for molecular property prediction and drug inference. Unlike other methods, NAMS does not depend on any type of chemical descriptor and can conceivably be used in any chemical property prediction problem. The current inference engine, developed by the research team, is based on kriging over the global similarity provided by NAMS. Although it delivers results on par with the best state-of-the-art QSAR algorithms while using virtually no information other than the molecular structure, we believe that a topological differentiation mechanism will be key to extending NAMS to more diverse molecules that share pharmacological characteristics. The current version of the inference engine requires no learning, as it is a kriging algorithm that takes advantage of the molecular metric space. The new version is further expected to directly assess the relevant topological characteristics of the most active known molecules and to use this intrinsic knowledge to retrieve, from large molecular databases, the compounds most likely to have the desired characteristics.
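
To make the idea of kriging over a molecular similarity concrete, here is a minimal sketch of simple kriging with a constant mean. It is not the team's inference engine. It assumes that the pairwise similarity matrix can be used directly as a covariance (which requires it to be positive semidefinite), that the mean is estimated as the average of the known activities, and that a small `nugget` is added to the diagonal for numerical stability. The names `krige` and `solve_linear_system` are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("similarity matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        s = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - s) / m[row][row]
    return x


def krige(train_sim, train_y, query_sim, nugget=1e-6):
    """Predict a property for a query molecule by simple kriging.

    train_sim: n x n similarity matrix between known molecules
               (assumed symmetric and usable as a covariance).
    train_y:   known property values for those molecules.
    query_sim: similarities between the query and each known molecule.
    nugget:    small diagonal term for numerical stability (an assumption).
    """
    n = len(train_y)
    mean = sum(train_y) / n
    k_mat = [[train_sim[i][j] + (nugget if i == j else 0.0)
              for j in range(n)] for i in range(n)]
    residuals = [y - mean for y in train_y]
    weights = solve_linear_system(k_mat, residuals)
    return mean + sum(w * s for w, s in zip(weights, query_sim))


# Toy data: three known molecules with made-up similarities and activities.
sim = [[1.0, 0.8, 0.2],
       [0.8, 1.0, 0.3],
       [0.2, 0.3, 1.0]]
activity = [5.0, 4.5, 1.0]
prediction = krige(sim, activity, [0.9, 0.85, 0.25])
```

No model is trained here: the prediction is a similarity-weighted correction to the mean, computed directly from the known compounds. A query identical to a known molecule recovers that molecule's value (up to the nugget), and a query with no similarity to any known molecule falls back to the mean.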

The research team for the current project brings together computer scientists, biochemists, pharmacologists and molecular biologists with world-leading expertise in the fields of cheminformatics, molecular modeling, blood-brain barrier permeability and cystic fibrosis.

This work will be divided into three major tasks, each critical for the advancement of the project.

  1. A thorough evaluation of the best existing QSAR (Quantitative Structure-Activity Relationship) methodologies is needed, comparing them with the current inference engine based on NAMS and kriging. This comparison is intended to be as exhaustive as possible, testing each existing method over a set of benchmarks created from data collected from publicly available databases.
  2. NAMS is to be adapted to include differential topological features when assessing chemical structural similarity. This adaptation will then be optimized and tested within a novel framework centered on Bayesian learning. The purpose is the empirical identification of the molecular topological characteristics that are relevant for each specific benchmark problem.
  3. The topologically enhanced model is to be tested on two distinct problems for which the team includes world-leading experts. The first is the development of new lead compounds for enhancing trafficking of F508del-CFTR to the plasma membrane, a problem at the center of current drug development for cystic fibrosis. The second aims to determine whether a new molecule has the potential to cross the blood-brain barrier, an issue that is critical in most drug development programs for drugs that target the central nervous system.
Funding Entity
Start Date
End Date
FFCUL (LaSIGE); FFCUL (BioISI); FARMID - Associação da Faculdade de Farmácia para a Investigação e Desenvolvimento (FARMID)
Team at LASIGE