My current research interests are in computational chemistry and computer-assisted drug design, cheminformatics, bioinformatics, QSARomics, and environmental toxicity, as discussed below:

                                                                                                  

Computational Chemistry and Computer-Assisted Drug Design:

 

Identifying potential drug molecules (i.e., lead discovery and optimization) accounts for about one-third of the cost of drug development and takes a considerable amount of time. Computer-assisted drug design (CADD) techniques play a major role in lead discovery and optimization because they significantly reduce both time and cost. Pharmaceutical companies are therefore investing resources in these techniques. Quantitative structure-activity relationship (QSAR) analyses and receptor-ligand docking models are widely used CADD techniques. QSAR models are applied to analyze and elucidate relationships between the structure and activity of biologically active compounds.

 

In my research, we are studying HIV protease inhibitors. Protease is one of the key viral enzymes needed for HIV reproduction. Many drugs have been successfully developed to inhibit this enzyme. However, the virus's fast reproduction cycle and tendency to mutate necessitate the constant development of new drugs. We are developing linear and nonlinear QSAR models using statistical and machine learning techniques. The potential molecules identified using these models are then studied further by receptor-ligand docking. This research provides mechanistic insight into how a potential molecule (ligand) interacts with the receptor (protein). It also provides clues for developing candidate drug molecules with improved biological activity and pharmacokinetic profiles. Using QSAR and receptor-ligand docking, we are also investigating estrogen receptor ligands implicated in breast cancer.
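
As a minimal illustration of the linear end of this modeling spectrum, the sketch below fits a one-descriptor least-squares QSAR equation in plain Python. The descriptor values and activities are synthetic placeholders rather than measured data, and the function name fit_linear_qsar is hypothetical. Real models would use many descriptors and an external validation set.

```python
# Minimal one-descriptor linear QSAR sketch (ordinary least squares).
# All numbers below are synthetic placeholders, not experimental data.

def fit_linear_qsar(descriptor, activity):
    """Return (slope, intercept, r_squared) for activity ~ slope*descriptor + intercept."""
    n = len(descriptor)
    if n != len(activity) or n < 2:
        raise ValueError("need two or more paired observations")
    mean_x = sum(descriptor) / n
    mean_y = sum(activity) / n
    sxx = sum((x - mean_x) ** 2 for x in descriptor)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(descriptor, activity))
    if sxx == 0:
        raise ValueError("descriptor has no variance")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Coefficient of determination: 1 - residual SS / total SS.
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(descriptor, activity))
    ss_tot = sum((y - mean_y) ** 2 for y in activity)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else 1.0
    return slope, intercept, r_squared


# Hypothetical descriptor (e.g. a lipophilicity value) and activity (e.g. pIC50).
toy_descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_activity = [5.1, 5.9, 7.2, 7.8, 9.0]

slope, intercept, r2 = fit_linear_qsar(toy_descriptor, toy_activity)
predicted = slope * 3.5 + intercept  # predicted activity for an untested molecule
```

The fitted equation can then be applied to the descriptor value of an untested molecule, as in the last line. A nonlinear model would replace the straight-line form with a more flexible function of the descriptors.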

 

Cheminformatics and Bioinformatics:

 

Cheminformatics and bioinformatics involve data mining, molecular modeling (docking), QSAR, pharmacophore mapping, structure/substructure searching, and related methods for predicting biological activity and other properties from chemical structure. Lately, machine learning and engineering approaches such as artificial neural networks (ANN), support vector machines (SVM), genetic algorithms (GA), principal component analysis (PCA), decision trees, pattern recognition, shape analysis and 3D graphics are increasingly being applied to multi-modality data analysis in order to understand drug-receptor interactions.

 

One of my projects concerns the development of quality-assured (QA) databases for descriptor calculation, feature selection and model development. In another project, I am investigating new and traditional descriptors to create improved QSAR models that characterize and predict important biological responses. I am looking into topological, geometrical and chiral descriptors (with Dr. Basak, U. Minnesota, Duluth), and 3D-graphics, wavelet and shape-related descriptors (with Prof. Kuo, USC, Los Angeles, and Prof. Kumar, SDSU, San Diego). Once the descriptors have been determined and a predictive model has been built, thousands of new potential molecules that are chemically similar to those in the benchmark data set can be retrieved from large databases and evaluated for their chemical properties using the predictive model. The aim is to find a few novel molecules with potentially attractive pharmaceutical properties that can then be synthesized and tested further in the laboratory.
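
The database-scanning step can be sketched as a similarity search. The example below ranks library molecules by Tanimoto similarity to a reference fingerprint. It assumes fingerprints are represented as sets of "on" bit indices. The molecule names, the bit values and the 0.5 threshold are all invented for illustration, and the projects described above may use a different similarity measure.

```python
# Similarity-based virtual screening sketch using the Tanimoto coefficient.
# Fingerprints are modelled as sets of "on" bit indices; all values are made up.

def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient: shared on-bits divided by total distinct on-bits."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0
    return len(fp_a & fp_b) / union


def screen(reference_fp, library, threshold):
    """Return (name, similarity) pairs at or above threshold, most similar first."""
    hits = []
    for name, fp in library.items():
        score = tanimoto(reference_fp, fp)
        if score >= threshold:
            hits.append((name, score))
    return sorted(hits, key=lambda item: item[1], reverse=True)


reference = {1, 4, 7, 9, 12}      # hypothetical benchmark molecule
toy_library = {
    "mol_A": {1, 4, 7, 9, 12},    # identical to the reference
    "mol_B": {1, 4, 7, 9, 15},    # 4 shared bits out of 6 distinct
    "mol_C": {2, 3, 5},           # no shared bits
    "mol_D": {1, 4, 20, 21},      # 2 shared bits out of 7 distinct
}
hits = screen(reference, toy_library, threshold=0.5)
```

Molecules that pass the similarity cutoff would then be scored with the predictive model, so that only the most promising candidates go forward to synthesis and testing.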

 

Combinatorial QSARomics:

 

I have recently initiated a project in which we are studying the fusion/hybridization of machine learning techniques (i.e., GA and PCA for classification and feature selection, followed by ANN or SVM techniques for pattern recognition) in combination with statistical regression analysis. We are developing robust computational models for rapid and reliable prediction of the biological activity of HIV protease inhibitors. Comparing QSAR models developed using ANN/SVM with those from MLR/PLS analyses will bring out the similarities and differences between these models and provide leads for the development of new drugs active against emerging mutant viruses. The long-term objective of this research is to develop novel computational models as virtual screening tools for mining drug molecules from large databases.
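
One common way to compare model types on equal terms is leave-one-out cross-validation, summarized by q^2. The sketch below is an assumption about how such a comparison could be set up, not the project's actual protocol. It uses a one-descriptor least-squares fit in place of MLR and a 1-nearest-neighbour predictor as a deliberately simple stand-in for a nonlinear learner such as an ANN or SVM. The data are synthetic and all function names are hypothetical.

```python
# Leave-one-out cross-validation (q^2) for comparing two QSAR model types.
# The data are synthetic, and 1-nearest-neighbour is only a simple
# stand-in for a nonlinear learner such as an ANN or SVM.

def linear_predictor(xs, ys):
    """Fit y = a*x + b by least squares and return a prediction function."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    b = my - a * mx
    return lambda x: a * x + b


def nearest_neighbour_predictor(xs, ys):
    """Return a function that predicts the activity of the closest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def loo_q2(xs, ys, make_predictor):
    """q^2 = 1 - PRESS / SS_total, with each point predicted from all the others."""
    mean_y = sum(ys) / len(ys)
    press = 0.0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        predict = make_predictor(train_x, train_y)
        press += (ys[i] - predict(xs[i])) ** 2
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_total


# Synthetic data, exactly linear by construction (y = 2x + 1).
toy_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
toy_y = [1.0, 3.0, 5.0, 7.0, 9.0, 11.0]

q2_linear = loo_q2(toy_x, toy_y, linear_predictor)
q2_nn = loo_q2(toy_x, toy_y, nearest_neighbour_predictor)
```

Because the toy data are exactly linear, the linear model scores higher here. That outcome says nothing about real inhibitor data, where the comparison itself is the research question.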

 

Environmental Toxicity:

 

Computer-assisted procedures are effective in prescreening and prioritizing large numbers of compounds and in predicting their biological activity/toxicity rapidly and inexpensively. The US EPA is very interested in developing quality-assured (QA) databases for predicting the toxicity of endocrine disruptive agents (EDAs). Both synthetic EDAs (pesticides, food antioxidants, polyphenols, etc.) and natural EDAs (such as plant and mold metabolites) can interfere with the hormones in our bodies. One of my projects is on developing QA databases and predicting the activity/toxicity of these endocrine disruptive agents, which interact with the estrogen receptor (the receptor for estrogen, a hormone implicated in breast cancer in women). In this research, we plan to construct a QA database of estrogen receptor ligands, calculate descriptors and develop models using various statistical and machine learning techniques.

 

Environmental toxicants in cigarette smoke are of great concern. Cigarette Mainstream Smoke (CMS) is a complex mixture that contains more than 500 polyaromatic hydrocarbons (PAHs), a class that includes compounds identified as carcinogenic by IARC. We have been collaborating with researchers at Lorillard Tobacco Company, Greensboro, NC, to develop molecular parameters and QSAR models for predicting the carcinogenicity of these PAHs. In collaboration with Lorillard and the USEPA health facility (UNC, Chapel Hill), we are also studying carbonyls in diesel exhaust. Many carbonyls are classified as irritants, mutagens and carcinogens. Several carbonyls are listed among the 188 hazardous air pollutants that the USEPA is required to control under the 1990 Clean Air Act Amendments. Carbonyls are emitted from both mobile and stationary sources, and diesel exhaust is a major contributor of carbonyls to the air. The goal of this project is to compute the molecular parameters and develop QSAR models to help identify the toxicity of untested carbonyls.

 

My most recent project (with Prof. Partch, Clarkson Univ.) is on the identification and development of ricin toxin inhibitors. Ricin is a potent cytotoxin easily isolated from the seeds of the castor plant. Its toxic dose for humans is in the microgram/kg range, which ranks it among the most toxic substances known. This protein has been widely used in the design of therapeutic immunotoxins. Recently, governments and underground groups have used it as a poison. Thus, there is great interest in identifying and designing effective inhibitors of the ricin A chain (RTA) protein. In our research, we are calculating molecular parameters/descriptors of some potential RTA inhibitors. The molecules showing promising parameter values will be docked in the binding site of RTA to gain insight into the binding-interaction pattern. The outcome of this research can provide leads for further development of these molecules.