Introduction- Quantitative structure–activity relationship models (QSAR models) are regression or classification models used in the chemical and biological sciences and engineering. Like other regression models, QSAR regression models relate a set of “predictor” variables (X) to the potency of the response variable (Y), while classification QSAR models relate the predictor variables to a categorical value of the response variable.
In QSAR modeling, the predictors consist of physico-chemical properties or theoretical molecular descriptors of chemicals; the QSAR response-variable could be a biological activity of the chemicals. QSAR models first summarize a supposed relationship between chemical structures and biological activity in a data-set of chemicals. Second, QSAR models predict the activities of new chemicals.
Related terms include “Quantitative Structure–Property Relationships” (QSPR) when a chemical property is modeled as the response variable. Different properties or behaviors of chemical molecules have been investigated in the field of QSPR. Some examples are “Quantitative Structure–Reactivity Relationships” (QSRRs), “Quantitative Structure–Chromatography Relationships” (QSCRs) and, “Quantitative Structure–Toxicity Relationships” (QSTRs), “Quantitative Structure–Electrochemistry Relationships” (QSERs), and “Quantitative Structure–Biodegradability Relationships” (QSBRs).
As an example, biological activity can be expressed quantitatively as the concentration of a substance required to give a certain biological response. Additionally, when physicochemical properties or structures are expressed by numbers, one can find a mathematical relationship, or quantitative structure-activity relationship, between the two. The mathematical expression, if carefully validated can then be used to predict the modeled response of other chemical structures.
A QSAR has the form of a mathematical model:
Activity = f (physiochemical properties and/or structural properties) + error
The error includes model error (bias) and observational variability, that is, the variability in observations even on a correct model.
SAR & SAR PARADOX:- The basic assumption for all molecule based hypotheses is that similar molecules have similar activities. This principle is also called Structure–Activity Relationship (SAR). The underlying problem is therefore how to define a small difference on a molecular level, since each kind of activity, e.g. reaction ability, biotransformation ability, solubility, target activity, and so on, might depend on another difference. Examples were given in the bioisosterism reviews by Patanie/LaVoie and Brown.
TYPES OF QSAR TECHNIQUES:-
1. Fragment Based (Group Contribution): Analogously, the “Partition Coefficient” a measurement of differential solubility and itself a component of QSAR predictions—can be predicted either by atomic methods (known as “XLogP” or “ALogP”) or by chemical fragment methods (known as “CLogP” and other variations). It has been shown that the logP of compound can be determined by the sum of its fragments; fragment-based methods are generally accepted as better predictors than atomic-based methods. Fragmentary values have been determined statistically, based on empirical data for known logP values. This method gives mixed results and is generally not trusted to have accuracy of more than ±0.1 units.
Group or Fragment based QSAR is also known as GQSAR. GQSAR allows flexibility to study various molecular fragments of interest in relation to the variation in biological response. The molecular fragments could be substituents at various substitution sites in congeneric set of molecules or could be on the basis of pre-defined chemical rules in case of non-congeneric sets. GQSAR also considers cross-terms fragment descriptors, which could be helpful in identification of key fragment interactions in determining variation of activity. Lead discovery using Fragnomics is an emerging paradigm. In this context FB-QSAR proves to be a promising strategy for fragment library design and in fragment-to-lead identification endeavours.
An advanced approach on fragment or group-based QSAR based on the concept of pharmacophore-similarity is developed. This method, pharmacophore-similarity-based QSAR (PS-QSAR) uses topological pharmacophoric descriptors to develop QSAR models. This activity prediction may assist the contribution of certain pharmacophore features encoded by respective fragments toward activity improvement and/or detrimental effects.
2. 3D – QSAR: The acronym 3D-QSAR or 3-D QSAR refers to the application of force field calculations requiring three-dimensional structures of a given set of small molecules with known activities (training set). The training set needs to be superimposed (aligned) by either experimental data (e.g. based on ligand-protein crystallography) or molecule superimposition software. It uses computed potentials, e.g. the Lennard-Jones potential, rather than experimental constants and is concerned with the overall molecule rather than a single substituent. The first 3-D QSAR was named Comparative Molecular Field Analysis (CoMFA) by Cramer et al. It examined the steric fields (shape of the molecule) and the electrostatic fields which were correlated by means of Partial Least Squares Regression (PLS).
The created data space is then usually reduced by a following feature extraction (see also dimensionality reduction). The following learning method can be any of the already mentioned machine learning methods, e.g. support vector machines. An alternative approach uses multiple-instance learning by encoding molecules as sets of data instances, each of which represents a possible molecular conformation. A label or response is assigned to each set corresponding to the activity of the molecule, which is assumed to be determined by at least one instance in the set (i.e. some conformation of the molecule).
3. CHEMICAL DESCRIPTOR BASED: In this approach, descriptors quantifying various electronic, geometric, or steric properties of a molecule are computed and used to develop a QSAR. This approach is different from the fragment (or group contribution) approach in that the descriptors are computed for the system as whole rather than from the properties of individual fragments. This approach is different from the 3D-QSAR approach in that the descriptors are computed from scalar quantities (e.g., energies, geometric parameters) rather than from 3D fields. An example of this approach is the QSARs developed for olefin polymerization by half sandwich compounds.
APPLICATIONS OF QSAR STUDIES:-
1. Chemical Applications: One of the first historical QSAR applications was to predict boiling points. It is well known for instance that within a particular family of chemical compounds, especially of organic chemistry, that there are strong correlations between structure and observed properties. A simple example is the relationship between the number of carbons in alkanes and their boiling points. There is a clear trend in the increase of boiling point with an increase in the number carbons, and this serves as a means for predicting the boiling points of higher alkanes. A still very interesting application is the Hammett equation, Taft equation and pKa prediction methods.
2. Biological Applications: The biological activity of molecules is usually measured in assays to establish the level of inhibition of particular signal transduction or metabolic pathways. Drug discovery often involves the use of QSAR to identify chemical structures that could have good inhibitory effects on specific targets and have low toxicity (non-specific activity). Of special interest is the prediction of partition coefficient log P, which is an important measure used in identifying “drug likeness” according to Lipinski’s Rule of Five.
While many quantitative structure activity relationship analyses involve the interactions of a family of molecules with an enzyme or receptor binding site, QSAR can also be used to study the interactions between the structural domains of proteins. Protein-protein interactions can be quantitatively analyzed for structural variations resulted from site-directed mutagenesis.
Mr. Saurabh Chaturvedi, Assistant Professor, School of Pharmacy, Career Point University, Kota