Quantitative Structure Activity Relationship.pptx

3,629 views 50 slides Apr 10, 2023

About This Presentation

This presentation is about 3D-QSAR methods and their classification.


Slide Content

Quantitative Structure Activity Relationship (QSAR): Statistical method and Product concept
Presented by: Radha Sureshrao Chafle, F. Y. M. Pharm. (Semester II) (Pharmacology)
Guided by: Dr. Mrs. Vandana S. Nikam, HOD, Associate Professor in Pharmacology

What is QSAR?
A QSAR is a mathematical relationship that describes the structural dependence of biological activities, either by physicochemical parameters, by indicator variables encoding different structural features, or by three-dimensional molecular property profiles of the compounds.

Contd.
Drugs that exert their biological effects by interacting with a specific target must have a three-dimensional structure that, in the arrangement of its functional groups and in its surface properties, is more or less complementary to a binding site.
The better the steric fit and the complementarity of the surface properties of a drug to its binding site, the higher its affinity will be and the higher its biological activity may be.

Classical QSAR analyses consider only 2D structures; their main field of application is substituent variation of a common scaffold.
3D QSAR has a much broader scope: it starts from 3D structures and correlates biological activities with 3D property fields.

History & Development of QSAR
- 1868, Crum-Brown and Fraser: published an equation f(C), i.e. the assumption that physiological activity is a function of the chemical structure C
- 1900, H. H. Meyer and C. E. Overton: lipoid theory of narcosis
- 1930s, L. Hammett: electronic sigma constants
- 1964, C. Hansch and T. Fujita: QSAR
- 1984, P. Andrews: affinity contributions of functional groups
- 1985, P. Goodford: GRID (hot spots at protein surface)
- 1988, R. Cramer: 3D QSAR
- 1992, H.-J. Bohm: LUDI interaction sites, docking, scoring
- 1997, C. Lipinski: bioavailability rule of five
- 1998, Ajay, W. P. Walters and M. A. Murcko; J. Sadowski and H. Kubinyi: drug-like character

Basic Requirements in QSAR Studies
- all analogs belong to a congeneric series
- all analogs exert the same mechanism of action
- all analogs bind in a comparable manner
- the effects of isosteric replacement can be predicted
- binding affinity is correlated to interaction energies
- biological activities are correlated to binding affinity

Molecular Properties and Their Parameters
Molecular property | Corresponding interaction | Parameters
Lipophilicity | hydrophobic interactions | log P, $R_M$
Polarizability | van der Waals interactions | MR, parachor, MV
Electron density | ionic bonds, dipole-dipole interactions, hydrogen bonds, charge transfer interactions | σ, R, F, κ, quantum chemical indices
Topology | steric hindrance, geometric fit | $E_s$, $r_v$, L, B, distances, volumes

QSAR models
Hansch model (property-property relationship)
Definition of the lipophilicity parameter π:
$\pi_X = \log P_{RX} - \log P_{RH}$
where $P_{RX}$ is the n-octanol/water partition coefficient of the substituted compound and $P_{RH}$ that of the parent compound.
Linear Hansch model:
$\log 1/C = a \log P + b\,\sigma + c\,MR + \dots + k$
Nonlinear Hansch models:
$\log 1/C = a (\log P)^2 + b \log P + c\,\sigma + \dots + k$
$\log 1/C = a\,\pi^2 + b\,\pi + c\,\sigma + \dots + k$
$\log 1/C = a \log P - b \log(\beta P + 1) + c\,\sigma + \dots + k$

Contd.
Free-Wilson model (structure-property relationship):
$\log 1/C = \sum a_i + \mu$
$a_i$ = substituent group contributions
$\mu$ = activity contribution of the reference compound
Mixed Hansch/Free-Wilson model:
$\log 1/C = a (\log P)^2 + b \log P + c\,\sigma + \dots + \sum a_i + k$
$\log 1/C = a \log P - b \log(\beta P + 1) + c\,\sigma + \dots + \sum a_i + k$
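
The Free-Wilson model above amounts to a least-squares fit on an indicator matrix. The sketch below is only an illustration under stated assumptions: the substituent pattern, the contributions in `true_a` and the reference activity `true_mu` are all hypothetical, and the activities are synthetic so that the fit can be checked.

```python
import numpy as np

# Hypothetical series on one scaffold. Each row flags which substituent is
# present (1) or absent (0). Columns: [Cl at R1, CH3 at R1, OH at R2].
# The all-zero row is the unsubstituted reference compound.
X = np.array([
    [0, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, 0, 1],
    [0, 1, 1],
], dtype=float)

# Synthetic activities built from assumed group contributions (not real data).
true_a = np.array([0.50, 0.20, -0.30])
true_mu = 4.00
log_inv_c = X @ true_a + true_mu

# A column of ones lets the intercept estimate mu, the reference activity.
design = np.column_stack([X, np.ones(len(X))])
coef, *_ = np.linalg.lstsq(design, log_inv_c, rcond=None)
a_i, mu = coef[:-1], coef[-1]

print("group contributions a_i:", np.round(a_i, 3))
print("reference activity mu:", round(float(mu), 3))
```

With real data the activities carry experimental error, so the fitted contributions are estimates and each substituent should occur in more than one compound.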

Lipophilicity
Hydrophobic interaction between a drug and a binding site at a receptor
Definition of the partition coefficient:
$P = c_{org} / c_{aq}$ (n-octanol/water system)

n-Octanol/Water as a Standard System
- membrane analogous structure
- hydrogen bond donor and acceptor
- practically insoluble in water
- no desolvation on transfer into organic phase
- very low vapor pressure
- transparent in the UV region
- large database of log P values

Additivity Principle of π Values (C. Hansch, 1964)
$\pi_X = \log P_{R-X} - \log P_{R-H}$
The lipophilicity parameter π is an additive, constitutive molecular parameter; compare the Hammett equation:
$\rho\,\sigma_X = \log K_{R-X} - \log K_{R-H}$

Non-additivity of π Values
- intramolecular hydrogen bonds
- ortho effects (phenols)
- polysubstituted aromatic compounds
- conjugation (push-pull effect)
- heterocyclic compounds
- cyclophanes

Hydrophobic Fragmental Constants f
The hydrophobic fragmental constant of a substituent or molecular fragment represents the lipophilicity contribution of that molecular fragment (R. Rekker, The Hydrophobic Fragmental Constant, Elsevier, Amsterdam 1977; R. Rekker, Eur. J. Med. Chem. 14, 479 (1979)).
$\log P = \sum a_i f_i$ (R. Rekker, 1973)
Experimental Determination of Log P Values
- Shake flask method
- Reversed phase thin layer chromatography
- High performance liquid chromatography (HPLC)

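Polarizability Parameters
Molar volume, molar refractivity, parachor:
$MV = \dfrac{MW}{d}$
$MR = \dfrac{n^2 - 1}{n^2 + 2} \cdot \dfrac{MW}{d}$
$PA = \gamma^{1/4} \cdot \dfrac{MW}{d}$
MW = molecular weight; d = density; n = refractive index; γ = surface tension
(MR is most often scaled by a factor of 0.1)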

Electronic Parameters
Hammett equation:
$\rho\sigma = \log K_{RX} - \log K_{RH}$
Calculation of pKa values:
$pK_a(\text{R-X}) = pK_a(\text{R-H}) - \rho\sigma$
Example: pKa value of 3,5-dinitro-4-methyl-benzoic acid (pKa of benzoic acid = 4.20)
experimental value = 2.97
calculated value = 4.20 - (0.71 - 0.17 + 0.71) = 2.95
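
The pKa estimate above can be reproduced in a few lines. This is a minimal sketch: the σ values are the ones used in the slide's arithmetic, ρ = 1 is assumed (the slide's calculation subtracts the σ sum directly), and the function name `predict_pka` is hypothetical.

```python
# Substituent constants taken from the slide's arithmetic
# (two nitro groups at positions 3 and 5, one methyl group at position 4).
sigma = {"3-NO2": 0.71, "4-CH3": -0.17, "5-NO2": 0.71}

def predict_pka(pka_parent, sigmas, rho=1.0):
    """Hammett estimate: pKa(R-X) = pKa(R-H) - rho * sum(sigma)."""
    return pka_parent - rho * sum(sigmas)

pka = predict_pka(4.20, sigma.values())
print(f"calculated pKa = {pka:.2f} (experimental value on the slide: 2.97)")
```

The calculation treats the σ contributions as additive, which is the assumption the slide's example relies on.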

Quantum Mechanical Descriptors
- Atom partial charges: Mulliken population analysis (orbital population); ESP charges (mapping the electrostatic potential to atom locations)
- Dipole moment: strength and orientation behavior of a molecule in an electrostatic field
- HOMO / LUMO ("frontier orbital theory"): HOMO = energy of the highest occupied molecular orbital ("nucleophilicity"); LUMO = energy of the lowest unoccupied molecular orbital ("electrophilicity")
- Superdelocalizability: estimate of the reactivity of positions in aromatic hydrocarbons

3D QSAR
3D QSAR is an extension of classical QSAR that exploits the three-dimensional properties of the ligands to predict their biological activity, using robust statistical analyses such as PLS, G/PLS and ANN.
3D QSAR uses probe-based sampling within a molecular lattice to determine three-dimensional properties of molecules and then correlates these 3D descriptors with biological activity.
Some major factors that contribute to the overall free energy of binding, such as desolvation energetics, temperature, diffusion, transport, pH and salt concentration, are difficult to handle and are therefore usually ignored.

Classification of 3D QSAR
On the basis of intermolecular bonding:
- Ligand-based 3D QSAR, e.g. CoMFA, CoMSIA, COMPASS, CoMMA, SoMFA
- Receptor-based 3D QSAR, e.g. COMBINE, AFMoC, HIFA, CoRIA
On the basis of alignment criterion:
- Alignment-dependent 3D QSAR, e.g. CoMFA, CoMSIA, GERM, COMBINE, AFMoC, HIFA, CoRIA
- Alignment-independent 3D QSAR, e.g. COMPASS, CoMMA, HQSAR, WHIM, EVA/CoSA, GRIND
On the basis of chemometric techniques used:
- Linear 3D QSAR, e.g. CoMFA, CoMSIA, AFMoC, GERM, CoMMA, SoMFA

Comparative Molecular Field Analysis (CoMFA)
In 1987, Cramer developed the predecessor of the 3D approaches, the Dynamic Lattice Oriented Molecular Modeling System (DYLOMMS), which uses PCA to extract vectors from the molecular interaction fields; these vectors are then correlated with biological activities.
CoMFA, a powerful 3D QSAR methodology, is a combination of GRID and PLS.

Protocol for CoMFA
1. Determination of the bioactive conformations of the molecules.
2. Superimposition (alignment) of the molecules, using either manual or automated methods, in a manner defined by the supposed mode of interaction with the receptor.
3. Placement of the overlaid molecules in the center of a lattice grid with a spacing of 2 Å.
4. Calculation of the steric and electrostatic fields around the molecules, with different probe groups positioned at all intersections of the lattice.
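
The lattice and field steps of the protocol can be sketched as follows. This is an illustration only, not the CoMFA implementation: the function name `comfa_like_fields`, the Lennard-Jones parameters `eps` and `r_min`, the padding, the 30 kcal/mol cut-off and the toy molecule are all assumptions chosen for the example. Only the 2 Å spacing and the steric/electrostatic split come from the slide.

```python
import numpy as np

def comfa_like_fields(coords, charges, spacing=2.0, padding=4.0, cutoff=30.0,
                      eps=0.1, r_min=3.4, probe_charge=1.0):
    """Return grid points plus steric and electrostatic probe energies (kcal/mol)."""
    # Box enclosing the molecule with some padding, sampled every `spacing` angstroms.
    lo = coords.min(axis=0) - padding
    hi = coords.max(axis=0) + padding
    axes = [np.arange(a, b + 1e-9, spacing) for a, b in zip(lo, hi)]
    grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1).reshape(-1, 3)

    # Distance from every grid point (rows) to every atom (columns).
    r = np.linalg.norm(grid[:, None, :] - coords[None, :, :], axis=-1)
    r = np.maximum(r, 1e-6)  # avoid division by zero if a grid point sits on an atom

    # Steric field: 12-6 Lennard-Jones energy summed over all atoms.
    ratio = (r_min / r) ** 6
    steric = (eps * (ratio ** 2 - 2.0 * ratio)).sum(axis=1)

    # Electrostatic field: Coulomb energy with a charged probe;
    # 332.06 converts e^2/angstrom to kcal/mol.
    electrostatic = (332.06 * probe_charge * charges / r).sum(axis=1)

    # Truncate the extreme values that occur close to the atoms.
    steric = np.minimum(steric, cutoff)
    electrostatic = np.clip(electrostatic, -cutoff, cutoff)
    return grid, steric, electrostatic

# Toy three-atom "molecule" with made-up coordinates (angstrom) and partial charges.
coords = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [2.2, 1.2, 0.0]])
charges = np.array([-0.3, 0.1, 0.2])
grid, steric, electrostatic = comfa_like_fields(coords, charges)

# One molecule contributes one long descriptor row: all steric, then all electrostatic values.
descriptor_row = np.concatenate([steric, electrostatic])
print(grid.shape, descriptor_row.shape)
```

Every aligned molecule must be evaluated on the same grid, so that column j of the descriptor table always refers to the same lattice point.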

Contd.
The PLS technique is used to correlate the interaction energy (field) values with the biological activity, so that the quantitative influence of specific chemical features of the molecules on their biological activity can be identified and extracted.
The results are expressed as correlation equations with a number of latent variable terms, each of which is a linear combination of the original independent lattice descriptors.

Steps in CoMFA:
- A set of molecules is first selected.
- All molecules have to interact with the same kind of receptor (or enzyme, ion channel, transporter) in the same manner, i.e., with identical binding sites in the same relative geometry.
- A certain subgroup of molecules is selected, which constitutes a training set to derive the CoMFA model.
- The remaining molecules are considered to be a test set, which independently proves the validity of the derived model(s).

- Atomic partial charges are calculated and (several) low-energy conformations are generated.
- A pharmacophore hypothesis is derived to orient the superposition of all individual molecules and to afford a rational and consistent alignment.
- A sufficiently large box is positioned around the molecules and a grid distance is defined.
- The field values are correlated with the biological activities; PLS analysis is the most appropriate method for this purpose.
- Normally, cross-validation is used to check the internal predictivity of the derived model.
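
The cross-validation step can be illustrated with a leave-one-out loop that returns the cross-validated q². This is a sketch under assumptions: ordinary least squares stands in for PLS to keep the example short, the function name `loo_q2` is hypothetical, and the data are made up.

```python
import numpy as np

def loo_q2(X, y):
    """Leave-one-out cross-validated q^2 = 1 - PRESS / SS for a linear model."""
    n = len(y)
    design = np.column_stack([X, np.ones(n)])  # descriptors plus intercept
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i               # leave compound i out
        coef, *_ = np.linalg.lstsq(design[keep], y[keep], rcond=None)
        press += (y[i] - design[i] @ coef) ** 2  # squared error on the left-out compound
    ss = ((y - y.mean()) ** 2).sum()
    return 1.0 - press / ss

# Made-up single-descriptor data: an exact line, then the same line with small errors.
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
y_exact = 2.0 * X[:, 0] + 1.0
y_noisy = y_exact + np.array([0.3, -0.2, 0.1, -0.4, 0.2])

q2_exact = loo_q2(X, y_exact)
q2_noisy = loo_q2(X, y_noisy)
print(round(q2_exact, 3), round(q2_noisy, 3))
```

Because each compound is predicted by a model that never saw it, q² is lower than the fitted r² whenever the data contain error.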

Pharmacophore hypotheses and alignment:

Drawbacks of CoMFA:
- Too many adjustable parameters.
- Uncertainty in selection of compounds and variables.
- Fragmented contour maps with variable selection procedures.
- Hydrophobicity not well quantified.
- Cut-off limits used.
- Low signal-to-noise ratio due to many useless field variables.
- Imperfections in potential energy functions.
- Applicable only to in vitro data.

Comparative Molecular Similarity Indices Analysis (CoMSIA)
Molecular similarity indices calculated from modified SEAL similarity fields are employed as descriptors to simultaneously consider steric, electrostatic, hydrophobic and hydrogen bonding properties.
These indices are estimated indirectly by comparing the similarity of each molecule in the dataset with a common probe atom (radius 1 Å, charge +1, hydrophobicity +1) positioned at the intersections of a surrounding grid/lattice.

For computing similarity at all grid points, the mutual distances between the probe atom and the atoms of the molecules in the aligned dataset are also taken into account.
To describe this distance dependence and calculate the molecular properties, Gaussian-type functions are employed.
Since the underlying Gaussian-type functional forms are 'smooth' with no singularities, their slopes are not as steep as those of the Coulombic and Lennard-Jones potentials in CoMFA; therefore no arbitrary cut-off limits need to be defined.
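
A minimal sketch of the Gaussian distance dependence described above. The function name `similarity_index`, the attenuation factor `alpha = 0.3` and the sign convention are assumptions for illustration; the point is only that the value stays finite when a grid point coincides with an atom, which is why no cut-off is needed.

```python
import numpy as np

def similarity_index(grid_point, coords, atom_property, probe_property=1.0, alpha=0.3):
    """Gaussian-weighted similarity of a probe at one grid point to all atoms.

    Each atom contributes probe_property * atom_property * exp(-alpha * r^2),
    where r is the distance between the atom and the grid point.
    """
    r2 = ((coords - grid_point) ** 2).sum(axis=1)
    return -float(np.sum(probe_property * atom_property * np.exp(-alpha * r2)))

# Toy molecule with made-up coordinates (angstrom) and made-up atom properties.
coords = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]])
prop = np.array([1.0, 0.5])

on_atom = similarity_index(np.array([0.0, 0.0, 0.0]), coords, prop)   # grid point on atom 1
far_away = similarity_index(np.array([10.0, 0.0, 0.0]), coords, prop)  # decays smoothly to ~0
print(on_atom, far_away)
```

A 1/r or 1/r¹² term evaluated at the same on-atom grid point would diverge, which is the reason CoMFA needs cut-off limits.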

Comparison between CoMFA and CoMSIA
Feature | CoMFA | CoMSIA
Function type | Lennard-Jones potential, Coulomb potential | Gaussian
Descriptors | Interaction energies | Similarity indices
Cut-off | Required | Not required
Field | Steric, electrostatic | Steric, electrostatic, hydrophobic, hydrogen bond donor and hydrogen bond acceptor
Contour map | Often not contiguous | Contiguous
Model reproducibility | Poor | Good
*CoMSIA is provided by TRIPOS Inc. in the Sybyl software, along with CoMFA.

Statistical methods used in QSAR
Linear Regression Analysis (LRA) | Multivariate Data Analysis | Pattern Recognition
Simple linear regression | Principal component analysis (PCA) | Cluster analysis
Multiple linear regression (MLR) | Principal component regression (PCR) | Artificial neural networks (ANNs)
Stepwise multiple linear regression | Partial least square analysis (PLS) | k-nearest neighbor (kNN)
- | Genetic function approximation (GFA) | -
- | Genetic partial least squares (G/PLS) | -

Linear Regression Analysis (LRA)
Linear regression analyses are considered easily interpretable methods for QSAR analysis.
These techniques construct a statistical model that represents the correlation of one or more independent variables (x) with a dependent variable (y).
The model can then be used to predict y from known values of the x variables, which may be quantitative or qualitative.

a. Simple linear regression
A standard linear regression calculation generates a set of QSAR equations, each relating a single independent descriptor x to the dependent variable y.
A one-term linear equation is produced separately for each independent variable in the descriptor set.

b. Multiple linear regression (MLR)
- Also referred to as the linear free energy relationship (LFER) method.
- Generates QSAR equations by performing standard multivariable regression calculations to identify the dependence of a drug property on any or all of the descriptors under investigation.
- Involves more than one independent variable.
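
As an illustration of MLR, the sketch below fits a linear Hansch-type equation, log 1/C = a log P + b σ + c MR + k, by ordinary least squares. The descriptor table and the coefficients used to generate the activities are hypothetical; the data are synthetic so that the fit can be verified.

```python
import numpy as np

# Hypothetical descriptor table: columns are log P, sigma and MR (scaled by 0.1).
descriptors = np.array([
    [0.5,  0.00, 0.10],
    [1.2,  0.23, 0.60],
    [1.9, -0.17, 0.56],
    [2.3,  0.78, 0.74],
    [2.8,  0.06, 0.09],
    [3.1,  0.45, 1.03],
])

# Synthetic activities generated from assumed coefficients (not measured data).
true_coef = np.array([0.60, 1.10, -0.40])  # a, b, c
true_k = 2.50
log_inv_c = descriptors @ true_coef + true_k

# Least-squares fit; the column of ones carries the constant term k.
design = np.column_stack([descriptors, np.ones(len(descriptors))])
fit, *_ = np.linalg.lstsq(design, log_inv_c, rcond=None)
a, b, c, k = fit
print(f"log 1/C = {a:.2f} log P + {b:.2f} sigma + {c:.2f} MR + {k:.2f}")
```

With measured activities, the number of compounds should clearly exceed the number of fitted terms, otherwise the equation fits noise.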

c. Stepwise multiple linear regression
- Commonly used variant of MLR.
- Creates a multiple-term linear equation, but not all of the independent variables are used.
- Each independent variable is added to the equation sequentially, and a new regression is performed every time.
- The new term is preserved only if the model passes a test for significance.
- Useful when the number of descriptors is large and the key descriptor is unknown.
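
The stepwise procedure can be sketched as forward selection with a partial F-test. Assumptions in this example: the fixed `f_to_enter = 4.0` threshold replaces a proper look-up of the critical F value, the function names are hypothetical, and the data are synthetic (orthogonal ±1 descriptors, of which only two actually drive the activity).

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares of a least-squares fit with intercept."""
    design = np.column_stack([X, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(resid @ resid)

def forward_stepwise(X, y, f_to_enter=4.0):
    """Add descriptors one at a time; keep a term only if its partial F is large enough."""
    n, p = X.shape
    selected = []
    current_rss = float(((y - y.mean()) ** 2).sum())  # intercept-only model
    while len(selected) < p:
        best = None
        for j in range(p):
            if j in selected:
                continue
            trial_rss = rss(X[:, selected + [j]], y)
            dof = n - len(selected) - 2  # residual degrees of freedom after adding the term
            if dof <= 0:
                continue
            f = (current_rss - trial_rss) / (max(trial_rss, 1e-12) / dof)
            if best is None or f > best[0]:
                best = (f, j, trial_rss)
        if best is None or best[0] < f_to_enter:
            break  # no remaining descriptor passes the significance test
        selected.append(best[1])
        current_rss = best[2]
    return selected

# Synthetic data: four orthogonal descriptors, activity depends on descriptors 1 and 3.
H2 = np.array([[1, 1], [1, -1]])
H = np.kron(np.kron(H2, H2), H2).astype(float)
X = H[:, 1:5]
y = 3.0 + 2.0 * X[:, 1] - 1.5 * X[:, 3] + 0.1 * H[:, 7]  # last term plays the role of noise

selected = forward_stepwise(X, y)
print("selected descriptor columns:", selected)
```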

Multivariate Data Analysis
These methods replaced LRA.
They try to explain an extended set of variables by means of a reduced number of new latent variables that carry the maximum amount of information relevant to the problem.

a. Principal component analysis (PCA)
- Data reduction technique that does not generate a QSAR model.
- Creates a new set of orthogonal descriptors, i.e. principal components, which describe most of the information contained in the independent variables.
- Reduces the dimensionality of a multivariate data set of descriptors to the actual amount of data available.
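
A minimal PCA sketch using the singular value decomposition. The function name `pca` and the descriptor values are made up; the second descriptor is deliberately an exact multiple of the first to show how redundant descriptors collapse into one component.

```python
import numpy as np

def pca(X, n_components):
    """Return scores, loadings and explained-variance fractions of the leading components."""
    Xc = X - X.mean(axis=0)                       # center each descriptor
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]   # new orthogonal descriptors
    loadings = Vt[:n_components].T                    # weight of each original descriptor
    explained = (S ** 2) / (S ** 2).sum()
    return scores, loadings, explained[:n_components]

# Hypothetical descriptor table: column 2 is exactly twice column 1 (fully redundant).
X = np.array([
    [1.0,  2.0, 0.5],
    [2.0,  4.0, 0.1],
    [3.0,  6.0, 0.9],
    [4.0,  8.0, 0.3],
    [5.0, 10.0, 0.7],
])
scores, loadings, explained = pca(X, n_components=2)
print("explained variance fractions:", np.round(explained, 3))
```

The scores can be used as regressors in place of the original descriptors, which is the idea behind principal component regression.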

b. Principal component regression (PCR)
When principal components are employed as the independent variables in a linear regression, the method is termed principal component regression.
PCR uses the scores from the PCA decomposition as regressors in the QSAR model to generate a multiple-term linear equation.

c. Partial least square analysis (PLS)
- An iterative regression procedure that produces its solution by a linear transformation of a large number of original descriptors into a small number of new orthogonal terms called latent variables.
- PLS is able to analyze complex SAR data in a more realistic way.
- It is able to interpret the influence of molecular structure on biological activity.
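
The iterative extraction of latent variables can be sketched with a NIPALS-style PLS1 routine for a single activity column. This is an illustrative implementation, not the one used by any particular QSAR package; the function name `pls1` and the data are hypothetical, and the activities are synthetic.

```python
import numpy as np

def pls1(X, y, n_components):
    """PLS regression for one response; returns coefficients and intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr                 # direction of maximum covariance with y
        w = w / np.linalg.norm(w)
        t = Xr @ w                    # latent variable (score vector)
        tt = t @ t
        p = Xr.T @ t / tt             # X loading
        c = yr @ t / tt               # regression of y on the latent variable
        Xr = Xr - np.outer(t, p)      # deflate X so the next score is orthogonal
        yr = yr - c * t
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    beta = W @ np.linalg.solve(P.T @ W, q)   # coefficients in terms of original descriptors
    return beta, y_mean - x_mean @ beta

# Made-up descriptors and synthetic activities from assumed coefficients.
X = np.array([[1, 0, 2], [0, 1, 1], [2, 1, 0], [1, 3, 1], [3, 2, 2], [2, 0, 3]], dtype=float)
true_beta = np.array([1.0, -2.0, 0.5])
y = X @ true_beta + 3.0

# With as many latent variables as independent descriptors, PLS reproduces the least-squares fit.
beta, intercept = pls1(X, y, n_components=3)
print(np.round(beta, 3), round(float(intercept), 3))
```

In 3D QSAR the descriptor table has far more columns than compounds, so only a few latent variables are kept, typically chosen by cross-validation.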

d. Genetic function approximation (GFA)
- Serves as an alternative to standard regression analysis for building QSAR equations.
- Employs natural principles of the evolution of species, which lead to improvements by recombination.
- Suitable for obtaining QSAR equations when dealing with a larger number of independent variables.
- Results in multiple models, generated from initial models using genetic algorithms.

e. Genetic partial least squares (G/PLS)
It is a valuable analytical tool that has evolved by combining the best features of GFA and PLS.

Pattern Recognition
The method is based on the principle of analogy.
It is used to detect distance or closeness within a large amount of multivariate data.

a. Cluster analysis
- Statistical pattern recognition method used to investigate the relationship between observations associated with several properties and to partition the data set into categories consisting of similar elements.
- Allows inactive compounds to be considered in the analysis.
- Can be used to study a large set of substituents to identify subsets that share similar physical properties.

b. Artificial neural networks (ANNs)
The technique is modeled on the real neurons present in an animal brain.
ANNs are parallel computational systems consisting of groups of highly interconnected processing elements called neurons, which are arranged in a series of layers:
a) First layer: input layer
b) Subsequent layer(s): hidden layer
c) Last layer: output layer

Contd.
Each layer does its own computations and passes the results on to the next one.
The weighted inputs are summed and supplied to the hidden layers, where a non-linear transfer function does the required processing.
The results of the transfer function are communicated to the neurons in the output layer, where they are interpreted and finally presented to the user.
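
The flow through the layers described above can be sketched as a single forward pass. This is an illustration only: the network is untrained, the weights are arbitrary numbers, the sigmoid is one possible choice of non-linear transfer function, and the function names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    """Non-linear transfer function applied in the hidden layer."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Input layer -> hidden layer -> output layer for one compound."""
    hidden = sigmoid(x @ w_hidden + b_hidden)  # weighted inputs are summed, then transformed
    return float(hidden @ w_out + b_out)       # output neuron combines the hidden results

# Three input descriptors, two hidden neurons, one output (predicted activity).
x = np.array([0.2, 1.5, -0.7])                 # made-up descriptor values
w_hidden = np.array([[0.4, -0.6],
                     [0.1,  0.8],
                     [-0.5, 0.3]])             # arbitrary, untrained weights
b_hidden = np.array([0.0, 0.1])
w_out = np.array([1.2, -0.7])
b_out = 0.5

print(forward(x, w_hidden, b_hidden, w_out, b_out))
```

Training, which is not shown here, consists of adjusting the weights until the outputs match the measured activities of the training set.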

c. k-nearest neighbor (kNN)
- One of the simplest machine learning algorithms.
- Most commonly used for classifying a new pattern (e.g. a molecule).
- The technique is based on a simple distance learning approach, in which an unknown/new molecule is classified according to the majority of its k nearest neighbors in the training set.
- Nearness is determined by a Euclidean distance metric (e.g. a similarity measure computed using the structural descriptors of the molecules).
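
A minimal kNN classifier following the description above: Euclidean distance in descriptor space, then a majority vote among the k nearest training compounds. The function name `knn_classify`, the two-descriptor training set and the class labels are made up for illustration.

```python
from collections import Counter

import numpy as np

def knn_classify(x_new, X_train, labels, k=3):
    """Assign the majority class among the k nearest training compounds."""
    dist = np.linalg.norm(X_train - x_new, axis=1)  # Euclidean distance to every training compound
    nearest = np.argsort(dist)[:k]                  # indices of the k closest compounds
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical training set described by two descriptors.
X_train = np.array([
    [1.0, 1.1], [0.8, 1.3], [1.2, 0.9],   # active compounds
    [5.0, 5.2], [4.8, 5.1], [5.3, 4.9],   # inactive compounds
])
labels = ["active", "active", "active", "inactive", "inactive", "inactive"]

prediction = knn_classify(np.array([1.2, 0.9]), X_train, labels, k=3)
print(prediction)
```

Descriptors on very different scales should be standardized first, otherwise the descriptor with the largest numeric range dominates the distance.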

Importance of statistical parameters
The equations established in QSAR studies are linear regression equations.
A number of equations may be established for one problem under study.
Statistics helps in selecting the one suitable best-fit equation among them.

Contd.
This may be done by checking the standard deviation/variance and other related parameters for the data set used in the QSAR study.
The correlation coefficient computed for the data set under study also helps in selecting the appropriate QSAR equation.
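
The parameters mentioned above can be computed as in the following sketch, which returns the squared correlation coefficient r², the standard deviation s of the residuals and the F value. The function name `regression_stats`, the observed activities and the calculated activities are hypothetical.

```python
import numpy as np

def regression_stats(y, y_calc, n_terms):
    """r^2, residual standard deviation s and F value for an equation with n_terms descriptors."""
    n = len(y)
    ss_res = float(((y - y_calc) ** 2).sum())     # unexplained variation
    ss_tot = float(((y - y.mean()) ** 2).sum())   # total variation
    dof = n - n_terms - 1                         # residual degrees of freedom
    r2 = 1.0 - ss_res / ss_tot
    s = np.sqrt(ss_res / dof)
    f = (r2 / n_terms) / ((1.0 - r2) / dof)
    return r2, s, f

# Made-up observed activities and values calculated by a one-descriptor equation.
y_obs = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_calc = np.array([1.1, 1.9, 3.2, 3.8, 5.0])

r2, s, f = regression_stats(y_obs, y_calc, n_terms=1)
print(f"r2 = {r2:.3f}, s = {s:.3f}, F = {f:.1f}")
```

When several candidate equations are compared, a higher r² and F together with a lower s favor an equation, provided the number of terms stays small relative to the number of compounds.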
