Quantitative Structure Activity Relationship

On the use of ¹H and ¹³C NMR spectra as QSAR descriptors

E.L. Willighagen, R. Wehrens, and L.M.C. Buydens
Institute for Molecules and Materials (IMM), Radboud University Nijmegen
[email protected]

Introduction

Quantitative Structure Activity Relationship (QSAR) models correlate molecular structures with biological and chemical activities. Spectra have been suggested as descriptors of molecular structures, but the performance of spectra-based QSAR models has not been thoroughly tested. This poster presents QSAR models based on ¹H and ¹³C NMR spectra and compares them with models built from theoretical molecular descriptors.

Data Sets

Three data sets are discussed:

name    # compounds    activity            ref.
WS      431            water solubility    [1]
BP      277            boiling point       [2]
LogP    154            LogP                [3]

For each data set, ¹H and ¹³C NMR spectra are simulated using ACD/Labs NMR Predictor. Theoretical molecular descriptors are calculated with Dragon, and a subset is randomly chosen. All three descriptor sets contain 220 variables.

Methods

Partial Least Squares (PLS) was used to build the regression models. Leave-one-out cross validation (LOO-CV) was used to select the number of latent variables (LVs). For each model type, five randomly chosen independent test sets were used.

The vertical dotted line indicates the selected number of latent variables. Whiskers indicate ±1 standard deviation in the cross validation error:

[Figure: cross validation error against number of latent variables. Panels: LogP HNMR, LogP CNMR, LogP Dragon.]

Results

The internal performance statistics R² and Q² for the three data sets show that ¹H NMR does not yield acceptable models. While ¹³C NMR models are acceptable, Dragon descriptors are clearly better:

[Figure: R² and Q² for H-NMR, C-NMR, and Dragon descriptors. Panels: WS, BP, LogP.]

This is confirmed by the root mean square errors for the LOO-CV (RMSECV) and for the predictions of the test set (RMSEP). The horizontal line indicates the error of a y_pred = ȳ model; RMSE values should be well below this limit:

[Figure: RMSECV and RMSEP for H-NMR, C-NMR, and Dragon descriptors. Panels: WS, BP, LogP.]

LogP Prediction

Visual inspection of the y_predicted vs. y_measured plots confirms that Dragon-based models give more accurate predictions than ¹³C NMR-based models for both the training set (black) and the independent test set (red):

[Figure: predicted y against measured y, training set and test set. Panels: LogP HNMR, LogP CNMR, LogP Dragon.]

The regression vector of the PLS models for ¹H NMR shows much less structure than that for ¹³C NMR. In blue are the ±1 standard deviations of the five models:

[Figure: PLS regression vectors. Panels: LogP HNMR and LogP CNMR (x-axis: chemical shift, ppm), LogP Dragon (x-axis: sorted Dragon descriptors).]

Conclusions

• ¹H NMR spectra do not yield good PLS regression models.
• ¹³C NMR spectra yield acceptable PLS regression models, but are inferior to models based on theoretical molecular descriptors.

References

[1] A. Yan and J. Gasteiger. Prediction of Aqueous Solubility of Organic Compounds based on a 3D Structure Representation. J. Chem. Inf. Comput. Sci., 43:429-434, 2003.
[2] E.S. Goll and P.C. Jurs. Prediction of the Normal Boiling Points of Organic Compounds from Molecular Structures with a Computational Neural Network Model. J. Chem. Inf. Comput. Sci., 39:974-983, 1999.
[3] L.K. Schnackenberg and R.D. Beger. Whole-Molecule Calculation of Log P Based on Molar Volume, Hydrogen Bonds, and Simulated ¹³C NMR Spectra. J. Chem. Inf. Model., 45:360-365, 2005.
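
The modelling procedure named in the poster's Methods and Results (PLS regression, LOO-CV to choose the number of latent variables, then RMSECV, RMSEP, and the mean-only reference error) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the data are synthetic random numbers rather than NMR spectra or Dragon descriptors, the PLS variant (single-response NIPALS), the 45/15 split, the candidate range of 1 to 8 latent variables, and all function names are assumptions made for this example.

```python
import numpy as np


def pls1_fit(X, y, n_lv):
    """Fit a single-response PLS model with NIPALS (illustrative helper)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean            # mean-centre both blocks
    W, P, q = [], [], []
    for _ in range(n_lv):
        w = Xk.T @ yk                          # weights: covariance with y
        norm = np.linalg.norm(w)
        if norm < 1e-12:                       # nothing left to explain
            break
        w = w / norm
        t = Xk @ w                             # scores
        tt = t @ t
        p = Xk.T @ t / tt                      # X loadings
        c = (yk @ t) / tt                      # y loading
        Xk = Xk - np.outer(t, p)               # deflate X
        yk = yk - c * t                        # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    if not W:
        return x_mean, y_mean, np.zeros(X.shape[1])
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)     # regression vector
    return x_mean, y_mean, coef


def pls1_predict(model, X):
    x_mean, y_mean, coef = model
    return (X - x_mean) @ coef + y_mean


def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def loo_rmsecv(X, y, n_lv):
    """Leave-one-out cross validation error for a given number of LVs."""
    preds = np.empty(len(y))
    for i in range(len(y)):
        keep = np.arange(len(y)) != i          # drop compound i
        model = pls1_fit(X[keep], y[keep], n_lv)
        preds[i] = pls1_predict(model, X[i:i + 1])[0]
    return rmse(y, preds)


# Synthetic stand-in data (assumption): 60 "compounds", 20 descriptors.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 20))
y = X[:, :3] @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=60)

# One random training/test split; the poster repeats this five times.
idx = rng.permutation(60)
train, test = idx[:45], idx[45:]

# Pick the number of latent variables with the lowest RMSECV.
candidates = range(1, 9)
cv_errors = [loo_rmsecv(X[train], y[train], k) for k in candidates]
best_lv = candidates[int(np.argmin(cv_errors))]
rmsecv = min(cv_errors)

# Test-set error and the mean-only reference (the "horizontal line").
model = pls1_fit(X[train], y[train], best_lv)
rmsep = rmse(y[test], pls1_predict(model, X[test]))
baseline = rmse(y[test], np.full(len(test), y[train].mean()))

print(f"LVs={best_lv} RMSECV={rmsecv:.3f} RMSEP={rmsep:.3f} baseline={baseline:.3f}")
```

Picking the minimum RMSECV is only one possible selection rule; the poster does not state which rule was used, and a more conservative choice (for example, the smallest model within one standard deviation of the minimum) would also fit the whisker plots it describes.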

shared by dennison on Feb 01
This infographic represents Quantitative Structure Activity Relationship (QSAR) models.

Category

Science