How to build a better molecule - NEQUIMED/IQSC/USP


Transcript of How to build a better molecule - NEQUIMED/IQSC/USP

Page 1: How to build a better molecule - NEQUIMED/IQSC/USP

3/29/17


How to build a better molecule

Imagine:

1. There seems to be a relationship between the number of carbon atoms in a molecule and its melting point.
2. A simple mathematical model could be: M = kN + c, where k = slope and c = intercept.
3. What values of k and c give the smallest error between the calculated melting points and the real ones?
4. Calculate them by LINEAR REGRESSION.
5. The model can now be applied to predict the melting point of a new molecule.
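The five steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `fit_line` is hypothetical, the data points are invented (they are not measured melting points), and k and c come from the standard closed-form least-squares formulas.

```python
def fit_line(n_values, m_values):
    """Ordinary least-squares fit of M = k*N + c; returns (k, c)."""
    count = len(n_values)
    mean_n = sum(n_values) / count
    mean_m = sum(m_values) / count
    # Slope k: covariance of N and M divided by the variance of N.
    s_nm = sum((n - mean_n) * (m - mean_m) for n, m in zip(n_values, m_values))
    s_nn = sum((n - mean_n) ** 2 for n in n_values)
    k = s_nm / s_nn
    # Intercept c: the fitted line passes through the point of means.
    c = mean_m - k * mean_n
    return k, c


# Hypothetical data, invented for illustration (NOT measured melting points).
carbons = [1, 2, 3, 4, 5]
melting = [-170.0, -150.0, -135.0, -110.0, -95.0]

k, c = fit_line(carbons, melting)

# Step 5: apply the fitted model to a molecule that was not in the data.
predicted = k * 6 + c
print(f"k = {k:.2f}, c = {c:.2f}, predicted M for N = 6: {predicted:.2f}")
```

The values of k and c returned here are the ones that minimise the sum of squared differences between calculated and given melting points, which is the sense of "smallest error" assumed in this sketch.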

Machine learning

[Plot: melting point (M) against number of carbon atoms (N); labels k and c]

Process:
1. DESIGN a model
2. TRAIN the model
3. TEST the model
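The DESIGN / TRAIN / TEST process can be illustrated with a held-out test set. This is a minimal sketch, assuming a straight-line model, an arbitrary "every fourth molecule" split, and root-mean-square error as the performance measure; the helper names and all data values are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Least-squares slope k and intercept c for y = k*x + c."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    k = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    return k, mean_y - k * mean_x


def rmse(points, k, c):
    """Root-mean-square error of the model on a list of (N, M) points."""
    return math.sqrt(sum((k * n + c - m) ** 2 for n, m in points) / len(points))


# Hypothetical (N, M) pairs, invented for illustration only.
data = [(1, -168.0), (2, -152.0), (3, -133.0), (4, -112.0),
        (5, -96.0), (6, -74.0), (7, -58.0), (8, -37.0)]

# 1. DESIGN: the model form M = k*N + c is fixed in fit_line.
# Hold out every fourth molecule; the model never sees these while training.
test_set = data[3::4]
train_set = [p for i, p in enumerate(data) if i % 4 != 3]

# 2. TRAIN: fit k and c on the training set only.
k, c = fit_line([n for n, _ in train_set], [m for _, m in train_set])

# 3. TEST: compare the error on seen and unseen molecules.
train_error = rmse(train_set, k, c)
test_error = rmse(test_set, k, c)
print(f"k = {k:.2f}, c = {c:.2f}")
print(f"training RMSE = {train_error:.2f}, test RMSE = {test_error:.2f}")
```

A test error much larger than the training error would suggest the model does not generalise beyond the molecules it was trained on.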

Page 2: How to build a better molecule - NEQUIMED/IQSC/USP


Similarly:

Biological activity = f(properties of the molecule)

Biological activity = k1*descriptor1 + k2*descriptor2 + k3*descriptor3 + … + const.

Quantitative Structure-Activity Relationships

Descriptors: various properties of the molecule:
• Lipophilicity
• Number of H-bond donors
• Molecular weight
• Electronic parameters of substituents
• Etc.

1. Construct a QSAR model using chosen descriptors.
2. Optimise the parameters (k1, k2, etc.) by MLR (or similar) on the training set.
3. Check performance using a separate test set.
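The three QSAR steps above can be sketched with a small multiple linear regression (MLR) written in plain Python. Everything specific here is an assumption made for illustration: the descriptor values are invented, the "activities" are generated from made-up coefficients (`TRUE_COEFFS`) rather than measured, and the helper names are hypothetical. The fit solves the least-squares normal equations by Gaussian elimination.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for j in range(col, n + 1):
                m[r][j] -= factor * m[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_mlr(rows, ys):
    """Least-squares fit; returns [k1, ..., kp, const]."""
    x = [list(r) + [1.0] for r in rows]  # extra column of ones for the constant
    p = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(x, ys)) for i in range(p)]
    return solve(xtx, xty)


def predict(coeffs, row):
    """Activity = k1*d1 + k2*d2 + ... + const."""
    return sum(k * d for k, d in zip(coeffs, row)) + coeffs[-1]


# Made-up coefficients used only to generate synthetic activities.
TRUE_COEFFS = [0.5, -0.3, 0.01, 2.0]

# 1. Chosen descriptors (hypothetical values): logP, H-bond donors, mol. weight.
train_x = [(1.2, 1, 180.0), (2.5, 0, 250.0), (0.8, 3, 210.0),
           (3.1, 2, 320.0), (1.9, 1, 150.0), (2.2, 4, 400.0)]
test_x = [(1.5, 2, 275.0), (2.8, 0, 190.0)]
train_y = [predict(TRUE_COEFFS, row) for row in train_x]
test_y = [predict(TRUE_COEFFS, row) for row in test_x]

# 2. Optimise k1, k2, k3 and the constant on the training set.
coeffs = fit_mlr(train_x, train_y)

# 3. Check performance on the separate test set.
errors = [abs(predict(coeffs, row) - y) for row, y in zip(test_x, test_y)]
print("fitted coefficients:", [round(k, 4) for k in coeffs])
print("largest test-set error:", max(errors))
```

Because the synthetic data are noise-free, the fit recovers the generating coefficients; with real measurements the test-set errors would not vanish.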

QSAR: More on descriptors

0-D descriptors:
o Molecular weight
o LogP

1-D descriptors:
o Fingerprints

2-D descriptors:
o Graph-related (number of rings, number of rotatable bonds)

3-D descriptors:
o Size
o Shape
o Surface area

Substituent descriptors:
o Electronegativity of substituent X
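As an illustration of the simplest kind of descriptor, molecular weight can be computed from the molecular formula alone, with no information about connectivity or shape. This is a minimal sketch: the function name `molecular_weight` is hypothetical, only four elements are included in the mass table, and formulas with brackets or charges are not handled.

```python
import re

# Standard atomic masses (g/mol) for a few common elements.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}


def molecular_weight(formula):
    """Sum atomic masses for a simple formula such as 'C2H6O'."""
    total = 0.0
    # Each match is an element symbol followed by an optional count.
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_MASS[symbol] * (int(count) if count else 1)
    return total


ethanol_mw = molecular_weight("C2H6O")
print(f"Molecular weight of C2H6O: {ethanol_mw:.3f} g/mol")
```

Higher-dimensional descriptors (ring counts, surface area and so on) need the molecular graph or a 3-D structure and are normally computed with a cheminformatics toolkit rather than by hand.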

Page 3: How to build a better molecule - NEQUIMED/IQSC/USP


Free-Wilson Analysis

Free and Wilson: substituent effects on the activity of a class of molecules
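One common way to set up such an analysis is an additive model: each substituent at each position contributes a fixed amount to the activity, and the contributions are fitted by least squares using 0/1 indicator variables. The sketch below assumes that additive form with an unsubstituted reference compound as the baseline; the substituents, activity values and helper names are all invented for illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for j in range(col, n + 1):
                m[r][j] -= factor * m[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def least_squares(rows, ys):
    """Least-squares coefficients from the normal equations."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    return solve(xtx, xty)


# Indicator columns, relative to the reference compound (R1 = H, R2 = H).
COLUMNS = ["R1=Cl", "R1=CH3", "R2=OH"]


def indicators(r1, r2):
    """1.0 where the compound carries the substituent, else 0.0."""
    present = {"R1=" + r1, "R2=" + r2}
    return [1.0 if col in present else 0.0 for col in COLUMNS]


# Hypothetical compounds (R1, R2, activity); the values are invented.
training = [("H", "H", 5.0), ("Cl", "H", 5.8), ("CH3", "H", 5.3),
            ("H", "OH", 4.5), ("Cl", "OH", 5.3)]

rows = [indicators(r1, r2) + [1.0] for r1, r2, _ in training]
activities = [activity for _, _, activity in training]
coeffs = least_squares(rows, activities)

contributions = dict(zip(COLUMNS, coeffs))  # one value per substituent
baseline = coeffs[-1]                       # activity of the reference compound

# Predict a combination that was not in the training set: R1 = CH3, R2 = OH.
predicted = baseline + sum(k * x for k, x in zip(coeffs, indicators("CH3", "OH")))
print("contributions:", {name: round(v, 3) for name, v in contributions.items()})
print(f"baseline = {baseline:.3f}, predicted (CH3, OH) = {predicted:.3f}")
```

The model can only predict new combinations of substituents that already appear in the training set; it says nothing about a substituent it has never seen.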

Page 4: How to build a better molecule - NEQUIMED/IQSC/USP


3D-QSAR (CoMFA, CoMSIA)

At every grid point, calculate the steric and electrostatic energy between the molecule and a probe atom (e.g. a CH3 carbocation).

(a bit like Free-Wilson analysis)
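The grid calculation can be sketched as follows. This is not the CoMFA implementation, only an illustration under stated assumptions: steric energy is modelled with a 12-6 Lennard-Jones term and electrostatic energy with Coulomb's law, and the atom coordinates, partial charges, `epsilon`, `r_min`, grid spacing and energy cap are all hypothetical values chosen for the example.

```python
import itertools
import math

# Converts q1*q2/r (elementary charges, Angstrom) to kcal/mol.
COULOMB_CONSTANT = 332.06


def probe_energies(atoms, point, probe_charge=1.0,
                   epsilon=0.1, r_min=3.4, cap=30.0):
    """Steric and electrostatic energy of a probe atom placed at `point`.

    atoms: list of (x, y, z, partial_charge). Parameters are illustrative.
    """
    steric = 0.0
    electrostatic = 0.0
    for x, y, z, charge in atoms:
        r = max(math.dist(point, (x, y, z)), 1e-6)  # avoid division by zero
        ratio6 = (r_min / r) ** 6
        # 12-6 Lennard-Jones term with its minimum (-epsilon) at r = r_min.
        steric += epsilon * (ratio6 * ratio6 - 2.0 * ratio6)
        electrostatic += COULOMB_CONSTANT * probe_charge * charge / r
    # Cap the energies so grid points inside the molecule do not dominate.
    return min(steric, cap), max(-cap, min(electrostatic, cap))


# Hypothetical three-atom "molecule": (x, y, z, partial charge).
molecule = [(0.0, 0.0, 0.0, -0.4), (1.2, 0.0, 0.0, 0.3), (-0.6, 1.0, 0.0, 0.1)]

# Regular grid around the molecule, 2 Angstrom spacing (an assumption).
axis = [-4.0, -2.0, 0.0, 2.0, 4.0]
grid = list(itertools.product(axis, repeat=3))

# One long descriptor vector per molecule: two field values per grid point.
field = []
for point in grid:
    steric, electrostatic = probe_energies(molecule, point)
    field.extend([steric, electrostatic])

print(f"{len(grid)} grid points -> {len(field)} field descriptors")
```

Each molecule in an aligned series would yield a vector like `field`, and those vectors then play the role of the descriptors in the regression, which is where the resemblance to Free-Wilson analysis comes in.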

Page 5: How to build a better molecule - NEQUIMED/IQSC/USP


3D-QSAR (CoMFA, CoMSIA)

Choi et al., Bioorganic & Medicinal Chemistry Letters (2013), 23, 4540-4546

QSAR: Summary

1. QSAR and related methods aim to extract the maximum amount of information from the data available.
   • The more data you have, the more likely you will get useful information.
2. QSAR methods rely on a model: that there is some consistent relationship between the measured/predicted characteristics of the molecules and their activity that can be expressed mathematically.
   • Do you know all the molecules have the same mechanism of action?
3. Model validation depends on a sound statistical analysis of performance, and well-balanced training and test datasets.
   • Has the model been tested against a truly independent dataset?