
RESEARCH Open Access

A theoretical investigation of DNA dynamics and desolvation kinetics for zinc finger protein Zif268

Shayoni Dutta, Yoshita Agrawal, Aditi Mishra, Jaspreet Kaur Dhanjal, Durai Sundar*†

From Joint 26th Genome Informatics Workshop and Asia Pacific Bioinformatics Network (APBioNet) 14th International Conference on Bioinformatics (GIW/InCoB2015)
Tokyo, Japan. 9-11 September 2015

Abstract

Background: Transcription factors, which regulate the expression inventory of a cell, interact with their respective DNA through specific recognition patterns which, if well exploited, may enable targeted genome engineering. The most widely studied transcription factors are zinc finger proteins (ZFPs), which bind their target DNA via direct and indirect recognition at the interaction interface. Exploiting the binding specificity and affinity of the interaction between zinc fingers and their respective DNA can help in generating engineered zinc fingers for therapeutic applications. Experimental evidence substantiates the effect of indirect interactions, such as DNA deformation and desolvation kinetics, in enabling ZFPs to achieve partial sequence specificity based on structural properties of DNA. Exploring the structure-function relationships of existing zinc finger-DNA complexes at the indirect recognition level can aid in predicting the probable zinc fingers that could bind to any target DNA. Deformation energy, the energy required to bend DNA from its native shape to its shape when bound to the ZFP, is an effect of the indirect recognition mechanism. Water is treated as a co-reactant in affinity studies of ZFP-DNA binding equilibria, which accounts for the unavoidable change in hydration that occurs when these two solvated surfaces come into contact.

Results: Desolvation and DNA deformation have been theoretically investigated using simulations and free energy perturbation data, revealing a consensus that correlates affinity, specificity and stability for ZFP-DNA interactions. Greater loss of water at the interaction interface of the DNA accompanies binding with higher affinity and eventually distorts the DNA to a greater extent, as shown by the changes in major groove width and in DNA tilt, stretch and rise.

Conclusion: Most prediction algorithms for ZFPs do not account for water loss at the interface. The above findings may significantly affect these algorithms. Further, the sequence-dependent deformation of the DNA upon complexation with our prototype, as well as the preference of bases at the 2nd and 3rd positions of the repeating triplet, provides new insight into changes in indirect interactions that have not been probed yet.

* Correspondence: [email protected]
† Contributed equally
Department of Biochemical Engineering and Biotechnology, DBT-AIST International Laboratory for Advanced Biomedicine (DAILAB), Indian Institute of Technology (IIT) Delhi, New Delhi 110016, India

Dutta et al. BMC Genomics 2015, 16(Suppl 12):S5
http://www.biomedcentral.com/1471-2164/16/S12/S5

© 2015 Dutta et al. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Background

Genome engineering is at its inception, and genome editing tools need to support the design of DNA templates of choice, the construction of designer proteins to manipulate DNA, and implementation, testing and debugging. The current pace of development reveals promising applications for genome targeting tools if large-scale reengineering of genomes is carried out [1]. The literature strengthens the case for exploiting the protein fold in ZFPs, whose DNA binding affinity rests on novel recognition principles and holds the key to engineering novel zinc fingers for targeted genome therapy. Fingers with different triplet specificity can be engineered by mutating the key amino acid residues, which enables specificity in DNA recognition across a large number of combinatorial possibilities. Further, because these modules or fingers function independently, linking them allows recognition of longer DNA stretches [2].

Understanding how DNA molecules interact with ZFPs depends critically on their structure-function relationships. These relationships deal with conformational changes in DNA and dewetting at the ZFP-DNA interaction interface, and help to address poorly characterized aspects of affinity and specificity, respectively. Characterization of binding sites is best inferred from recognition of sequence-specific contacts, usually called direct recognition or direct readout. This mechanism highlights the “recognition code” between the key amino acid residues on the alpha-helix of the ZFP and the nucleotide bases of the target DNA. Sequence dependence alone does not completely explain specificity in protein-DNA binding. Binding affinity is affected even by mutating bases that are not in direct contact with the protein residues [3,4], implying that proteins employ modes other than direct recognition. DNA structural changes significantly affect its interactions with proteins [5]. Recognition of DNA structural properties is referred to as indirect recognition or indirect readout [6]. Governed by the binding free energy of a protein-DNA interaction, some proteins bind more strongly to certain regions of the DNA than to others [7]. Structural properties of DNA that affect indirect readout by proteins include flexibility, elasticity, bending and kinking, major and minor groove widths, and hydration [8-10]. The energy expended to deform DNA from its native conformation to its conformation in a protein-bound complex, the DNA deformation energy, is a potential recognition mechanism [11,12]. We have run 180 ns molecular dynamics simulations and examined the DNA contribution at the interaction interface based on RMSD and the stability of the trajectory.
This analysis was further fortified by evaluating structural properties of DNA such as flexibility, bending and major groove width changes across the simulation, to optimize our study of DNA bending upon binding to the ZFP.

Thermodynamic analyses of protein-DNA binding reveal that water released from protein-DNA interfaces favors binding [13]. Structural analyses of the water remaining at the protein-DNA complex interface illustrate that the bulk of these water molecules promote binding by screening electrostatic repulsions between electronegative atoms or like charges on the protein and the DNA. A minor fraction of the observed interfacial waters form extended hydrogen bonds between the protein and the DNA, acting as linkers that compensate for the lack of a direct hydrogen bond (Figure 1) [14]. The solvent molecules equilibrate more easily and more often around DNA than around the binding cavity of the protein. The degrees of freedom around the DNA are comparatively fewer than those of a protein-DNA complex. Hence the calculation of absolute solvation free energies is a more tractable problem than predicting binding free energies. We have therefore inferred from free energy perturbation data that, with increasing binding affinity, the desolvation energy is indicative of a stable system tending toward its energy minimum. This investigation covers DNA-ZFP affinity and binding as affected by desolvation and change in DNA conformation, without compromising the stability of the respective complex, and their strong correlation with the type of template DNA.

Computational methods

Starting structures, models and docking studies

A sample set of eight 9 bp target DNA templates of the type 5’ GNN-GNN-GNN 3’ was used, flanked by TAT-GTT-TAT, a negative control in binding studies with Zif268 [15]. These targets were docked to the wild-type sequence of Zif268 (PDB ID: 1AAY) using HADDOCK. Ideally, all 16 GNN-GNN-GNN targets would be analyzed. Since the literature reveals higher affinity for GC-rich sequences than for AT-rich ones, we chose only a sample set of 8 target DNAs. The repeating triplet in these targets is 5’ GNN 3’, and the preference for each of the 4 bases at its 2nd (N) and 3rd (N) positions has been determined. This study enables an understanding of the preference for G and/or C, or for A and/or T, at the 2nd and 3rd positions of the repeating triplet. The sample set is given in Table 1.

The 3DNA standalone software was used to generate the 6 GNN-GNN-GNN DNA templates. It has a directory containing repeating units for each of the 55 types of fiber DNA and RNA structures [16,17]. This feature allows the user to model a DNA structure from just its nucleotide sequence. On choosing this mode, the user is asked to input the base sequence in the form of a data file (complete sequence) or from the keyboard (only the repeating sequence). The options -a|-b|-c|-d|-z can be used for A-DNA, B-DNA, C-DNA, D-DNA and Z-DNA models, respectively [18].
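The construction of the target set can be illustrated with a short Python sketch. It enumerates the 16 possible GNN triplets and builds the 9 bp targets by repeating a triplet three times. The function names and the `SAMPLE_TRIPLETS` list are illustrative assumptions (the triplets are taken from Table 1); this is not the authors' script and it does not call 3DNA.

```python
from itertools import product

BASES = "ACGT"

def gnn_triplets():
    """Return all 16 triplets of the form G-N-N, where N is any base."""
    return ["G" + second + third for second, third in product(BASES, repeat=2)]

def repeat_target(triplet, copies=3):
    """Build a target such as GGGGGGGGG by repeating one triplet."""
    return triplet * copies

# Triplets of the eight sample targets listed in Table 1 (hypothetical variable name).
SAMPLE_TRIPLETS = ["GGG", "GGC", "GCC", "GCG", "GAA", "GAT", "GTT", "GTA"]
SAMPLE_TARGETS = [repeat_target(t) for t in SAMPLE_TRIPLETS]

if __name__ == "__main__":
    print(len(gnn_triplets()), "possible GNN triplets")
    for target in SAMPLE_TARGETS:
        print(target)
```

Each generated string is the base sequence that would be passed to a model-building tool; how it is supplied to 3DNA is left out here on purpose.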

Figure 1 A schematic representation of DNA-ZFP complexation and release of interfacial solvent molecules aiding strength of binding.

Table 1. Sample set of eight 9 bp DNA targets

Sample target DNA sequences: 5’ GNN GNN GNN 3’

GGG GGG GGG    GAA GAA GAA
GGC GGC GGC    GAT GAT GAT
GCC GCC GCC    GTT GTT GTT
GCG GCG GCG    GTA GTA GTA


The HADDOCK software, a data-driven approach to docking, uses distance constraints extracted from experimental data (gathered from various possible sources, such as NMR, conservation data, etc.) to reconstruct and refine the protein-DNA complex [19]. The DNA PDB files generated by 3DNA had to be converted to a HADDOCK-compatible format by removing the chain IDs and SegIDs. Failing to do so led the software to misinterpret the PDB files, with arbitrary loss of secondary structure. Restraint files were generated based on the interaction interface. Active residues, those involved in direct readout, and passive residues, those at the neighboring off-target sites, were defined. The number of structures for rigid body docking (it0) was reduced from 1000 to 750, and for refinement (it1), the rate-determining step, from 200 to 100. This was justified because the structure of Zif268 was extracted from its already complexed state with its consensus DNA, and hence it was assumed to be close to the conformation it would attain when docked with the new DNA. Solvated rigid body docking was not used, as the effect of solvent was determined using free energy perturbation.
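The chain ID and SegID clean-up described above can be sketched in Python as follows. This is a minimal illustration that assumes standard fixed-column PDB records (chain ID in column 22, segment ID in columns 73-76); the function name is hypothetical and this is not the conversion script used in the study.

```python
def strip_chain_and_segid(pdb_text):
    """Blank the chain ID (column 22) and segment ID (columns 73-76)
    on ATOM/HETATM records; all other lines are returned unchanged."""
    cleaned = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            padded = line.ljust(80)          # make sure every column exists
            padded = padded[:21] + " " + padded[22:72] + "    " + padded[76:]
            line = padded.rstrip()           # drop the padding again
        cleaned.append(line)
    return "\n".join(cleaned)

if __name__ == "__main__":
    import sys
    # Usage (assumed): python strip_ids.py < input.pdb > output.pdb
    sys.stdout.write(strip_chain_and_segid(sys.stdin.read()) + "\n")
```

Working on fixed columns rather than splitting on whitespace keeps coordinates and the element symbol in place, which matters because PDB fields are position-dependent.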

Molecular Dynamics simulation procedure

The GPU-accelerated Amber molecular dynamics suite with the Amber FF03 force field was used to perform all-atom explicit molecular dynamics simulations (MD simulations) of the protein-DNA complexes obtained upon docking (http://ambermd.org/#Amber12) [20-22]. The FF03 force field, which includes the Barcelona modification (force field parmbsc0) for nucleotide sequences, mostly DNA [23], was used in combination with the Amber all-atom force field parameters for the CaDA approach with an explicit solvent model [24] to define parameters for the docked protein-DNA complexes generated using HADDOCK. Since the biggest success of the parmbsc0 force field is its ability to drive structures from incorrect to correct conformations, its integration with the FF03 force field ensures conformational transitions upon minimization to reach the final refined structure [25]. Further, the zinc finger protein-DNA complexes containing Zn atoms were minimized using the “cationic dummy atom approach (CaDA)”, which uses four identical cationic dummy atoms to mimic zinc’s 4s4p3 vacant orbitals that accommodate the lone-pair electrons of zinc coordinates, thereby simulating zinc’s propensity for four-ligand coordination. The advantage of the method lies in maintaining zinc’s four-ligand coordination in ZFPs in the absence of harmonic restraints that would rigidify the zinc-containing active sites.

Protein-DNA complex molecules were solvated with the TIP3P water model [26] in a cubic periodic boundary box to generate the required systems for MD simulations, and the systems were neutralized using an appropriate number of counter ions. The distance between the octahedron box wall and the protein complex was set to greater than 10 Å to avoid direct interaction with its own periodic image. Each neutralized system was then minimized, heated to 300 K and equilibrated until its pressure and energies had stabilized. Finally, the equilibrated systems were used to run 30 ns long MD simulations for each complex. During the MD simulations, RMSD and H-bond fluctuations of DNA with protein were calculated using VMD software [27]. All simulation studies were performed on an HP machine with an Intel Core 2 Duo CPU @ 3 GHz and 1 GB DDR RAM, and on a DELL T3600 workstation with 8 GB DDR RAM and an NVIDIA GeForce GTX TITAN 6 GB GDDR5 graphics card.
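For readers unfamiliar with the RMSD metric used to judge trajectory stability, the following Python sketch shows the basic calculation for two sets of atom coordinates. It is only an illustration of the formula: it assumes the two frames are already superimposed and contain the same atoms in the same order, and it is not the VMD routine used in the study.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized lists of
    (x, y, z) coordinates, in the same units as the input (e.g. Angstrom)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))

if __name__ == "__main__":
    reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    frame = [(x + 3.0, y + 4.0, z) for x, y, z in reference]
    print(rmsd(reference, frame))
```

Plotting this value for every frame against the reference structure gives an RMSD versus time curve of the kind shown in Figure 2.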

Procedure to evaluate DNA deformation upon complexation

To evaluate the DNA deformation upon binding to Zif268 for each DNA template, 3DNA software was used to compute the helical parameters of the DNA template after docking and after stabilization through MD simulation, so that the conformational change could be compared. The changes in major groove width and tilt before and after complexation were evaluated using Perl scripts.
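The before/after comparison can be sketched as below. The study used Perl scripts whose details are not given, so this Python version is only an assumed illustration: it takes two hypothetical lists of per-step values of one helical parameter (for example major groove width or tilt, as reported by 3DNA) and returns the per-step change and its mean absolute value.

```python
def parameter_change(before, after):
    """Per-step change (after minus before) of one helical parameter."""
    if len(before) != len(after) or not before:
        raise ValueError("inputs must be non-empty and of equal length")
    return [b2 - b1 for b1, b2 in zip(before, after)]

def mean_abs_change(before, after):
    """Mean absolute per-step change, a simple summary of deformation."""
    deltas = parameter_change(before, after)
    return sum(abs(d) for d in deltas) / len(deltas)

if __name__ == "__main__":
    # Hypothetical values for illustration only, not data from the paper.
    docked = [11.0, 12.0, 11.5]
    simulated = [12.5, 13.0, 11.0]
    print(parameter_change(docked, simulated))
    print(mean_abs_change(docked, simulated))
```

Parsing the 3DNA output files into such lists is deliberately left out, since the file layout depends on the 3DNA version and options used.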

Free Energy Perturbation method

For desolvation kinetics, the absolute (total) free energy calculation, which computes the absolute solvation free energy by annihilating a whole molecule, was performed using the OPLS_2005 all-atom force field with explicit solvent. Calculations were run with the default parameters in Maestro version 9.4, the interface to Desmond [28]. Using OPLS_2005-AA, the intermolecular interaction energy between molecules a and b is given by the sum of interactions between the sites on the two molecules, as represented by the following equation:

$$\Delta E_{ab} = \sum_{i}^{\text{on } a} \sum_{j}^{\text{on } b} \left( \frac{q_i q_j e^2}{r_{ij}} + \frac{A_{ij}}{r_{ij}^{12}} - \frac{C_{ij}}{r_{ij}^{6}} \right)$$

The non-bonded contribution to the intramolecular energy is also computed using the same expression for all pairs of sites separated by more than three bonds [29].

The docked complexes were solvated in an orthorhombic water box using a 10 Å buffer with no ions. All simulations were run with the TIP3P water model with default parameters, as implemented at our in-house Multisim facility. Since the complex contains our target DNA and the protein, and the protein is fixed, an absolute free energy calculation was performed. The protein in solvent and the protein in vacuum were kept constant, and the final energy of desolvation for the DNA was calculated. All the desolvation energies obtained for the sample targets are relative values, and this method has been optimized keeping time and computation constraints in mind.
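The site-site interaction energy expression given in this section (a Coulomb term plus a 12-6 Lennard-Jones term summed over all site pairs) can be illustrated with a small Python sketch. Several things here are assumptions for illustration: the `Site` class, the per-site `A` and `C` parameters, the geometric-mean combining rule for A_ij and C_ij, and the approximate conversion constant for e^2 in kcal·Å/mol. It is not the Desmond implementation.

```python
import math
from dataclasses import dataclass

# Approximate value of e^2 in kcal*Angstrom/mol (assumed constant).
COULOMB_CONSTANT = 332.06

@dataclass
class Site:
    x: float
    y: float
    z: float
    q: float   # partial charge in units of e
    A: float   # repulsive r^-12 coefficient (hypothetical per-site value)
    C: float   # attractive r^-6 coefficient (hypothetical per-site value)

def pair_energy(si, sj):
    """Coulomb plus 12-6 Lennard-Jones energy of one site pair."""
    r = math.dist((si.x, si.y, si.z), (sj.x, sj.y, sj.z))
    a_ij = math.sqrt(si.A * sj.A)   # assumed geometric combining rule
    c_ij = math.sqrt(si.C * sj.C)
    return COULOMB_CONSTANT * si.q * sj.q / r + a_ij / r**12 - c_ij / r**6

def interaction_energy(mol_a, mol_b):
    """Sum pair energies over every site i on a and every site j on b."""
    return sum(pair_energy(si, sj) for si in mol_a for sj in mol_b)

if __name__ == "__main__":
    a = [Site(0.0, 0.0, 0.0, 1.0, 0.0, 0.0)]
    b = [Site(1.0, 0.0, 0.0, -1.0, 0.0, 0.0)]
    print(interaction_energy(a, b))
```

The double loop mirrors the double sum over sites on molecule a and sites on molecule b; production codes add cutoffs and long-range corrections that are omitted here.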

Results and discussion

Though literature studies show high binding affinity of zinc finger proteins for GC-rich sequences, studies of the indirect interaction dynamics in this case, such as stability in terms of DNA deformation and desolvation energy, have not been reported so far. Our studies provide insights into these aspects.

Binding affinity determined by docking scores and respective KD values

A literature review of KD values shows that the prototype Zif268 has a KD value of 0.4 for the 5’ XXXGGGXXX 3’ target sequence, whereas it has a high KD value of 25 for the 5’ XXXGTAXXX 3’ target sequence. This implies that the target 5’ XXXGGGXXX 3’ binds to the same ZFP with higher affinity than the target 5’ XXXGTAXXX 3’ [30]. Docking experiments for the same targets were performed to check the reliability of the docking scores. The target 5’ GGGGGGGGG 3’ has a highly negative docking score of -150.34, revealing very high binding strength relative to 5’ GTAGTAGTA 3’, which has a less negative docking score of -125.75, revealing low binding affinity (Table 2). The negative binder 5’ TATGTTTAT 3’, which ideally does not bind to a ZFP, shows a docking score of -113.6, whereas the wild type 5’ GCGTGGCGC 3’ shows a docking score of -143. Thus the docking scores are consistent with the experimental KD data and can be considered reliable.

Direct correlation between binding affinity and stability of complex determined by RMSD plots

The trajectories for the 180 ns simulations were generated, together with plots of RMSD versus total simulation time (in nanoseconds). This was done to ascertain the correlation between highly stable complexes and higher binding energies, and vice versa. The stability of each sequence was analyzed from the RMSD plots: the 5’ GGGGGGGGG 3’ target stabilizes at 4 Å (more stable) over a 30 ns simulation trajectory, whereas the targets 5’ GATGATGAT 3’ and 5’ GTAGTAGTA 3’ stabilize at 7 Å (less stable) over the same timescale. The target 5’ GGGGGGGGG 3’ has the most negative docking score, with very high binding strength and affinity, which complements our simulation studies showing relatively high stability. Similarly, 5’ GATGATGAT 3’ and 5’ GTAGTAGTA 3’ have less negative docking scores, with low binding affinity and lower stability in the simulation studies. A direct correlation between affinity and stability in the RMSD graphs (Figure 2) was thus established for GC-rich sequences over AT-rich ones. The variation in hydrogen bonds for each target DNA complexed with Zif268, plotted over 30 ns, also substantiates this hypothesis. Hence, strong binders have more negative docking scores, higher stability (RMSD plots) and greater retention of hydrogen bonds after simulation. Similarly, weak binders have less negative docking scores, lower stability and lower retention of hydrogen bonds.

Indirect interactions of Zif268 with DNA targets of the type 5’ GNN-GNN-GNN 3’ demonstrating the varying binding strength

Target DNA sequence preference by Zif268 based on hydrogen bond retention

Since the first nucleotide of the repeating triplets in our target DNA is G, the analysis extends to the 2nd and 3rd nucleotides. It was observed that if the 2nd and 3rd positions of the repeating triplet in our target DNA are GC rich, the complex is more stable than an AT-rich one. An interesting observation was that when the 2nd and 3rd positions are dominated by G, the complex shows maximum stability, if not the highest affinity, followed by C, A and T. The maximum number of H-bonds is maintained throughout the simulation trajectory for target DNA sequences rich in G at the 2nd and 3rd base positions of the repeating triplet 5’ GN(2nd)N(3rd) 3’, followed by G at the 2nd and C at the 3rd position, then by C at the 2nd and 3rd positions. This again emphasizes the heightened stability of these complexes relative to target DNA sequences rich in A at the 2nd and 3rd positions of the repeating triplet 5’ GN(2nd)N(3rd) 3’, followed by A at the 2nd and T at the 3rd position, then by T at the 2nd and 3rd positions, which form fewer H-bonds (Figure 3). This provides new insight into the sequence preference of Zif268 at the 2nd and 3rd positions of the repeating triplet.

Table 2. Free energy perturbation and docking score data for our sample of 6 GNNGNNGNN target DNA bound to Zif268 protein sequence

Target DNA sequence 5’-3’    dG Solvation (kcal/mol)    Docking Score    Protein sequence
GTTGTTGTT                    -1742.44 ± 49.88           -117.87          RER RHR RER
GTAGTAGTA                    -1817.5 ± 48.61            -125.75          RER RHR RER
GATGATGAT                    -1905.5 ± 48.79            -124.05          RER RHR RER
GAAGAAGAA                    -1952.35 ± 340.75          -114.05          RER RHR RER
GCCGCCGCC                    -5150.62 ± 137.57          -129.22          RER RHR RER
GGCGGCGGC                    -5156.8 ± 137.39           -131.01          RER RHR RER
GGGGGGGGG                    -5411.88 ± 141.45          -150.34          RER RHR RER
GCGGCGGCG                    -5460.13 ± 143.252         -134.4006        RER RHR RER

The more negative the docking score, the stronger the binding. The desolvation energy further indicates that the greater the binding affinity, the greater the loss of water at the interface.
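The relationship noted under Table 2 between desolvation energy and docking score can be checked numerically. The sketch below copies the mean dG and docking score values from Table 2 and computes a Pearson correlation coefficient by hand. The variable and function names are illustrative, the uncertainties are ignored, and with only eight points the result should be read as a rough indication rather than a statistical test.

```python
import math

# (target, dG solvation in kcal/mol, docking score), copied from Table 2.
TABLE2 = [
    ("GTTGTTGTT", -1742.44, -117.87),
    ("GTAGTAGTA", -1817.5, -125.75),
    ("GATGATGAT", -1905.5, -124.05),
    ("GAAGAAGAA", -1952.35, -114.05),
    ("GCCGCCGCC", -5150.62, -129.22),
    ("GGCGGCGGC", -5156.8, -131.01),
    ("GGGGGGGGG", -5411.88, -150.34),
    ("GCGGCGGCG", -5460.13, -134.4006),
]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

if __name__ == "__main__":
    dg = [row[1] for row in TABLE2]
    score = [row[2] for row in TABLE2]
    print(f"Pearson r between dG and docking score: {pearson(dg, score):.2f}")
```

A positive coefficient here means that targets with more negative desolvation energies also tend to have more negative docking scores.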

Figure 2 RMSD vs. time (30 ns) plot for all our target DNA sequences complexed to Zif268.


Figure 3 H-bond variation over the simulation trajectory for all our target DNA sequences complexed to Zif268.

Establishment of sequence-dependent DNA deformability

The sequence-based DNA deformation at the interface upon binding with Zif268 has been analyzed based on the change in DNA major groove width and helical tilt around the interaction interface. When the 2nd and 3rd base positions of the repeating triplet 5’ GN(2nd)N(3rd) 3’ are dominated by G (the strongest binder, 5’ GGGGGGGGG 3’, based on negative docking scores, KD values and RMSD graphs), the change in major groove width is at its maximum, followed by C and A at the same base positions. If these base positions are dominated by T, the smallest changes in major groove width are seen (the weakest binder, 5’ GTTGTTGTT 3’) (Figure 4). Similarly, the distortion in helical tilt is greater for the GC-rich complexes than for the AT-rich ones (Additional File 1). This supports the view that the greater the deformation and conformational change in the DNA upon complexation, the stronger the binding.

Establishment of sequence-dependent DNA desolvation

The energy required to expel water from the DNA interface upon complexation also depends on the target DNA sequence. The FEP values for G-rich or GC-rich targets, which are the strongest binders, are more negative (-5411.88 ± 141.459), revealing greater solvent loss at the interface than for the AT-rich ones (-1742.44 ± 49.8897 for the weakest binder). The FEP data (Table 2) show that when the 2nd and 3rd base positions of the repeating triplet 5’ GN(2nd)N(3rd) 3’ are dominated by G, the DNA experiences the greatest solvent loss upon complexation, followed by C and A. If these base positions are dominated by T, the least solvent loss is seen at the interaction interface. Our desolvation kinetics data obtained from free energy perturbation also corroborate the theoretical assumption that the greater the loss of bulk solvent at the interaction interface upon ZFP-DNA complexation, the stronger the binding affinity and the stability of the complex.

Both the DNA deformation and the desolvation data affirm that more stable interactions show greater deformation of the DNA, accompanied by a more negative energy needed to expel water at these interfaces.
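The sequence dependence of the desolvation energy can be summarized by grouping the Table 2 targets according to the bases at the 2nd and 3rd positions of the repeating triplet. The Python sketch below is an assumed illustration of that grouping using the mean dG values copied from Table 2; the names are hypothetical and the reported uncertainties are not propagated.

```python
# Mean dG solvation (kcal/mol) per target, copied from Table 2.
DG_SOLVATION = {
    "GTTGTTGTT": -1742.44,
    "GTAGTAGTA": -1817.5,
    "GATGATGAT": -1905.5,
    "GAAGAAGAA": -1952.35,
    "GCCGCCGCC": -5150.62,
    "GGCGGCGGC": -5156.8,
    "GGGGGGGGG": -5411.88,
    "GCGGCGGCG": -5460.13,
}

def nn_class(target):
    """Classify a GNN-repeat target by its 2nd and 3rd triplet bases."""
    second, third = target[1], target[2]
    if second in "GC" and third in "GC":
        return "GC"
    if second in "AT" and third in "AT":
        return "AT"
    return "mixed"

def mean_dg_by_class(table):
    """Average dG for each class of 2nd/3rd position bases."""
    groups = {}
    for target, dg in table.items():
        groups.setdefault(nn_class(target), []).append(dg)
    return {name: sum(vals) / len(vals) for name, vals in groups.items()}

if __name__ == "__main__":
    for name, mean_dg in mean_dg_by_class(DG_SOLVATION).items():
        print(name, round(mean_dg, 2))
```

Comparing the two group means gives a compact view of the trend described in the text, namely that GC-rich targets have markedly more negative desolvation energies than AT-rich ones.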

Figure 4 DNA deformation as a function of binding strength. Parameters used to evaluate conformational change in DNA, such as major groove width and helical tilt across the simulation trajectory, have been plotted for the weakest and strongest binders.


Outliers

5’ GAAGAAGAA 3’ has a less negative docking score (-114.69) than 5’ GTTGTTGTT 3’ (-117.87) and, based on docking alone, should have been the weakest binder. However, the RMSD graphs generated upon simulation show 5’ GAAGAAGAA 3’ to be more stable than 5’ GTTGTTGTT 3’, and the desolvation energy follows the same preference, confirming 5’ GTTGTTGTT 3’ as the weakest binder. The target 5’ GCCGCCGCC 3’ does not quite obey our theoretical assumptions regarding binding affinity and stability (Table 2), though it does follow the expected trends for indirect interactions such as desolvation and DNA deformation (Additional File 1: Figure S1D). This observation might imply a strong role for indirect factors in DNA-ZFP complexation.

Conclusion

The target DNA sequences that had strong binding affinity for Zif268 show higher stability, greater retention of hydrogen bonds, greater deformation of the respective DNA and higher solvent loss at the interaction interface. Conversely, the weak binders show lower stability, lower retention of hydrogen bonds, and less DNA deformation and desolvation. Binding affinity, stability, DNA deformation and desolvation are sequence dependent. These parameters favor the 2nd and 3rd base positions of the repeating triplet 5’ GN(2nd)N(3rd) 3’ being dominated by G, followed by C, A and T.

The role of water dynamics in the binding affinity of DNA-ZFP complexation has not yet been examined experimentally, and most of the tools that predict optimum ZFPs for a target DNA have overlooked it. These findings and the patterns they reveal could change the way ZFPs are selected for a target DNA and could improve the accuracy of many tools.

Additional material

Additional file 1: Figure S1: Interaction of Zif268 with the target DNA sequences. DNA deformation is shown using four parameters: major groove width, tilt, rise and stretch. Rise and stretch have been analyzed along with the major groove width and helical tilt. Stretch, rise and major groove width are translational helical parameters, where stretch is evaluated at the intra-bp level and rise at the inter-bp level, whereas tilt is a rotational helical parameter at the inter-bp level. A) 5’ GGG-GGG-GGG 3’ strongest binder: maximum distortion in major groove width and stretch followed by tilt and rise, B) 5’ GCC-GCC-GCC 3’ intermediate binder, C) 5’ GTT-GTT-GTT 3’ weakest binder, D) 5’ GAA-GAA-GAA 3’ weak binder. The region between the 10th and 20th bp of the target DNA sequences shows maximum variation for the strong binders, followed by the intermediate and weak binders, for tilt, stretch, rise and major groove width.

Competing interestsThe authors declare that they have no competing interests.

Authors’ contributionsSD and DS designed the methods and experimental setup. SD, YA and AMcarried out the implementation of the various methods. SD, YA, AM, JKDand DS wrote the manuscript. All authors have read and approved the finalmanuscript.

Acknowledgements
SD acknowledges the award of INSPIRE Scholarship from DST, Govt. of India. YA and AM were recipients of the Summer Undergraduate Research Award (SURA) from IIT Delhi. This study was made possible in part through the support of a grant from the Lady Tata Memorial Trust, Mumbai, DuPont Young Professor Award and the Department of Biotechnology (DBT) under the Bioscience Award Scheme to DS. Computations were performed at the Bioinformatics Centre at IIT Delhi, supported by the DBT, Govt. of India.

Declaration
Funding for open access charges: IIT Delhi (IRD/RP00713 to D.S.). This article has been published as part of BMC Genomics Volume 16 Supplement 12, 2015: Joint 26th Genome Informatics Workshop and 14th International Conference on Bioinformatics: Genomics. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcgenomics/supplements/16/S12.

Published: 9 December 2015

References1. Carr PA, Church GM: Genome engineering. Nat Biotechnol 2009,

27(12):1151-1162.2. Isalan M, Klug A, Choo Y: Comprehensive DNA recognition through

concerted interactions from adjacent zinc fingers. Biochemistry 1998,37(35):12026-12033.

3. Szymczyna BR, Arrowsmith CH: DNA-binding specificity studies of 4 ETSproteins supports an” indirect read-out” mechanism of protein-DNArecognition. Journal of Biological Chemistry 2000.

4. Gromiha MM, Siebers JG, Selvaraj S, Kono H, Sarai A: Intermolecular andintramolecular readout mechanisms in protein-DNA recognition. Journalof molecular biology 2004, 337(2):285-294.

5. Baldi P, Lathrop R: DNA structure, protein-DNA interactions, and DNA-protein expression. Altman, R et al 2001, 101-102.

6. Aeling KA, Steffen NR, Johnson M, Wesley Hatfield G, Lathrop RH,Senear DF: DNA deformation energy as an indirect recognitionmechanism in protein-DNA interactions. IEEE/ACM Transactions onComputational Biology and Bioinformatics (TCBB) 2007, 4(1):117-125.

7. Steffen NR, Murphy S, Tolleri L, Hatfield GW, Lathrop RH: DNA sequenceand structure: direct and indirect recognition in protein-DNA binding.Bioinformatics 2002, 18(suppl 1):S22-S30.

8. Hogan M, Austin R: Importance of DNA stiffness in protein-DNA bindingspecificity 1987.

9. Gromiha MM: Influence of DNA stiffness in protein-DNA recognition.Journal of biotechnology 2005, 117(2):137-145.

10. Harrington RE, WiNicov I: New concepts in protein-DNA recognition:sequence-directed DNA bending and flexibility. Progress in nucleic acidresearch and molecular biology 1994, 47:195-270.

11. Li W, Nordenskiöld L, Zhou R, Mu Y: Conformation-dependent DNAattraction. Nanoscale 2014, 6(12):7085-7092.

12. Steffen NR, Murphy SD, Lathrop RH, Opel ML, Tolleri L, Hatfield GW: Therole of DNA deformation energy at individual base steps for theidentification of DNA-protein binding sites. GENOME INFORMATICS SERIES2002, 153-162.

13. Eggers DK, Castellano BM, Dharmaraj S: Calorimetric Determination ofDesolvation Energy for a Model Binding Reaction in Dilute and CrowdedSolutions. Biophysical Journal 2013, 104:576.

14. Jayaram B, Jain T: The role of water in protein-DNA recognition. Annu RevBiophys Biomol Struct 2004, 33:343-361.

15. Jamieson AC, Kim SH, Wells JA: In vitro selection of zinc fingers withaltered DNA-binding specificity. Biochemistry 1994, 33(19):5689-5695.

16. Chandrasekaran R, Radha A, Park HS, Arnott S: Structure of the beta-formof poly d(A).poly d(U). Journal of biomolecular structure & dynamics 1989,6(6):1203-1215.

17. Chandrasekaran R, Wang M, He RG, Puigjaner LC, Byler MA, Millane RP,Arnott S: A re-examination of the crystal structure of A-DNA using fiber

Dutta et al. BMC Genomics 2015, 16(Suppl 12):S5http://www.biomedcentral.com/1471-2164/16/S12/S5

Page 8 of 9

Page 9: RESEARCH Open Access A theoretical investigation of DNA … · 2017-04-10 · RESEARCH Open Access A theoretical investigation of DNA dynamics and desolvation kinetics for zinc finger

diffraction data. Journal of biomolecular structure & dynamics 1989,6(6):1189-1202.

18. Lu XJ, Olson WK: 3DNA: a versatile, integrated software system for theanalysis, rebuilding and visualization of three-dimensional nucleic-acidstructures. Nat Protoc 2008, 3(7):1213-1227.

19. De Vries SJ, van Dijk M, Bonvin AM: The HADDOCK web server for data-driven biomolecular docking. Nature protocols 2010, 5(5):883-897.

20. Gotz AW, Williamson MJ, Xu D, Poole D, Le Grand S, Walker RC: RoutineMicrosecond Molecular Dynamics Simulations with AMBER on GPUs. 1.Generalized Born. Journal of chemical theory and computation 2012,8(5):1542-1555.

21. Hornak V, Abel R, Okur A, Strockbine B, Roitberg A, Simmerling C:Comparison of multiple Amber force fields and development ofimproved protein backbone parameters. Proteins 2006, 65(3):712-725.

22. Case DA, Cheatham TE, Darden T, Gohlke H, Luo R, Merz KM, Onufriev A,Simmerling C, Wang B, Woods RJ: The Amber biomolecular simulationprograms. Journal of computational chemistry 2005, 26(16):1668-1688.

23. Pérez A, Marchán I, Svozil D, Sponer J, Cheatham Iii TE, Laughton CA,Orozco M: Refinement of the AMBER Force Field for Nucleic Acids:Improving the Description of α/γ Conformers. Biophysical Journal 2007,92(11):3817-3829.

24. PANG Y-P, XU K, YAZAL JE, PRENDERGAST FG: Successful moleculardynamics simulation of the zinc-bound farnesyltransferase using thecationic dummy atom approach. PRS 2000, 9(10):1857-1865.

25. Bhattacharya D, Cheng J: 3Drefine: Consistent protein structurerefinement by optimizing hydrogen bonding network and atomic-levelenergy minimization. Proteins: Structure, Function, and Bioinformatics 2013,81(1):119-131.

26. Jorgensen WL, Chandrasekhar J, Madura JD, Impey RW, Klein ML:Comparison of simple potential functions for simulating liquid water.The Journal of chemical physics 1983, 79:926.

27. Dolinsky TJ, Czodrowski P, Li H, Nielsen JE, Jensen JH, Klebe G, Baker NA:PDB2PQR: expanding and upgrading automated preparation ofbiomolecular structures for molecular simulations. Nucleic acids research2007, 35(suppl 2):W522-W525.

28. Shivakumar D, Williams J, Wu Y, Damm W, Shelley J, Sherman W: Predictionof absolute solvation free energies using molecular dynamics freeenergy perturbation and the OPLS force field. Journal of chemical theoryand computation 2010, 6(5):1509-1519.

29. Jorgensen WL, Tirado-Rives J: The OPLS [optimized potentials for liquidsimulations] potential functions for proteins, energy minimizations forcrystals of cyclic peptides and crambin. Journal of the American ChemicalSociety 1988, 110(6):1657-1666.

30. Beerli RR, Segal DJ, Dreier B, Barbas CF: Toward controlling geneexpression at will: specific regulation of the erbB-2/HER-2 promoter byusing polydactyl zinc finger proteins constructed from modular buildingblocks. Proceedings of the National Academy of Sciences 1998,95(25):14628-14633.

doi:10.1186/1471-2164-16-S12-S5
Cite this article as: Dutta et al.: A theoretical investigation of DNA dynamics and desolvation kinetics for zinc finger protein Zif268. BMC Genomics 2015, 16(Suppl 12):S5.
