Roles Conceptualization, Writing – original draft, Writing – review & editing * E-mail: preeti@dbeb.iitd.ac.in, preetisrivastava@hotmail.com Affiliation Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology, New Delhi, India
Transcription of a gene can be regulated at many different levels. One such fundamental level is the interaction between protein and DNA. Proteins bind to distinct sites on the DNA and activate, enhance or repress transcription. Despite the importance of this process, very few tools exist to identify the proteins that interact with the chromosome, and most of them are in vitro in nature. Here, we propose an in vivo method for identification of DNA binding protein(s) in bacteria in which the DNA-protein complex formed in vivo is crosslinked by formaldehyde. The complex is then isolated and the bound proteins are identified. The method was used to isolate promoter DNA binding proteins that bind and regulate the dsz operon in Gordonia sp. IITR100, which is responsible for biodesulfurization of organosulfur compounds. The promoter binding proteins were identified by MALDI ToF MS/MS and the binding was confirmed by gel shift assay. Unlike other reported in vivo methods, this improved method does not require the sequence of the whole genome or a chip and can be scaled up to improve yields.
Citation: Murarka P, Srivastava P (2018) An improved method for the isolation and identification of unknown proteins that bind to known DNA sequences by affinity capture and mass spectrometry. PLoS ONE 13(8): e0202602. https://doi.org/10.1371/journal.pone.0202602
Editor: Sabato D'Auria, Consiglio Nazionale delle Ricerche, ITALY
Received: May 5, 2018; Accepted: August 5, 2018; Published: August 23, 2018
Copyright: © 2018 Murarka, Srivastava. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the paper and its Supporting Information file.
Funding: The work was supported by the Department of Biotechnology, Government of India Innovative Young Biotechnologist award (DBT-IYBA) grant to PS. PM would like to thank CSIR for financial support. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
DNA binding proteins play a crucial role in transcription, DNA replication, repair, recombination and various other cellular activities [1]. To completely understand these fundamental biological processes, it is important to know the proteins whose interplay leads to the complex control [2]. In the past, methods such as the electrophoretic mobility shift assay (EMSA) [3, 4] and pull-down assays [5] have been reported to identify DNA binding proteins. These methods mainly address in vitro DNA-protein interactions, wherein a purified protein or a crude protein extract is incubated with labeled DNA. DNA-binding proteins have specific or general affinity for DNA sequences. Some proteins involved in transcriptional regulation or other functions may bind weakly to the DNA, which makes their isolation and identification difficult. For example, in a gel shift assay the conditions need to be optimized to obtain optimal binding [6]. Similarly, in an in vitro pull-down assay the DNA fragment is immobilized and the crude extract is allowed to pass through. The binding conditions in vitro may therefore differ from the binding conditions in vivo, so not all DNA binding proteins can be isolated. Another widely used method for identifying DNA binding proteins is the chromatin immunoprecipitation (ChIP) assay [7, 8]. The ChIP assay is used for genome-wide profiling of DNA-binding proteins, histone modifications or nucleosomes. Besides cost and availability, ChIP has its own technical limitations. It requires antibodies specific to the protein of interest and therefore cannot be used when the proteins are not known [8]. Also, both ChIP and EMSA have low throughput, which makes these two techniques unsuitable for identification of unknown factors that bind to DNA [8].
A SILAC (stable isotope labeling by amino acids in cell culture) based method was described by Ong et al., 2002 [9] for proteome analysis and was further adapted by Mittler et al., 2008 [10] to study proteins that interact with DNA. It is an in vitro method in which cells are grown in the presence of labeled amino acids. The protein extract of these cells is passed through a column containing an immobilized DNA fragment. The DNA fragment is designed to contain a restriction enzyme site, which is used for elution of the proteins. The peptides generated after in-gel digestion of the eluted protein are identified by mass spectrometry in combination with isotope coded affinity tag (ICAT) technology. A major limitation of the SILAC based method is that it requires auxotrophy for the amino acids employed for labeling; therefore it cannot be used in all types of bacterial cultures [11]. Because the elution of proteins is based on restriction digestion, designing the DNA fragment is somewhat tricky. If a large complex is bound to the DNA, steric hindrance may make the restriction site inaccessible to the enzyme.
Butala et al. proposed another in vivo method for identification of proteins [12]. The method involves cloning the DNA of interest along with lac operators flanked by I-SceI endonuclease sites. LacI fused to a FLAG tag was used for isolating the DNA-protein complex. The method is very useful, but it requires cloning of the desired DNA in a special type of vector that carries I-SceI sites, and it also requires expression of the I-SceI meganuclease.
Dejardin and Kingston [13] proposed a method called Proteomics of Isolated Chromatin segments (PICh) for identifying the proteins associated with telomeres. The major limitation of PICh is that it cannot be used for DNA sequences that are present in single copy or in low copy number in the genome. Another drawback of PICh is that it requires a large amount of starting material (several hundred litres of culture).
There are only limited reports of methods that can be used to isolate the proteins that bind in vivo when only a short DNA sequence is known. One example of such a known short DNA fragment is the dsz promoter of desulfurizing organisms. The genes for biodesulfurization are present in the form of an operon under the control of this promoter. The sequence of this promoter has been identified, but the proteins that bind to it have not been reported.
In this study, we describe a method for isolation and identification of transcription factors that bind to DNA inside the cell. It involves cross-linking of the proteins to dsz promoter DNA with formaldehyde while the cells are in the log phase, followed by harvesting and sonication of the cells. The promoter DNA is then digested with exonuclease to generate overhangs, which anneal to a specific biotinylated primer attached to streptavidin beads. Bound proteins are eluted based on pH. The eluates from the column are run on an SDS-PAGE gel, and the observed protein bands are digested in-gel and analyzed by MS/MS for identification. By this method we were able to isolate and identify a transcriptosome complex consisting of 7 proteins whose interaction was evident from the interactome analysis.
Gordonia sp. IITR100 (MCC2877) [14] was cultured in minimal media. The composition of the minimal salt medium per litre was: Na2HPO4 (2.0 g), KH2PO4 (1.0 g), ammonium oxalate (4.25 g), MgCl2 (0.4 g) and sucrose (50 mM). The trace element composition for 1 L was: KI (0.05 g), LiCl (0.05 g), MnCl2.4H2O (0.8 g), H3BO3 (0.5 g), ZnCl2 (0.1 g), CoCl2.6H2O (0.1 g), NiCl2.6H2O (0.1 g), BaCl2 (0.05 g), (NH4)6Mo7O24.2H2O (0.05 g), SnCl2.2H2O (0.5 g), Al(OH)3 (0.1 g). All the chemicals were dissolved in double distilled water. The medium was supplemented with 3 mM sodium sulfate as the sulfur source. A single colony was used to inoculate 50 ml of medium, which was incubated at 30°C and 180 rpm for about four days, after which the OD600 was 1.
Escherichia coli (E. coli) DH5α cells were used for cloning, and BL21 (DE3) and BL21 (DE3) pLysS cells were used for expression studies. E. coli was inoculated in LB media and incubated at 37°C at 180 rpm; where required, kanamycin was used at a concentration of 50 μg/ml.
For crosslinking of DNA protein complex, formaldehyde was used. Three different concentrations (0.5, 0.75 and 1%) of formaldehyde were used. After incubating at 25°C for 10 min, formaldehyde was quenched by addition of Tris (final concentration 250 mM, pH 8) and incubated at RT (25°C) for 10 min. As a control experiment, in one set of culture, the cross-linking was reversed by addition of Rapigest, 0.5 M β-mercaptoethanol and Tris (250 mM, pH 8.8), incubated at 99°C for 25 min. Four different concentrations of Rapigest 0.5, 0.75, 1 and 2% were used.
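The reagent volumes behind these final concentrations follow from C1V1 = C2V2. The sketch below is only an illustration: the 37% formaldehyde stock, the 2 M Tris stock and the 50 ml sample volume are assumptions that the text does not state for this step, and the function name is hypothetical.

```python
def volume_to_add_ml(final_conc, stock_conc, sample_ml):
    """Volume of stock to add to sample_ml so that the mixture reaches final_conc.

    Solves stock_conc * v = final_conc * (sample_ml + v) for v.
    Both concentrations must be in the same unit (for example % or mM).
    """
    if not 0 <= final_conc < stock_conc:
        raise ValueError("final concentration must be below the stock concentration")
    return final_conc * sample_ml / (stock_conc - final_conc)


CULTURE_ML = 50.0               # assumed sample volume
FORMALDEHYDE_STOCK_PCT = 37.0   # assumed stock concentration
TRIS_STOCK_MM = 2000.0          # assumed stock concentration

# The three formaldehyde concentrations tested in the text.
for pct in (0.5, 0.75, 1.0):
    v = volume_to_add_ml(pct, FORMALDEHYDE_STOCK_PCT, CULTURE_ML)
    print(f"{pct}% formaldehyde: add {v:.2f} ml of stock")

# Quench to 250 mM Tris (ignores the small volume of formaldehyde already added).
tris_ml = volume_to_add_ml(250.0, TRIS_STOCK_MM, CULTURE_ML)
print(f"250 mM Tris: add {tris_ml:.2f} ml of stock")
```

With different stock concentrations or culture volumes, only the three constants need to change.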
Cells from both cultures (cross-linked and cross-link reversed) were harvested at 3,500 rpm for 10 min at 4°C. The supernatant was discarded and the pellet was resuspended in 1 ml sonication buffer, which consisted of 20 mM Tris-Cl, 150 mM NaCl, 0.1 mM PMSF, 1 mM DTT, 0.1 mM EDTA and 10% glycerol, pH 7.5. Sonication was performed on a Q125 sonicator (Qsonica sonicators, USA) with 20 sec bursts and 30 sec rests for 5 min at an amplitude of 30%. After sonication, the cellular debris was removed by centrifugation at 12,000 rpm for 20 min at 4°C. The supernatant was collected in a fresh eppendorf tube and used for further experiments. The remainder was stored at -20°C.
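Two quantities in this step are easier to reproduce on other equipment when made explicit: the relative centrifugal force behind an rpm value, and the actual sonication time within the pulse program. The sketch below uses the standard RCF formula. The 8 cm rotor radius is an assumption (the rotor is not specified), as is the reading that 5 min is the total elapsed time of the pulse program.

```python
def rcf_from_rpm(rpm, rotor_radius_cm):
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    return 1.118e-5 * rotor_radius_cm * rpm ** 2


def sonication_on_time_s(total_s, burst_s, rest_s):
    """Seconds of actual sonication when total_s is the elapsed program time."""
    full_cycles, remainder = divmod(total_s, burst_s + rest_s)
    # A partial cycle at the end contributes at most one burst.
    return full_cycles * burst_s + min(remainder, burst_s)


ASSUMED_RADIUS_CM = 8.0  # hypothetical rotor radius

for rpm in (3500, 12000):  # speeds given in the text
    print(f"{rpm} rpm is about {rcf_from_rpm(rpm, ASSUMED_RADIUS_CM):.0f} x g")

# 20 sec bursts, 30 sec rests, 5 min in total
print(sonication_on_time_s(300, 20, 30), "s of sonication")
```

If the 5 min instead refers to sonication time alone, the elapsed time is longer and the second function does not apply.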
The supernatant (88 μl) prepared above was digested with T5 exonuclease. T5 exonuclease (2 μl, 0.2 unit/μl) was added to the supernatant and incubated at 37°C for 45 min. The enzyme was stopped by adding EDTA (final concentration 10 mM). A control experiment was set up without T5 exonuclease treatment of the supernatant. The resulting mixture of about 100 μl DNA-protein complex was mixed with streptavidin solution containing the biotinylated oligonucleotide and incubated for 30 min on a rotating wheel in the cold room. To attach the biotinylated oligonucleotide to the streptavidin beads beforehand, 5’ biotin labeled PS19 primer (5 μM) ( 5’ CAGTCATATGCGCGTATGTGTCCTCTAACCGTAAATAGCG 3’ ) was added to 200 μl streptavidin solution in a microfuge tube and incubated on a rotating wheel for 10 min at room temperature. The primer is 40 nucleotides long and specific to the promoter of interest. As a control, beads without bound oligonucleotide were used.
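Basic properties of the capture oligonucleotide can be checked directly from the sequence given above. This is a minimal sketch; the helper names are illustrative and not part of the published protocol.

```python
PS19 = "CAGTCATATGCGCGTATGTGTCCTCTAACCGTAAATAGCG"  # sequence as printed in the text

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


print("length:", len(PS19), "nt")
print(f"GC content: {gc_fraction(PS19):.1%}")
# The strand that this oligonucleotide can base-pair with:
print("target strand (5'->3'):", reverse_complement(PS19))
```

The reverse complement is the single-stranded sequence that has to be exposed on the promoter fragment for the oligonucleotide to capture it.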
The mixture was centrifuged at 12,000 rpm for 2 min. at 4°C and the supernatant was collected. The supernatant should contain unbound proteins. The DNA protein complex bound to streptavidin beads was washed with 200 μl binding buffer three to four times by repeated incubations in a rotating wheel for 5 min. each in the cold room. The binding buffer consisted of 12% glycerol, 12 mM HEPES (pH 7.9), 4 mM Tris-Cl (pH 7.9), 60 mM potassium chloride, 1 mM EDTA, 1 mM DTT. The bound proteins were eluted by elution buffer and four fractions each of 200 μl were collected. The elution buffer consisted of 12% glycerol, 20 mM Tris-Cl (pH 5.8), 1 M potassium chloride, 5 mM magnesium chloride, 1 mM EDTA, 1 mM DTT, 20 μg/ml BSA.
The eluates obtained in pull down assay were analyzed on 12% SDS-PAGE gel with 5% stacking gel [15]. The gel was silver stained and the different bands obtained in the elution lane of the gel were cut and in-gel digested with trypsin following the protocol described by Bruker Daltonics, Bremen, Germany adapted from Shevchenko et al., 1996 [16]. The digested peptides were dried in speed vac and the samples were analyzed by MS/MS on ABI Sciex 5800 TOF/TOF system, USA.
The parameters set for the MS/MS analysis were as follows: Fixed modifications: Carbamidomethyl (C), Variable modifications: Deamidated (NQ), Oxidation (M), Peptide Mass Tolerance: ± 150 ppm.
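A tolerance in ppm scales with the mass, so the accepted window widens for heavier peptides. The sketch below shows what ± 150 ppm means in absolute terms; the m/z values are arbitrary examples, not data from this study.

```python
def ppm_window(mz, tol_ppm=150.0):
    """Lower and upper m/z bounds for a tolerance given in ppm."""
    delta = mz * tol_ppm / 1e6
    return mz - delta, mz + delta


def within_tolerance(observed_mz, theoretical_mz, tol_ppm=150.0):
    """True if the observed value lies within tol_ppm of the theoretical value."""
    error_ppm = abs(observed_mz - theoretical_mz) / theoretical_mz * 1e6
    return error_ppm <= tol_ppm


for mz in (1000.0, 2000.0, 3000.0):  # arbitrary example masses
    low, high = ppm_window(mz)
    print(f"m/z {mz:.1f}: accepted range {low:.3f} to {high:.3f}")
```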
To determine the specificity of the method, a biotinylated primer for a non-specific DNA, the kanamycin promoter, was used. The remaining procedure was identical to that performed with the primer for the dsz promoter. The elution fractions were analyzed by SDS-PAGE and the proteins were identified by MALDI-ToF MS/MS analysis.
The methods followed for cloning were according to protocol mentioned in Sambrook and Russell [17]. The gene for XRE (Xenobiotic response element) family transcription regulator was cloned between the restriction sites NdeI and HindIII in the vector pET29a. This recombinant plasmid DNA was transformed to expression strain for studying overexpression of XRE. The restriction enzymes, T4 DNA ligase, Taq DNA polymerases were purchased from New England Biolabs, USA. IPTG and other chemicals used were of molecular grade.
The gene encoding the XRE-like protein was amplified using the primer pair 5’ CTAGCATATGGTGAGCGAGCAACGCCGAATCGGGTACCAC 3’ (XRE-F) and 5’ CTAGAAGCTTCGTGGTCGTGCCATCGTCACCATCGCTGCG 3’ (XRE-R). The amplified PCR product was digested with NdeI and HindIII and cloned in the pET29a vector between the NdeI and HindIII sites. The resulting vector was named pPM4. The recombinant plasmid was transformed into the expression strains BL21 (DE3) and BL21 (DE3) pLysS for overexpression of the desired protein. A single colony of the transformed cells was inoculated in LB media containing kanamycin (50 μg/ml). The cells were induced with IPTG (final concentration 1 mM) at an OD600 of ~0.5, and samples were collected at different time intervals (5 hrs and overnight after induction) and analyzed on a 12% SDS PAGE gel to check the expression.
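The cloning strategy relies on each primer carrying its restriction site near the 5’ end. A small check of the two primer sequences printed above, using the standard NdeI (CATATG) and HindIII (AAGCTT) recognition sequences, is sketched below; the function name is illustrative.

```python
XRE_F = "CTAGCATATGGTGAGCGAGCAACGCCGAATCGGGTACCAC"  # as printed in the text
XRE_R = "CTAGAAGCTTCGTGGTCGTGCCATCGTCACCATCGCTGCG"  # as printed in the text

SITES = {"NdeI": "CATATG", "HindIII": "AAGCTT"}


def find_sites(seq, sites=SITES):
    """Map each enzyme name to the 0-based positions of its site in seq."""
    hits = {}
    for name, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            start = seq.find(motif, start + 1)  # allow overlapping matches
        hits[name] = positions
    return hits


for label, primer in (("XRE-F", XRE_F), ("XRE-R", XRE_R)):
    print(label, len(primer), "nt", find_sites(primer))
```

In both primers the site follows a four-nucleotide 5’ extension, which gives the enzyme some flanking sequence when cutting the PCR product.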
To validate the binding of XRE protein to the dsz promoter, the protein was partially purified and EMSA was performed. For purification, a single colony of BL21 (DE3) pLysS cells harboring the plasmid containing the gene encoding the XRE protein was inoculated in 200 ml of LB media and induced for overexpression with 1 mM IPTG. After 5 hrs of induction, the cells were harvested by centrifugation at 3500 rpm for 10 min. The cells were resuspended in 4 ml denaturing buffer A (100 mM Na2PO4, 10 mM Tris, 8 M urea, pH 8) and sonicated on a Q125 sonicator (Qsonica sonicators, USA) with 30 sec bursts and 20 sec rests for 5 min at an amplitude of 25%. The cell debris was removed by centrifugation at 10,000 rpm for 15 min and the supernatant was collected in a fresh tube. The supernatant was added to the Ni-NTA column equilibrated with buffer A. The suspension was incubated at 4°C on a rotating wheel for 2 hrs to allow the protein to bind to the Ni-NTA agarose resin. The flow-through was collected and the column was washed with buffer B (100 mM Na2PO4, 10 mM Tris, 8 M urea, pH 6.5). The proteins bound to the column were eluted with elution buffers C and D (100 mM Na2PO4, 10 mM Tris, 8 M urea, pH 5.5 and 100 mM Na2PO4, 10 mM Tris, 8 M urea, pH 4.5, respectively). All the collected fractions were run on a 12% SDS PAGE gel to identify the fraction containing the protein of interest. The elution fraction obtained with buffer C, which contained the XRE protein, was renatured by adding the eluate (5 ml) dropwise to 50 ml renaturation buffer (10% glycerol, 25 mM Tris-Cl and 150 mM sodium chloride, pH 8) in the cold room with continuous stirring on a magnetic stirrer. The renatured protein solution was then dialyzed overnight against the renaturation buffer with 2–3 changes to remove traces of urea. The renatured sample was run on a 12% SDS PAGE gel to check the purity of the protein.
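The dropwise dilution already lowers the urea concentration substantially before dialysis. A short calculation with the volumes given above (5 ml of eluate in 8 M urea added to 50 ml of urea-free buffer) illustrates this; it assumes ideal mixing and the function name is illustrative.

```python
def diluted_conc(stock_conc, stock_ml, diluent_ml):
    """Concentration after mixing stock_ml of stock with diluent_ml of diluent."""
    return stock_conc * stock_ml / (stock_ml + diluent_ml)


# Volumes and urea concentration as given in the text.
urea_after_dilution = diluted_conc(8.0, 5.0, 50.0)
fold_dilution = (5.0 + 50.0) / 5.0

print(f"urea after dilution: {urea_after_dilution:.2f} M ({fold_dilution:.0f}-fold dilution)")
```

The overnight dialysis with buffer changes then removes the remaining urea.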
EMSA was performed to determine the binding of XRE to the dsz promoter. For this purpose, the dsz promoter was Cy5 labeled by PCR amplification using the Cy5 PS19 primer ( 5’ CAGTCATATGCGCGTATGTGTCCTCTAACCGTAAATAGCG 3’ ). A reaction mixture consisting of Tris binding buffer (20 mM Tris Cl, 5 mM MgCl2, 0.1 mM EDTA, 6% sucrose, 100 mM KCl and 1 mM DTT, pH 7.5), poly d(I-C), poly L-Lysine, 12 ng of labeled promoter and increasing amounts of partially purified protein (5–20 μg) was incubated for 4 hrs at 4°C and then run on a 6% native TBE gel for 2 hrs at 100 V. The gel was then scanned in a fluorimager (GE typhoon 9000) to see the shift. Two types of controls were used. The first contained all the components of the reaction mixture mentioned above except the protein. The second consisted of Tris binding buffer, poly d(I-C), poly L-Lysine, 12 ng of Cy5 labeled dsz promoter and 20 μg of renatured washing eluate from the purification, which contains the other non-specific proteins present in the partially purified eluate but only negligible amounts (below the detection limit) of the protein of interest. The control reactions were incubated, run and scanned under the same conditions.
Mimicking in vivo conditions outside the cellular environment is complicated, which makes it difficult for all the DNA binding proteins to bind to their respective DNA. Here, we propose an in vivo method for the isolation of DNA binding proteins. Our improved method is based on the method originally proposed by Wang, 2009 [18] and modified by Wu et al. [19]. It differs in the following aspects. 1) We propose to use T5 exonuclease instead of exonuclease III. T5 exonuclease digests the DNA in the 5’-3’ direction whereas exonuclease III digests in the 3’-5’ direction. In any promoter DNA, regulatory elements bind to either the 3’ end or the upstream region or both. The extra sequences at the 5’ end can be used for the isolation of all the promoter DNA binding proteins, so that the user is unlikely to miss any protein [20]. 2) A biotinylated oligo is used for pairing with the DNA-bound protein complex. 3) After elution, the eluted proteins are loaded on an SDS-PAGE gel, followed by in-gel digestion and analysis by MALDI-ToF MS/MS [21], [22]. 4) The proposed method works well for bacterial cells, and the DNA-protein complex is formed in vivo, unlike the method described by Wu et al., 2011, where a PCR-amplified DNA is used for DNA-protein complex formation. 5) A microarray is not required; the proposed method is therefore cost effective and does not require knowledge of the whole genome sequence. Also, a sufficient amount of protein can be obtained by packing an avidin agarose column, binding the DNA-protein complex and then eluting the bound proteins. The method is schematically described in Fig 1.
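The consequence of the exonuclease direction for oligonucleotide capture can be illustrated with a deliberately simplified model. The sketch below treats a linear duplex as a string, removes a fixed number of nucleotides from each strand in the chosen direction, and asks whether a capture oligonucleotide can base-pair with one of the exposed single strands. The toy sequence, the resection length and the function names are all assumptions made for illustration; the model ignores bound proteins, enzyme kinetics and end-structure preferences.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def exposed_strands(top, n, direction):
    """Single-stranded regions (5'->3') after n nt are removed from each strand.

    top is the top strand 5'->3'; the bottom strand is its reverse complement.
    "5to3" models a T5-like exonuclease, "3to5" an exonuclease III-like one.
    """
    if n < 1 or 2 * n > len(top):
        raise ValueError("n must be at least 1 and at most half the duplex length")
    left_top, right_top = top[:n], top[-n:]
    if direction == "5to3":
        # Each strand loses its 5' end, so the 3' ends remain as overhangs.
        return [reverse_complement(left_top), right_top]
    if direction == "3to5":
        # Each strand loses its 3' end, so the 5' ends remain as overhangs.
        return [left_top, reverse_complement(right_top)]
    raise ValueError("direction must be '5to3' or '3to5'")


def oligo_can_anneal(top, oligo, n, direction):
    """True if the oligo is fully complementary to part of an exposed strand."""
    target = reverse_complement(oligo)
    return any(target in strand for strand in exposed_strands(top, n, direction))


TOY_TOP = "ATGCGTACCGGATTACGCTAGGCTTAACGG"  # toy duplex, not the dsz promoter
CAPTURE_OLIGO = TOY_TOP[:10]                # oligo matching the left end of the top strand

for direction in ("5to3", "3to5"):
    print(direction, oligo_can_anneal(TOY_TOP, CAPTURE_OLIGO, 12, direction))
```

In this model, an oligonucleotide with the sequence of the top-strand 5’ end finds a complementary single strand only after 5’-3’ digestion; after 3’-5’ digestion the complementary oligonucleotide would be needed instead.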