Protein Active Site

Added: May 16, 2009
Presentation Category: Science & Technology. All Rights Reserved.
Presentation Transcript

Prediction of Catalytic Residues through ANN
Brijesh Singh Yadav (Senior Research Associate)
Sweta Gupta (Research Associate)
United Research Center, UIT campus, Allahabad
E-mail: brijeshbioinfo@gmail.com


AIM
Develop a new method that helps identify the surface chemistry of active site residues in a protein. The method supports ligand design, molecular docking, de novo drug design, and the structural identification and comparison of functional sites.


Introduction
Computer-aided drug design is of two types:
• Ligand design
• Active site drug design


About Neural Network
A neural network is a set of connected input/output units in which each connection has an associated weight.
Applications of neural networks include:
• speech recognition
• medical diagnosis
• image compression
• financial prediction


About Neural Network
Neural network methods can be used for classification, clustering, modelling, and prediction of biological data.
Neural networks can learn by two methods:
• Supervised learning
• Unsupervised learning


WORKING OF NEURAL NETWORK
[Diagram: the inputs x0, x1, ..., xn are multiplied by the weights w0j, w1j, ..., wnj and added to the bias θj to form a weighted sum (Σ), which is passed through the activation function f to produce the output.]
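
The single unit in the diagram can be sketched in a few lines of Python. This is only an illustration: the slide does not name the activation function, so a sigmoid is assumed here, and the input, weight, and bias values are hypothetical.

```python
import math

def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through an activation function."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Assumption: sigmoid activation (the slide only labels it "f").
    return 1.0 / (1.0 + math.exp(-weighted_sum))

# Hypothetical values chosen so that the weighted sum is zero.
example_output = neuron_output([1.0, 0.0, 1.0], [0.5, -0.3, 0.2], bias=-0.7)
```

With a weighted sum of zero, the sigmoid returns its midpoint value of 0.5; larger sums push the output toward 1 and smaller sums toward 0.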


How Does the Neural Network Learn
The network receives a training example and calculates the output using its existing weights. Backpropagation then calculates the error by taking the difference between the calculated result and the expected (actual) result. The error is fed back through the network, and the weights are adjusted to minimize it.
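
The learning loop described above can be sketched for a single sigmoid unit. This is a simplified illustration, not the authors' program: full backpropagation feeds the same kind of error term back through hidden layers as well. The sigmoid activation and the example input and target are assumptions; the learning rate of 0.05 and the 100 epochs match the values reported later in the slides.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_step(weights, bias, inputs, target, learning_rate=0.05):
    """One forward pass followed by one weight update for a single sigmoid unit."""
    output = sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
    error = target - output                    # expected minus calculated result
    delta = error * output * (1.0 - output)    # error scaled by the sigmoid slope
    new_weights = [w + learning_rate * delta * x for w, x in zip(weights, inputs)]
    new_bias = bias + learning_rate * delta
    return new_weights, new_bias, error

# Hypothetical training example: three inputs, expected output 1.
weights, bias = [0.0, 0.0, 0.0], 0.0
inputs, target = [1.0, 0.0, 1.0], 1.0
errors = []
for _ in range(100):
    weights, bias, error = train_step(weights, bias, inputs, target)
    errors.append(abs(error))
```

Each pass nudges the weights in the direction that reduces the error, so the entries of `errors` shrink as training proceeds.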


About Active Site of Proteins
Proteins are polymers of amino acids linked by peptide bonds.
The region of a protein that interacts with a ligand is generally referred to as the "active site." Ligands can be proteins, DNA, or smaller molecules such as pharmaceutical compounds.
The active site generally lies on the surface of the protein; in some cases it is buried within the protein.
Residues with reactive groups (Asp, Glu, Ser, Cys, His, Lys, Arg) tend to be abundant in protein active sites.
The Ser-His-Asp (sometimes Ser-His-Glu) "catalytic triad" is a motif commonly found in enzyme active sites.


Levels of Protein Property
Residue properties that are responsible for function and structure:
• Polar or nonpolar
• Aromatic or aliphatic
• Acidic or basic
• Charged (either positive or negative) or uncharged
• Sulfur-containing
• Hydrogen bonding
• Essential or nonessential
• Cyclic
• Achiral


Why predict protein function and structure?
• A protein's structural properties help to identify various anomalies and diseases and to rectify them at the genetic level.
• Prediction identifies the surface chemistry of ligand-binding-site residues in a protein.
• It helps in ligand design, molecular docking, de novo drug design, and the structural identification and comparison of functional sites.


Early methods
• Statistical methods
• Homology modeling
• Physicochemical methods
• Evolutionary conservation
• Sequence pattern methods


Approach to structure prediction
• The input is encoded in binary form using 15 properties for each residue of a catalytic or non-catalytic triad (3 × 15 = 45 bits per triad, for 122 examples: 70 active-site and 52 non-active-site).
• A network training program is created using the backpropagation method, with the input, output, hidden layer, learning rate, and number of epochs fixed within the code.
• The output is one of two possible states: active (1 0) or non-active (0 1).


Network Design
[Figure: network design diagram]


Methodology used
1. Collect protein structures containing active site residues from the PDB.
2. Identify active site residues using Ligplot.
3. Identify non-active-site residues using Surface Racer.
4. Map the protein residues to binary digits according to their 15 properties.
5. Create a neural network (a computer program).
6. "Train" it using proteins whose active-site and non-active-site residue properties are known.
7. Test the network with unknown protein residues.




PDB data selected
We selected about 100 proteins from the PDB. Some examples are shown below.


Protein-ligand interaction shown by Ligplot
[Figure: Ligplot diagram of a protein-ligand interaction]


Some proteins with their active site residues
Protein   Active site residues
1a4k      Glu81A, Asn35B
1a5g      Asp102, His57, Ser195, Gly216
1a42      His199, Gln137, Thr199, Glu205
1a46      Gly216, Lys375
1a50      His86, Ser227, Asn236, Gly230
1a94      Ala6c, Asp29A


Amino Acid Encoding Scheme
Active site residues (output 1 0)
PDB file   Residues        Encoding
1dih.pdb   Arg, Gly, Thr   000001010010000, 000010001010001, 100000100101000
1ecv.pdb   Arg, Gln, His   000001010010000, 100010100010000, 000001010010000
1fkb.pdb   Asp, Glu, Ile   000010001010000, 000010001010000, 011000000100000
Non-active-site residues (output 0 1)
PDB file   Residues        Encoding
9rub.pdb   Arg, Glu, Phe   000001010010000, 000010001010000, 010100000100000
8cpa.pdb   Arg, Asn, Lys   000001010010000, 100000100010000, 000001010100000
8atc.pdb   Ala, Asn, Glu   011000000010000, 100000100010000, 000010001010000
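
As a rough illustration of this scheme, the Python sketch below concatenates the 15-bit codes of three residues into one 45-bit input vector and pairs it with a two-value output label. The three codes are copied from the 1dih.pdb row; the names `RESIDUE_CODES` and `encode_triad` are hypothetical, and the meaning of each bit position is not given in the slides.

```python
# 15-bit property codes copied from the 1dih.pdb row of the slide.
RESIDUE_CODES = {
    "Arg": "000001010010000",
    "Gly": "000010001010001",
    "Thr": "100000100101000",
}

# Output labels as given on the slide.
ACTIVE = [1, 0]
NON_ACTIVE = [0, 1]

def encode_triad(residues):
    """Concatenate the 15-bit codes of three residues into a 45-element 0/1 list."""
    bits = []
    for name in residues:
        bits.extend(int(ch) for ch in RESIDUE_CODES[name])
    return bits

triad_input = encode_triad(["Arg", "Gly", "Thr"])  # 1dih.pdb, an active-site example
triad_label = ACTIVE
```

Each triad therefore contributes one column of 45 binary inputs, which matches the 3 × 15 = 45 count stated in the approach.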


Training and Testing the Network
A program for training the neural network was written as a Matlab function. The program uses the backpropagation method and contains variables such as the training data, training output, all nodes, epochs, learning rate, and error.
A typical architecture is a fully connected network (122 inputs, 5 hidden layer, 2 outputs).
We trained the network with different values of the learning rate and hidden layer, and stopped the training when the minimum error was obtained.
A separate Matlab program was written to test the results.
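
A training program of this kind might look like the following sketch. It is written in Python rather than Matlab and is not the authors' code. Several points are assumptions: sigmoid activations, per-example weight updates, random initial weights in [-0.5, 0.5], and a reading of the architecture as 45 inputs per example (3 × 15 bits) with 5 hidden units, since the "122" on the slide appears to match the number of training examples. The six training rows are the examples from the encoding slide; the learning rate and epoch count are the values reported in the results.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_layer(n_in, n_out, rng):
    # One weight row per unit; the last entry of each row is the bias.
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]

def layer_forward(layer, inputs):
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + row[-1]) for row in layer]

def forward(hidden, output, x):
    h = layer_forward(hidden, x)
    return h, layer_forward(output, h)

def mean_squared_error(hidden, output, data):
    total = 0.0
    for x, target in data:
        _, y = forward(hidden, output, x)
        total += sum((t - o) ** 2 for t, o in zip(target, y))
    return total / len(data)

def train(data, n_hidden=5, epochs=100, learning_rate=0.05, seed=0):
    rng = random.Random(seed)
    n_in, n_out = len(data[0][0]), len(data[0][1])
    hidden = init_layer(n_in, n_hidden, rng)
    output = init_layer(n_hidden, n_out, rng)
    loss_start = mean_squared_error(hidden, output, data)
    for _ in range(epochs):
        for x, target in data:
            h, y = forward(hidden, output, x)
            # Output deltas: error times the sigmoid slope.
            d_out = [(t - o) * o * (1 - o) for t, o in zip(target, y)]
            # Hidden deltas: output deltas fed back through the output weights.
            d_hid = [hj * (1 - hj) * sum(d_out[k] * output[k][j] for k in range(n_out))
                     for j, hj in enumerate(h)]
            for k in range(n_out):
                for j in range(n_hidden):
                    output[k][j] += learning_rate * d_out[k] * h[j]
                output[k][-1] += learning_rate * d_out[k]
            for j in range(n_hidden):
                for i in range(n_in):
                    hidden[j][i] += learning_rate * d_hid[j] * x[i]
                hidden[j][-1] += learning_rate * d_hid[j]
    return hidden, output, loss_start, mean_squared_error(hidden, output, data)

def bits(text):
    return [int(ch) for ch in text if ch in "01"]

ACTIVE, NON_ACTIVE = [1, 0], [0, 1]
data = [
    (bits("000001010010000 000010001010001 100000100101000"), ACTIVE),      # 1dih
    (bits("000001010010000 100010100010000 000001010010000"), ACTIVE),      # 1ecv
    (bits("000010001010000 000010001010000 011000000100000"), ACTIVE),      # 1fkb
    (bits("000001010010000 000010001010000 010100000100000"), NON_ACTIVE),  # 9rub
    (bits("000001010010000 100000100010000 000001010100000"), NON_ACTIVE),  # 8cpa
    (bits("011000000010000 100000100010000 000010001010000"), NON_ACTIVE),  # 8atc
]
hidden, output, loss_before, loss_after = train(data)
```

Comparing `loss_before` with `loss_after` shows whether the chosen learning rate and hidden layer size are reducing the error, which is the stopping criterion the slide describes. Six examples are far too few to say anything about prediction quality.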


Results and Data
Performance on training set: (116/122) × 100 = 95.0820% correct prediction
Performance on testing set: (38/40) × 100 = 95.00% correct prediction
Total number of epochs: 100
Learning rate: 0.05
False positives: 2 out of 122
False negatives: 3 out of 122


Performance Measurement
p = Number of correctly classified catalytic residues.
n = Number of correctly classified non-catalytic residues.
o = Number of non-catalytic residues incorrectly predicted to be catalytic (over-predictions).
u = Number of catalytic residues incorrectly predicted to be non-catalytic (under-predictions).
t = Total residues (p + n + o + u).
The overall accuracy (Q Total) is given by the equation
Q Total = ((p + n) / t) × 100
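
The measure can be written as a small Python function. The split of counts in the example is hypothetical: the slides report only that 38 of 40 test residues were predicted correctly, not how those divide into p, n, o, and u.

```python
def q_total(p, n, o, u):
    """Percentage of residues classified correctly: (p + n) / t * 100."""
    t = p + n + o + u
    return (p + n) / t * 100

# Hypothetical split that adds up to 38 correct out of 40 residues.
test_score = q_total(p=20, n=18, o=1, u=1)
```

Any split with p + n = 38 and t = 40 gives the same result of about 95%, because Q Total depends only on the number of correct predictions and the total.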


Discussion and Conclusion
The neural network architecture developed here predicts the active site structure of proteins with a performance of almost 95%, which is higher than the performance reported so far.
Analysis of the optimal subset selected from the initial 15 residue properties indicates that the algorithm learns to distinguish catalytic from non-catalytic residues based on the structural and functional properties of protein residues.
This method helps in ligand design, molecular docking, de novo drug design, and the structural identification and comparison of functional sites.




Acknowledgement
I would like to express my sincere thanks to
• Dr. (Smt.) Navita Shrivastava, Head, Dept. of Computer Science, A.P.S. University, Rewa (MP)
• Mr. Pritish Kumar Varadwaj, Lecturer, Indian Institute of Information Technology, Allahabad (UP)
• Mr. Rajeev Prithyani, Lecturer, Dept. of Computer Science, A.P.S. University, Rewa (MP)
• Mr. Sandeep Kushwaha, Lecturer, Dept. of Computer Science, A.P.S. University, Rewa (MP)
for their kind supervision and keen interest during preparation of this project.