※ Documentation:


Frequently Asked Questions:

1. Q: For your example sequence "P53667", iGPS predicts several results, including:
(1) S179 AGC/AKT SHGKRGLSVSIDPPH 2.011 1.21 String
(2) S298 AGC/AKT KPVLRSCSIDRSPGA 1.64 1.21 String
(3) S323 AGC/PKC KDLGRSESLRVVCRP 3.588 1.98 String
    How should these results be interpreted? For example, since the scores rank (3) > (1) > (2), does the probability of being a correct prediction also rank (3) > (1) > (2)?

A: This is the question users ask most often. Two principles apply when interpreting the results. First, within the same protein kinase (PK) group, a higher score means a higher probability that the phosphorylation site is modified by that PK. Thus, since the score of (1) is greater than that of (2) for AGC/AKT, (1) is more likely to be a real hit than (2). Second, scores from different PK groups cannot be compared, because each group was trained with different data sets and procedures, and different thresholds were chosen. Comparing the score of (3) with those of (1) and (2) is therefore meaningless.
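
As an illustration only, the two principles can be written as a short Python sketch: predictions are grouped by PK group and ranked only within each group. The rows are the three examples from the question. The field names, and the reading of the two numeric columns as score and cutoff, are assumptions made for this sketch and are not taken from the iGPS manual.

```python
from collections import defaultdict

# Example rows from question 1. The meaning assigned to each field
# (position, kinase group, peptide, score, cutoff, PPI source) is an assumption.
predictions = [
    ("S179", "AGC/AKT", "SHGKRGLSVSIDPPH", 2.011, 1.21, "String"),
    ("S298", "AGC/AKT", "KPVLRSCSIDRSPGA", 1.64, 1.21, "String"),
    ("S323", "AGC/PKC", "KDLGRSESLRVVCRP", 3.588, 1.98, "String"),
]


def rank_within_groups(rows):
    """Group predictions by kinase group and sort each group by score, highest first."""
    groups = defaultdict(list)
    for position, kinase, peptide, score, cutoff, source in rows:
        groups[kinase].append((position, score))
    return {
        kinase: sorted(sites, key=lambda site: site[1], reverse=True)
        for kinase, sites in groups.items()
    }


ranked = rank_within_groups(predictions)
for kinase, sites in ranked.items():
    # Scores are only compared inside one kinase group, never across groups.
    print(kinase, sites)
```

Here S179 ranks above S298 within AGC/AKT, while S323 is ranked only against other AGC/PKC predictions.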

 

2. Q: How do I use the iGPS software?

A: The latest version of iGPS is available at http://igps.biocuckoo.org/down.php. Download and install it on your computer. iGPS is currently implemented in Java and can be installed on Windows, Linux/Unix, and Mac OS. A user manual is included in the installation package.

 

3. Q: You previously developed GPS (Group-based Prediction System), a kinase-specific predictor. Why not use GPS directly to predict potential site-specific kinase-substrate relations (ssKSRs)?

A: The basic hypothesis behind GPS and similar tools is that short motifs around phosphorylation sites provide ENOUGH specificity for kinase recognition. This is largely correct for predicting in vitro KSRs, but far from sufficient for in vivo KSRs. A number of contextual factors, such as kinase-substrate physical interaction, co-localization, co-expression, and scaffolds, provide additional specificity for in vivo kinase recognition. For this reason, in one of our protocols we emphasized that users MUST select a potential kinase (usually by blind guess) before running the GPS software. In this work, by contrast, a major contextual filter, the physical interaction between kinases and their targets, was taken into account. The aim of this study is to provide a useful tool for predicting in vivo KSRs.
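
The idea of a contextual filter can be illustrated with a minimal Python sketch: motif-based candidates are kept only when the kinase and the substrate are also known to interact. This is a simplified illustration under stated assumptions, not the iGPS implementation; all identifiers, scores, and the function name are hypothetical.

```python
# Hypothetical motif-based candidates: (kinase, substrate, site, score).
candidates = [
    ("KIN_A", "SUB_1", "S10", 3.2),
    ("KIN_A", "SUB_2", "T55", 2.9),
    ("KIN_B", "SUB_1", "S10", 4.1),
]

# Hypothetical protein-protein interactions, stored as unordered pairs.
interactions = {
    frozenset(("KIN_A", "SUB_1")),
    frozenset(("KIN_B", "SUB_3")),
}


def filter_by_interaction(candidates, interactions):
    """Keep only candidates whose kinase and substrate are known to interact."""
    return [
        candidate
        for candidate in candidates
        if frozenset((candidate[0], candidate[1])) in interactions
    ]


kept = filter_by_interaction(candidates, interactions)
print(kept)
```

Note that the highest-scoring candidate (KIN_B on SUB_1) is dropped because no interaction supports it, which is the point of adding context to a motif score.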

 

4. Q: If I do not want to use iGPS, are there other options?

A: Yes. If you have only a few sequences, e.g., 5~10 proteins, we recommend either NetworKIN 1.0 or the NetworKIN 2.0 beta version. But if you have hundreds or thousands of proteins (e.g., you have just carried out a large-scale phosphoproteome analysis of organism X under condition Y), iGPS can help you generate useful information in a short time. We would also be grateful if you could tell us why iGPS does not meet your needs. Your feedback will help us further refine the software.

 

5. Q: I have a few questions that are not listed above. How can I contact the authors of iGPS?

A: Please contact the two major authors, Dr. Yu Xue and Dr. Jian Ren, for details. Both authors continue to maintain the technical implementation.

 

6. Q: I tried to install the software on Mac OS, but the installer says the file is damaged. How can I install the software properly on Mac OS?

A: By default, Mac OS 10.8 or later only allows users to install applications from 'verified sources'. In effect, this means that users are unable to install most applications downloaded from the internet. You can follow the directions below to prevent this error message from appearing.

(1) Open System Preferences, either by clicking the System Preferences icon in the Dock or by going to Apple Menu > System Preferences.
(2) Open the Security & Privacy pane by clicking Security & Privacy.
(3) Make sure that the General section of the Security & Privacy pane is selected. Click the lock icon labeled Click the lock to make changes.
(4) Enter your username and password into the prompt that appears and click Unlock.
(5) Under the section labeled Allow applications downloaded from, select Anywhere. On the prompt that appears, click Allow From Anywhere.
(6) Exit System Preferences by clicking the red button in the upper left of the window. You should now be able to install applications downloaded from the internet.

 

Supplementary Data (~717 MB in total)

All benchmark data, training data, testing data, and predicted results are freely available to ALL users below. A "Readme" file has been prepared for each data/result set.

1. Benchmark sequences (47.31 MB):

  (1) Benchmark protein sequences in five species.
  (2) These protein sequences were downloaded from the UniProt (http://www.uniprot.org/) database on April 6, 2010.
  (3) Five organisms including Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster, Mus musculus and Homo sapiens were considered.
  (4) To remove redundant sequences, we used CD-HIT (http://www.bioinformatics.org/cd-hit/) with the command: cd-hit -i input_sequence -o output_sequence -c 1.00 -n 5.
  (5) Files:
    (i) SC.fas: budding yeast proteins
    (ii) CE.fas: nematode proteins
    (iii) DM.fas: fruit fly proteins
    (iv) MM.fas: mouse proteins
    (v) HS.fas: human proteins
  (6) Last updated, April 6, 2010.
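
For readers who want to inspect the .fas files, the sketch below reads FASTA records and drops exact duplicate sequences in Python. It is a simplified stand-in for illustration, not a reimplementation of CD-HIT, which clusters sequences by an identity threshold and does more than exact matching. The record names and sequences are hypothetical.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def drop_exact_duplicates(records):
    """Keep the first record for each distinct sequence."""
    seen = set()
    unique = []
    for header, sequence in records:
        if sequence not in seen:
            seen.add(sequence)
            unique.append((header, sequence))
    return unique


# Hypothetical records; p1 and p2 carry the same sequence with different line wrapping.
example = """>p1
MKTAYIAK
QR
>p2
MKTAYIAKQR
>p3
MSSPQ
""".splitlines()

unique = drop_exact_duplicates(read_fasta(example))
print([header for header, _ in unique])
```

To process a real file, pass an open file handle (for example, one opened on HS.fas) to read_fasta instead of the example lines.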

2. Protein-protein interaction data (16.50 MB), in two parts:

  A. Exp. PPI (experimentally identified PPI)

  (1) Experimentally identified protein-protein interaction (PPI) data sets.
  (2) These PPI data were taken from several major public databases, including BioGRID, HPRD, DIP, MINT and IntAct.
  (3) Five organisms including Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster, Mus musculus and Homo sapiens were considered.
  (4) Data format: The 1st column, Interactor A; The 2nd column, Interactor B; The 3rd column, Source of PPI pair.
  (5) All protein sequences in PPI data were mapped to the UniProt database.
  (6) Files:
    (i) SC.exp.ppi: budding yeast PPIs
    (ii) CE.exp.ppi: nematode PPIs
    (iii) DM.exp.ppi: fruit fly PPIs
    (iv) MM.exp.ppi: mouse PPIs
    (v) HS.exp.ppi: human PPIs
  (7) Last updated, April 10, 2010.
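
A minimal Python sketch for reading the three-column format described in (4) follows. It merges both orientations of a pair and collects the sources that report it. The tab separator and the example identifiers are assumptions; check the Readme file for the actual delimiter.

```python
from collections import defaultdict


def parse_exp_ppi(lines, sep="\t"):
    """Map each unordered interactor pair to the set of sources reporting it."""
    pairs = defaultdict(set)
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        interactor_a, interactor_b, source = line.split(sep)[:3]
        pairs[frozenset((interactor_a, interactor_b))].add(source)
    return pairs


# Hypothetical rows; the identifiers and the tab separator are assumptions.
example = [
    "P00001\tP00002\tBioGRID",
    "P00002\tP00001\tIntAct",
    "P00003\tP00004\tDIP",
]

pairs = parse_exp_ppi(example)
for pair, sources in pairs.items():
    print(sorted(pair), sorted(sources))
```

Storing each pair as a frozenset treats A-B and B-A as the same interaction, which is a design choice of this sketch rather than a property of the files.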

  B. STRING PPI

  (1) Pre-calculated protein-protein interaction (PPI) data sets.
  (2) These PPI data were taken from the STRING database (http://string.embl.de/).
  (3) All the protein sequences were mapped to the UniProt database by BLAST.
  (4) Five organisms including Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster, Mus musculus and Homo sapiens were considered.
  (5) Data format: The 1st column, Interactor A; The 2nd column, Interactor B; The 3rd column, predicted score in STRING database.
  (6) Files:
    (i) SC.string.ppi: budding yeast PPIs
    (ii) CE.string.ppi: nematode PPIs
    (iii) DM.string.ppi: fruit fly PPIs
    (iv) MM.string.ppi: mouse PPIs
    (v) HS.string.ppi: human PPIs
  (7) Last updated, April 8, 2010.
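
Because the third column of the STRING files is a predicted score, a typical first step is to keep only pairs above a chosen threshold. The Python sketch below shows one way to do this. The tab separator, the score scale, the threshold value, and the example identifiers are all assumptions made for illustration.

```python
def filter_string_ppi(lines, min_score, sep="\t"):
    """Return (interactor_a, interactor_b, score) tuples with score >= min_score."""
    kept = []
    for line in lines:
        fields = line.strip().split(sep)
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        interactor_a, interactor_b, score = fields[0], fields[1], float(fields[2])
        if score >= min_score:
            kept.append((interactor_a, interactor_b, score))
    return kept


# Hypothetical rows; the score scale used here is an assumption.
example = [
    "P00001\tP00002\t900",
    "P00001\tP00003\t350",
    "P00004\tP00005\t700",
]

high_confidence = filter_string_ppi(example, min_score=700)
print(high_confidence)
```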

3. Experimentally identified phosphorylation sites (60.59 MB):

  (1) Experimentally identified phosphorylation sites.
  (2) The experimental phosphorylation sites were taken from several major databases, including PhosphoPep, Phospho.ELM, SysPTM, PhosphoSitePlus and HPRD. Literature mining was also carried out to collect additional phosphorylation sites.
  (3) All phosphorylated substrates were mapped to the UniProt database (http://www.uniprot.org/).
  (4) Five organisms including Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster, Mus musculus and Homo sapiens were considered.
  (5) Redundant data were removed.
  (6) Data format: the 1st column, UniProt ID of phosphorylated protein; the 2nd column, protein sequence of phosphorylated protein; the 3rd column, phosphorylated position; the 4th column, phosphorylated residue type; the 5th column, source of phosphorylation sites. The data format was adopted from the Phospho.ELM database.
  (7) Files:
    (i) SC.elm: budding yeast phosphorylation sites
    (ii) CE.elm: nematode phosphorylation sites
    (iii) DM.elm: fruit fly phosphorylation sites
    (iv) MM.elm: mouse phosphorylation sites
    (v) HS.elm: human phosphorylation sites
  (8) Last updated, April 16, 2010.
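
The five-column format in (6) can be read with a few lines of Python. The sketch below parses one record, checks that the residue at the stated position matches the stated residue type, and extracts the peptide around the site. Several points are assumptions made for illustration: the tab separator, 1-based positions, the window of 7 residues on each side (inferred from the 15-residue peptides in the FAQ example), and the "*" padding character. The record itself is hypothetical.

```python
def parse_elm_line(line, sep="\t"):
    """Split one record into the five columns described in the data format."""
    uniprot_id, sequence, position, residue, source = line.rstrip("\n").split(sep)[:5]
    return uniprot_id, sequence, int(position), residue, source


def site_peptide(sequence, position, flank=7, pad="*"):
    """Return `flank` residues on each side of a 1-based position, padded at the ends."""
    index = position - 1
    left = sequence[max(0, index - flank):index]
    right = sequence[index + 1:index + 1 + flank]
    return left.rjust(flank, pad) + sequence[index] + right.ljust(flank, pad)


# Hypothetical record: the ID, sequence, and source label are made up.
record = "X00001\tMAKDLGRSESLRVVCRPQQ\t10\tS\texample_db"
uniprot_id, sequence, position, residue, source = parse_elm_line(record)

# Consistency check: the residue at the stated position should match column 4.
if sequence[position - 1] != residue:
    raise ValueError(f"{uniprot_id}: expected {residue} at position {position}")

peptide = site_peptide(sequence, position)
print(uniprot_id, residue + str(position), peptide)
```

The consistency check is useful when positions come from several source databases, since an off-by-one numbering difference shows up immediately as a residue mismatch.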

4. Prediction results for five eukaryotic phosphoproteomes (4.91 MB):

  (1) Predicted site-specific kinase-substrate relationships in eukaryotes.
  (2) All experimental phosphorylation sites were used as input (refer to Additional File 3).
  (3) The low threshold was used (refer to our GPS 2.0 article). Both experimental and predicted PPI information were used.
  (4) Five organisms including Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster, Mus musculus and Homo sapiens were considered.
  (5) Files:
    (i) SC.pnc: prediction results for budding yeast
    (ii) CE.pnc: prediction results for nematode
    (iii) DM.pnc: prediction results for fruit fly
    (iv) MM.pnc: prediction results for mouse
    (v) HS.pnc: prediction results for human
  (6) Last updated, Jun. 28, 2010.

5. Prediction results for the ATM/ATR-mediated DNA damage response process (0.42 MB):

  (1) Predicted site-specific kinase-substrate relationships for potential ATM/ATR substrates. (refer to the ATM_ATR.pnc file)
  (2) The data were taken from: Matsuoka et al., Science 316 (2007) 1160-1166. (refer to the ATM_ATR.elm file)
  (3) The low threshold was used (refer to our GPS 2.0 article). Both experimental and predicted PPI information were used.
  (4) The experiment was carried out in Homo sapiens.
  (5) Files:
    (i) ATM_ATR.elm: experimentally identified ATM/ATR substrates with their phosphorylation sites
    (ii) ATM_ATR.pnc: prediction results
    (iii) ATM_ATR.conserved.pnc: site-specific kinase-substrate relationships conserved in at least one other species
  (6) Last updated, Jul. 07, 2010.


6. Prediction results for liver phosphoproteomes (4.67 MB):

  (1) Predicted site-specific kinase-substrate relationships in livers and other related materials.
  (2) Two organisms, Homo sapiens and Mus musculus, were considered.
  (3) Files:
    (i) Human_liver_phosphoproteome.xls: the human liver phosphoproteome, which we identified experimentally by high-throughput mass spectrometry.
    (ii) HS.liver.elm: the human liver phosphoproteome, mapped to the UniProt database and converted to Phospho.ELM format.
    (iii) MM.liver.elm: the data set was taken from: Villen, et al., Proc Natl Acad Sci U S A 104 (2007) 1488-1493. All stated corrected phosphorylation sites were used for comparative analysis, while only high-confidence sites were included in Additional File 3.
    (iv) HS.liver.pnc: predicted site-specific kinase-substrate relationships in human liver.
    (v) MM.liver.pnc: predicted site-specific kinase-substrate relationships in mouse liver.
    (vi) HS.liver.conserved.pnc: the conserved site-specific kinase-substrate relationships in human liver.
    (vii) MM.liver.conserved.pnc: the conserved site-specific kinase-substrate relationships in mouse liver.
  (4) The low threshold was used (refer to our GPS 2.0 article). Both experimental and predicted PPI information were used.
  (5) Last updated, Jun. 30, 2010.

7. All spectra and search results (583.86 MB):

  (1) The search results (peptide list) and all associated spectra files (in image format).
  (2) The image files are in .png format and can be directly visualized.
  (3) The experiment was performed in Homo sapiens (Liver) by Ms. Chunxia Song (chunxiasong@dicp.ac.cn).
  (4) Directories:
    (i) MS2_MS3_1, MS2_MS3_2, and MS2_MS3_3: Mass spectrum image files of the MS2/MS3 class for the phosphopeptides identified from human liver
    (ii) neuMS2: Mass spectrum image files of the neuMS2 class for the phosphopeptides identified from human liver
    (iii) neuMS3: Mass spectrum image files of the neuMS3 class for the phosphopeptides identified from human liver
    (iv) nonNertral: Mass spectrum image files of the Non-Neutral class for the phosphopeptides identified from human liver
  (5) Files:
    (i) MS2_MS3_1.html, MS2_MS3_2.html, MS2_MS3_3.html: HTML files with hyperlinks to the spectrum images of the MS2/MS3 class for the phosphopeptides identified from human liver
    (ii) neuMS2.html: HTML file with hyperlinks to the spectrum images of the neuMS2 class for the phosphopeptides identified from human liver
    (iii) neuMS3.html: HTML file with hyperlinks to the spectrum images of the neuMS3 class for the phosphopeptides identified from human liver
    (iv) nonNertral.html: HTML file with hyperlinks to the spectrum images of the Non-Neutral class for the phosphopeptides identified from human liver
  (6) Last updated, Jul. 30, 2011.