Welcome to the Campylobacter jejuni Interactions Database
Version 3.1, last updated 6/20/2007.
The Campylobacter Interactions Database includes protein interaction data from a large-scale yeast two-hybrid (YTH) screen and interactions predicted from experimental data in other organisms (interologs). The data is described in Parrish et al., 2007 (PMID: 17615063). The data can be searched, integrated, graphed, and downloaded using IM Browser (Pacifico et al., 2006, BMC Bioinformatics, 7:195, doi:10.1186/1471-2105-7-195).
Below is a brief description of the various data sets currently available and how they are assembled. Definitions of the fields in each data set can be found further below.
C.jejuni YTH - Includes protein interaction data generated in the Finley laboratory using the LexA yeast two-hybrid system, mostly from high throughput screens. The project is described here. Data versions are as follows.
CampyYTH v3.1 - 06/20/2007 - This version includes the 12,012 interactions detected in a high throughput yeast two-hybrid screen (Parrish, 2007). The interactions involve 1,321 proteins, or nearly 80% of the predicted proteome. All of the interactions were assigned confidence scores, with roughly 27% of them in the high confidence set (scores >0.5).
CampyYTH v4.1 - This version expands v3.1 with the inclusion of results from binary protein interaction assays testing interolog predictions.
Interolog data - Predicted interactions between C. jejuni proteins based on experimental evidence for interactions between orthologous proteins in other species. Ortholog mappings were determined using Clusters of Orthologous Groups (COGs) from NCBI. The detection of the original protein interactions is described briefly and referenced below.
Predictions from H. pylori - H. pylori interactions were downloaded from DIP. The interactions arose from a large yeast two-hybrid study that used 261 bait proteins to detect more than 1,200 protein interactions. This data was described in Rain et al., 2001, Nature, 409:211-215. From this protein interaction data we have predicted 1,165 C. jejuni protein interactions (interologs).
Predictions from E. coli - E. coli protein interactions were downloaded from DIP. The majority of the interactions arose from two large protein complex pull-down studies described in Butland et al., 2005, Nature, 433:531-537; and in Arifuzzaman et al., 2006, Genome Research, 16:686-691. Binary interactions were inferred from these two studies by pairing the bait protein used to pull down the complex with each protein member of the complex (i.e., the 'hub and spoke' method). From the resulting protein interaction data we have predicted 3,743 and 4,056 C. jejuni protein interaction interologs, respectively.
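The two steps described above, hub-and-spoke expansion of a pull-down complex and transfer of interactions to C. jejuni through orthologs, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used to build the database: the function names, the dictionary-based ortholog map (standing in for shared COG membership), the exclusion of bait-bait pairs, and the treatment of predicted pairs as unordered are all hypothetical choices, and the protein identifiers in the usage note are placeholders.

```python
from itertools import product


def hub_and_spoke(bait, complex_members):
    """Pair the bait (hub) with every other member of its purified complex.

    Assumption: the bait itself may appear in complex_members and is
    skipped, so no bait-bait pair is produced.
    """
    return [(bait, prey) for prey in complex_members if prey != bait]


def predict_interologs(source_interactions, orthologs):
    """Transfer interactions from another species to C. jejuni.

    orthologs maps a source-species protein ID to a list of C. jejuni
    gene IDs (assumed here to be genes assigned to the same COG).
    Predicted pairs are stored as sorted tuples, i.e. as unordered pairs.
    """
    predicted = set()
    for a, b in source_interactions:
        # Every combination of orthologs of a and b is a candidate interolog.
        for cj_a, cj_b in product(orthologs.get(a, []), orthologs.get(b, [])):
            predicted.add(tuple(sorted((cj_a, cj_b))))
    return predicted
```

For example, with placeholder IDs, hub_and_spoke("ecA", ["ecB", "ecC"]) yields the pairs (ecA, ecB) and (ecA, ecC). Passing those pairs to predict_interologs together with a map such as {"ecA": ["cj1"], "ecB": ["cj2", "cj3"]} yields the predicted pairs (cj1, cj2) and (cj1, cj3); ecC has no ortholog in the map, so it contributes nothing.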
The Campylobacter Interactions Database contains two types of tables. Most tables store interaction data; one table stores Campylobacter gene attribute data. Table column names (used in downloaded text files), their short descriptive names (used in IM Browser when right-clicking an interaction and choosing 'Edge attributes'), and their explanations are provided below for reference.
C. jejuni Yeast Two Hybrid Data
BD Gene - GENE1 was fused to a DNA Binding Domain.
AD Gene - GENE2 was fused to an Activation Domain.
GENE1_INTERACTIONS_AS_BD (Interactions as BD) - Total number of interactions for GENE1 when it was fused to a DNA Binding Domain.
GENE1_INTERACTIONS_TOTAL - Total number of interactions involving GENE1 when it was fused to a DNA Binding Domain or an Activation Domain.
GENE2_INTERACTIONS_AS_AD (Interactions as AD) - Total number of interactions for GENE2 when it was fused to an Activation Domain.
GENE2_INTERACITONS_TOTAL - Total number of interactions involving GENE2 when it was fused to an Activation Domain or a DNA Binding Domain.
SCREEN (Screen) - Original interaction screen. Note that interactions may have been detected in multiple screens, but only the original is listed. The total number of times the interaction was detected is given in "ISTS_RFCS" and "MATRIX_DETECTIONS".
REFERENCE (Reference) - Literature reference (PubMed ID) for this data set.
DATE1 (Date) - Date this data was first published.
C_LEU (C_LEU) - Integer representing LEU2 reporter gene signal strength. LEU2 reporter activity was scored by growth in the absence of leucine, on a scale of 0 to 3. The background activity due to the BD alone was subtracted.
C_LACZ (C_LACZ) - Integer representing lacZ reporter gene signal strength. lacZ reporter activity was scored as the intensity of blue color on X-Gal plates, on a scale of 0 (white) to 5 (dark blue). The background activity due to the BD alone was subtracted.
C_SUM (Strength) - C_SUM is a sum of the leu and lacZ scores after the background has been subtracted, which is taken as an indicator of overall two-hybrid reporter activity (range 0-8).
Confidence Score - Confidence score (0-1) was generated using a statistical model that determined attributes that correlate with the likelihood of being in the true or false positive training set. The dividing line between high confidence and low confidence interactions was set to 0.5.
MATRIX (Matrix) - Interaction screening method in which the final interaction is verified in a one-on-one mating. Indicates whether this interaction was detected in a matrix screen (Yes or No). A single detection is reproducible, and thus the number of times detected is not particularly relevant. Occasionally, the same interaction was detected in a different screen, in which case the number of Matrix_Detections is greater than 2.
IST (IST) - Interaction Sequence Tag. Indicates whether this interaction was detected in a library screen (Yes or No). In such a screen, after mating a BD strain with the AD library, individual yeast clones are selected based on reporter expression and the interacting AD fusion is sequenced. The same AD fusion can be identified several times; multiple ISTs for a given interaction are less likely to represent a false positive than a single IST.
MATRIX_DETECTIONS (Matrix Detections) - The number of times this interaction was detected in one-on-one "matrix" assays.
ISTS_RFCS (ISTS_RFCS) - This is essentially the total number of AD clones that were identified for the particular interaction. It is the sum of the number of ISTs and the number of clones with the same restriction fragment class (RFC) as the IST clone.
DATA_VERSION (Version) - Version of current data.
DATE_LAST_UPDATED (Date last updated) - Date of most recent update.
Sequenced AD - Refers to validation of the AD gene fusion.
Sequenced BD - Refers to validation of the BD gene fusion.
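As an illustration of how the score fields above fit together, the following Python sketch recomputes the Strength value from C_LEU and C_LACZ and selects high confidence rows from a downloaded text file. It rests on assumptions that this page does not confirm: that the download is tab-delimited with a header row, and that the confidence score column is named CONFIDENCE_SCORE (the page only gives its descriptive name). The function names are hypothetical as well.

```python
import csv

# From the definitions above: scores greater than 0.5 are high confidence.
HIGH_CONFIDENCE_CUTOFF = 0.5


def strength(c_leu, c_lacz):
    """Recompute C_SUM from background-subtracted reporter scores.

    C_LEU ranges over 0-3 and C_LACZ over 0-5, so the sum ranges over 0-8.
    """
    if not 0 <= c_leu <= 3 or not 0 <= c_lacz <= 5:
        raise ValueError("reporter score outside the documented range")
    return c_leu + c_lacz


def high_confidence_rows(handle, score_column="CONFIDENCE_SCORE"):
    """Return the rows whose confidence score exceeds the 0.5 cutoff.

    Assumptions: handle is an open tab-delimited text file with a header
    row, and score_column is a hypothetical name for the confidence field.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return [row for row in reader
            if float(row[score_column]) > HIGH_CONFIDENCE_CUTOFF]
```

A call such as high_confidence_rows(open("campy_yth.txt", newline="")) would return a list of dictionaries keyed by column name; the file name here is a placeholder, not a file distributed with the database.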
Interaction predictions from H. pylori
CjGene1 - C. jejuni gene that is the first member of the predicted interaction.
CjGene2 - C. jejuni gene that is the second member of the predicted interaction.
HpGene1 - H. pylori gene corresponding to the CjGene1.
HpGene2 - H. pylori gene corresponding to the CjGene2.
Hp_Protein_Uniprot1 - H. pylori UniProtKB accession number corresponding to CjGene1.
Hp_Protein_Uniprot2 - H. pylori UniProtKB accession number corresponding to CjGene2.
ORTHOLOG_METHOD (Ortholog Method) - See definition for Yeast Interologs.
INTERACTION_PUBMEDS (Interaction Pubmed IDs) - See definition for Yeast Interologs.
INTERACTION_DETECT_METHODS (Interaction Detection Methods) - See definition for Yeast Interologs.
DATE_LAST_UPDATED (Date last updated) - Date of most recent update.
DATA_VERSION (Version) - Version of current data.
ORIGINAL_INTERACTION_SOURCE (Original interaction source) - See definition for Yeast Interologs.
Interaction predictions from E. coli
Gene Attributes
Cj Gene ID (Sanger C. jejuni ORF ID) - Primary C. jejuni gene ID.
SYMBOL (Symbol) - Gene symbol.
URL (URL) - Web link to CampyDB page describing this gene.
GENE_CLASS (Class of Gene) - Class of the gene.
Function Description
UniProt ID
AA Seq Len
CDS Len
KEGG Pathway ID
PFAM
COG ID
Hp Ortholog ID
Hp Similarity
Ec Ortholog ID
Ec Similarity
Chromosome Positions
Sanger Functional Classification
GO_MOLECULAR_FUNCTION (GO Molecular Function) - Gene Ontology (GO) Molecular Function annotations, formatted as GO_id(GO_evidence)===GO_term,GO_id(GO_evidence)===GO_term,...
GO_BIOLOGICAL_PROCESS (GO Biological Process) - GO Biological Process annotations, same format as Molecular Function.
GO_CELLULAR_PROCESS (GO Cellular Component) - GO Cellular Component annotations, same format as Molecular Function.
SYNONYMS (Synonyms) - Synonyms of the gene.
PROTEIN_DOMAINS (Protein Domains) - Protein domain annotations obtained from Interpro.
CG_SYMBOLS (CG Symbol) - CG symbols associated with this gene.
DATE_LAST_UPDATED (Date last updated) - Date of most recent update.
DATA_VERSION
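The GO annotation columns described above pack several annotations into one string, so a small parser can be useful when working with downloaded files. The Python sketch below is one possible approach rather than a tool provided with the database. It assumes that GO ids look like "GO:" followed by digits and that evidence codes are short alphabetic strings; the function name parse_go_field is hypothetical. Because GO term names can themselves contain commas, the sketch locates each "GO_id(GO_evidence)===" prefix instead of splitting on commas.

```python
import re

# Assumption: each entry starts with a prefix such as "GO:0005524(IEA)===".
_ENTRY_PREFIX = re.compile(r"(GO:\d+)\(([A-Za-z]+)\)===")


def parse_go_field(field):
    """Split a GO_* column value into (go_id, evidence, term) tuples."""
    matches = list(_ENTRY_PREFIX.finditer(field))
    entries = []
    for i, match in enumerate(matches):
        # The term runs from the end of this prefix to the start of the next.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(field)
        term = field[match.end():end].strip().rstrip(",").strip()
        entries.append((match.group(1), match.group(2), term))
    return entries
```

For instance, parse_go_field("GO:0005524(IEA)===ATP binding,GO:0016740(IEA)===transferase activity") returns two tuples, one per annotation, and an empty string returns an empty list.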