RitikaChoudhary57
Nov 22, 2022
About This Presentation
Bioinformatics
STRING
INTRODUCTION The STRING database aims to collect, score and integrate all publicly available sources of protein–protein interaction information, and to complement these with computational predictions. Its goal is to achieve a comprehensive and objective global network, including direct (physical) as well as indirect (functional) interactions. The latest version of STRING (11.0) more than doubles the number of organisms it covers, to 5090. The most important new feature is an option to upload entire, genome-wide datasets as input, allowing users to visualize subsets as interaction networks and to perform gene-set enrichment analysis on the entire input. For the enrichment analysis, STRING implements well-known classification systems such as Gene Ontology and KEGG, but also offers additional, new classification systems based on high-throughput text-mining as well as on a hierarchical clustering of the association network itself.
FUNDING The Swiss Institute of Bioinformatics provides long-term core funding for STRING, as do the Novo Nordisk Foundation and the European Molecular Biology Laboratory (EMBL Heidelberg). N.D.T. received funding from the Danish Council for Independent Research, and A.J. from the National Institutes of Health (NIH) Illuminating the Druggable Genome Knowledge Management Center. J.H.M. was funded by the NIH, by grant number 2018-183120 from the Chan Zuckerberg Initiative DAF, and by the advised fund of the Silicon Valley Community Foundation. Incorporation into the German bioinformatics infrastructure has been enabled by the BMBF. Funding for Open Access charges: University of Zurich.
USAGE Protein–protein interaction networks are an important ingredient for the system-level understanding of cellular processes. Such networks can be used for filtering and assessing functional genomics data and for providing an intuitive platform for annotating structural, functional and evolutionary properties of proteins. Exploring the predicted interaction networks can suggest new directions for future experimental research and provide cross-species predictions for efficient interaction mapping.
FEATURE The data are weighted and integrated, and a confidence score is calculated for every protein interaction. Results of the various computational predictions can be inspected in separate, dedicated views. STRING has two modes: Protein-mode and COG-mode. Interactions described in one organism are propagated to proteins in other organisms by orthology inference. A web interface gives access to the data and a fast overview of the proteins and their interactions. A Cytoscape plug-in for working with STRING data is available. STRING data can also be accessed through the application programming interface (API), by constructing a URL that contains the request.
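EXAMPLE (API URL) The idea of "constructing a URL that contains the request" can be sketched in Python as follows. This is a minimal sketch, not an official client: the URL pattern (base address, output format, method name) and the parameter names `identifiers`, `species` and `required_score` are assumptions that should be checked against the current STRING API documentation, and `build_string_url` is a hypothetical helper name. The code only builds the URL; it does not contact the server.

```python
from urllib.parse import urlencode

# Assumed base address of the STRING API; verify against the current docs.
STRING_API_BASE = "https://string-db.org/api"


def build_string_url(output_format, method, identifiers, species, **extra):
    """Build a request URL of the assumed form
    <base>/<output_format>/<method>?<parameters>.

    identifiers: list of protein names or IDs
    species:     NCBI taxonomy ID (9606 is human)
    extra:       any further query parameters, passed through unchanged
    """
    # Multiple identifiers are joined with a carriage return, which
    # urlencode turns into %0D in the final URL (assumed separator).
    params = {"identifiers": "\r".join(identifiers), "species": species}
    params.update(extra)
    return f"{STRING_API_BASE}/{output_format}/{method}?{urlencode(params)}"


if __name__ == "__main__":
    # Ask for the network around two human proteins, in tab-separated form.
    print(build_string_url("tsv", "network", ["TP53", "BRCA1"], 9606))
```

The resulting string can be pasted into a browser or passed to any HTTP client; keeping URL construction separate from the download makes the request easy to inspect before it is sent.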
DATA SOURCES Like many other databases that store protein association knowledge, STRING imports experimentally derived protein–protein interactions obtained through literature curation. STRING also stores computationally predicted interactions from: (i) text mining of scientific texts; (ii) interactions computed from genomic features; and (iii) interactions transferred from model organisms based on orthology. All predicted or imported interactions are benchmarked against a common reference of functional partnership as annotated by KEGG (Kyoto Encyclopedia of Genes and Genomes).
1. Imported data: STRING imports protein association knowledge from databases of physical interactions and databases of curated biological pathway knowledge (MINT, HPRD, BIND, DIP, BioGRID, KEGG, Reactome, IntAct, EcoCyc, NCI-Nature Pathway Interaction Database, GO). Links are supplied to the originating data in the respective experimental repositories and database resources.
2. Text mining: A large body of scientific text (SGD, OMIM, FlyBase, PubMed) is parsed to search for statistically relevant co-occurrences of gene names.
3. Predicted data: Neighborhood: A similar genomic context in different species suggests a similar function of the proteins. Fusion-fission events: Proteins that are fused in some genomes are very likely to be functionally linked in other genomes where the genes are not fused. Occurrence: Proteins that have a similar function, or that occur in the same metabolic pathway, tend to be present or absent together across genomes and therefore have similar phylogenetic profiles. Coexpression: Predicted association between genes based on observed patterns of simultaneous expression.
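EXAMPLE (PHYLOGENETIC PROFILES) The "Occurrence" channel compares phylogenetic profiles, i.e. the sets of species in which each protein is found. The sketch below only illustrates that idea with a plain Jaccard similarity; it is not STRING's actual scoring method, and the species and protein names are made up for the example.

```python
def profile_similarity(profile_a, profile_b):
    """Jaccard similarity of two phylogenetic profiles.

    Each profile is the set of species in which a protein is present.
    Returns a value between 0.0 (no shared species) and 1.0 (identical).
    """
    union = profile_a | profile_b
    if not union:
        # Two empty profiles carry no information; treat as dissimilar.
        return 0.0
    return len(profile_a & profile_b) / len(union)


if __name__ == "__main__":
    # Hypothetical presence data for three proteins across five species.
    profiles = {
        "protX": {"sp1", "sp2", "sp3"},
        "protY": {"sp1", "sp2", "sp3"},
        "protZ": {"sp4", "sp5"},
    }
    # protX and protY always occur together, so they score highest.
    print(profile_similarity(profiles["protX"], profiles["protY"]))
    print(profile_similarity(profiles["protX"], profiles["protZ"]))
```

Proteins whose profiles match closely are candidates for a functional link, since genes that work together tend to be gained and lost together.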
HOME PAGE OF STRING (screenshot with panels A, B and C)
Clicking on Continue takes you directly to the main page of STRING, which holds all the data.
MAIN PAGE OF STRING From here, all the information can be reached by clicking on the different options.
OPTIONS ON MAIN PAGE Clicking on the Viewers option opens further options for exploring protein–protein interaction information.
The Legend option gives information about nodes, edges, confidence scores, predicted functional partners, etc.
(Diagram: Proteins A, B, C and D are drawn as nodes; the edges connecting them represent the STRING interaction score.) CONFIDENCE SCORE: Combines the probabilities from the different evidence channels, such as conserved neighborhood, gene fusions, phylogenetic co-occurrence, co-expression and literature co-occurrence, and is corrected for the probability of randomly observing an interaction. LOW: 0-0.4; MEDIUM: 0.4-0.7; HIGH: 0.7-1.0.
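EXAMPLE (CONFIDENCE SCORE) One way to read "combines the probabilities from the different evidence channels and corrects for the probability of randomly observing an interaction" is sketched below. It is a sketch under stated assumptions: channels are treated as independent, the random-interaction prior of 0.041 is an assumed value, and the band boundaries follow the ranges on this slide with the lower bound taken as inclusive. The function names are hypothetical.

```python
from math import prod


def combine_scores(channel_scores, prior=0.041):
    """Combine per-channel probabilities into one confidence score.

    channel_scores: probabilities in [0, 1], one per evidence channel
    prior:          assumed probability of a random interaction
    """
    # Remove the prior from each channel so it is not counted repeatedly.
    corrected = [max(0.0, (s - prior) / (1.0 - prior)) for s in channel_scores]
    # Probability that at least one (assumed independent) channel is right.
    combined = 1.0 - prod(1.0 - s for s in corrected)
    # Add the prior back exactly once.
    return combined + prior * (1.0 - combined)


def confidence_band(score):
    """Map a score to the slide's bands: low, medium or high."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie between 0 and 1")
    if score >= 0.7:
        return "high"
    if score >= 0.4:
        return "medium"
    return "low"


if __name__ == "__main__":
    # Hypothetical channel scores: neighborhood, co-expression, text mining.
    score = combine_scores([0.3, 0.5, 0.6])
    print(round(score, 3), confidence_band(score))
```

Several moderate channels can add up to a high combined score, which is why an edge in the network may be rated high even when no single line of evidence is strong on its own.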