Pptx in cloud computing of Tushar Kumar and Ashish kumar
tusharrohilla176
34 views
22 slides
Sep 05, 2024
Bioinformatics and biological databases
Bioinformatics

Bioinformatics is a branch of biological science that emerged from the combination of biology and information technology. It is a multidisciplinary subject in which computational and analytical tools are used to interpret biological data. It draws on biology, chemistry, mathematics, statistics, and computer science. Bioinformatics also involves developing new technologies and software tools for medicine, biological research, and biotechnology.
Applications of Bioinformatics

Bioinformatics applications depend on extracting useful facts and figures from stored collections of data and processing them into usable information. Its scope includes 3D image processing, 3D modeling of living cells, image analysis, drug development, and much more. The most important application is in medicine, where bioinformatics data is used to develop treatments for infectious and other harmful diseases.

Some examples of the application of bioinformatics:
- Gene therapy.
- The study of evolution.
- Microbial analysis and computing.
- Understanding and modeling protein structure.
- Storage and retrieval of biotechnological data.
- Discovery of new drugs.
- Agriculture, to understand crop patterns, pest control, and crop management.
Bioinformatics Subfields & Related Disciplines

Bioinformatics incorporates a wide range of sub-disciplines that rest on both the biological sciences and a deep knowledge of computer science and information technology.

- Computational biology: The use of data-driven and computational methods to solve biological problems.
- Genetics: The study of heredity and the variation of inherited characteristics.
- Genomics: The branch of molecular biology concerned with the structure, function, evolution, and mapping of genomes.
- Proteomics: The study of proteomes and their features.
- Metagenomics: The study of genetic material recovered directly from environmental samples.
- Transcriptomics: The study of the transcriptome, the complete set of RNA transcripts produced by a cell or organism.
- Phylogenetics: The study of the evolutionary relationships among groups of organisms.
- Metabolomics: The study of the biochemistry of metabolism and metabolites in living beings.
- Systems biology: The mathematical modeling, analysis, and visualization of large sets of biological data.
- Structural analysis: The analysis and prediction of the three-dimensional structures of biological macromolecules.
- Molecular modeling: The design and description of molecular structures by means of computational chemistry.
- Pathway analysis: Software-based analysis that identifies related proteins within the metabolic pathways of the body.
Scope of Bioinformatics

The main aim of bioinformatics is to retrieve relevant data and process it into useful information. It also deals with:
- Management and analysis of large sets of biological data. It is especially used in human genome sequencing, where large data sets are handled.
- Research and development in the biomedical field, where bioinformatics plays a major role.
- Computational methods for finding gene and protein functions and sequences, inferring evolutionary relationships, and analyzing the three-dimensional shapes of proteins.
- Research on genetic and microbial diseases, which depends heavily on bioinformatics and where the derived information can be vital for producing personalised medicines.
Bioinformatics in Drug Discovery and Development

Identifying new drug targets

One significant application of bioinformatics in drug discovery and development is identifying new drug targets by analyzing large datasets of biological information. For instance, researchers can analyze gene expression data from cancer patients to identify genes that are upregulated or downregulated in cancer cells. Scientists can then target those genes with drugs to treat the disease.

Similarly, protein-protein interaction networks can be analyzed to identify proteins involved in disease pathways. Researchers can develop new treatments for various diseases by targeting these proteins with drugs.

High-throughput screening assays test large numbers of compounds for activity against a target. Bioinformaticians analyze the results of these assays so that researchers can identify the compounds most likely to be effective drugs.

Predicting drug efficacy

Another important application is predicting the efficacy of drugs by analyzing the interactions between drugs and biological molecules. Bioinformatics can also be combined with experimental techniques such as X-ray crystallography and NMR spectroscopy to determine the structure of proteins and other biological molecules. By understanding the structure of these molecules, researchers can design drugs that interact with them more effectively.
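The gene expression example above can be illustrated with a minimal sketch. It assumes that expression values are already normalized and that a simple log2 fold-change cutoff is enough to flag a gene; the gene names, values, pseudocount, and threshold are all hypothetical, and a real analysis would add statistical testing and multiple-testing correction.

```python
import math

def log2_fold_changes(tumor, normal, pseudocount=1.0):
    """Return {gene: log2(mean tumor / mean normal)} for genes present in both inputs.

    The pseudocount (an assumption of this sketch) avoids division by zero.
    """
    changes = {}
    for gene in tumor.keys() & normal.keys():
        tumor_mean = sum(tumor[gene]) / len(tumor[gene])
        normal_mean = sum(normal[gene]) / len(normal[gene])
        changes[gene] = math.log2((tumor_mean + pseudocount) / (normal_mean + pseudocount))
    return changes

def classify(changes, threshold=1.0):
    """Split genes into upregulated and downregulated lists by a log2 cutoff."""
    up = sorted(g for g, lfc in changes.items() if lfc >= threshold)
    down = sorted(g for g, lfc in changes.items() if lfc <= -threshold)
    return up, down

# Hypothetical expression values (arbitrary units), two samples per condition.
tumor = {"GENE_A": [31.0, 31.0], "GENE_B": [3.0, 3.0], "GENE_C": [10.0, 12.0]}
normal = {"GENE_A": [7.0, 7.0], "GENE_B": [15.0, 15.0], "GENE_C": [11.0, 11.0]}

changes = log2_fold_changes(tumor, normal)
up, down = classify(changes)
print("upregulated:", up)
print("downregulated:", down)
```

With these made-up values, GENE_A is flagged as upregulated and GENE_B as downregulated, while GENE_C is unchanged; such genes would be candidates for follow-up, not confirmed drug targets.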
Designing new drugs

Bioinformatics can be used to design new drugs with greater precision. For example, researchers can use bioinformatics tools to predict the 3D structure of a protein, which helps in developing drugs that target specific proteins involved in diseases such as HIV, Alzheimer's, and cancer. By designing drugs that bind to particular parts of a protein, researchers can develop targeted therapies that are more effective and have fewer side effects.

Clinical trials

Bioinformatics tools can be used in clinical trials to analyze patient data and identify biomarkers that predict drug efficacy and toxicity. For example, gene expression data from patients can be analyzed to identify biomarkers that predict how a patient will respond to a particular drug.
Bioinformatics in Agriculture

Genome Sequencing and Analysis of Crops and Livestock

One of the major applications of bioinformatics in agriculture is the sequencing and analysis of the genomes of crops and livestock. By sequencing these genomes, researchers can identify genes responsible for important traits such as yield, disease resistance, and nutritional content, and use them to develop more productive and sustainable farming practices.

One example is identifying genes responsible for drought tolerance in crops such as maize and wheat. By selecting for these genes in breeding programs, researchers can develop crops that are more resistant to drought, reducing the impact of climate change on agriculture. Similarly, researchers can identify genes responsible for disease resistance in livestock such as cattle and pigs. By selecting for these genes in breeding programs, researchers can develop livestock that is more resistant to disease, reducing the need for antibiotics and other treatments.

Identification of Foodborne Pathogens

Foodborne illness is a significant public health concern, and bioinformatics software tools can be used to identify foodborne pathogens, improving food safety and reducing the risk of illness. By analyzing the genomes of foodborne pathogens, researchers can identify specific genes and mutations associated with virulence and antibiotic resistance, for example to identify strains of Salmonella and E. coli that are resistant to antibiotics. Once these strains are identified, food producers can take steps to prevent contamination and reduce the risk of outbreaks.
Identification of Allergens in Food

Scientists can also use bioinformatics to identify allergens in food. By analyzing the genomes of common allergenic foods, they can identify the specific proteins responsible for allergic reactions, improving food safety for people with food allergies. One example is identifying the proteins in peanuts that cause allergic reactions. Knowing these proteins, food producers can develop products that are less likely to cause allergic reactions and improve labeling to help people with food allergies avoid potentially dangerous foods.

Precision Agriculture

Precision agriculture is one area where bioinformatics can improve the efficiency and sustainability of farming practices. By analyzing data from sensors and other sources, bioinformatics tools can help farmers make more informed decisions about when to plant, fertilize, and harvest crops. For example, weather data and soil moisture levels can be analyzed to optimize irrigation schedules. By ensuring that crops are watered only when necessary, farmers can reduce water usage and improve crop yields.

Prediction of Crop and Livestock Performance

Bioinformatics helps predict crop and livestock performance, so that farmers can make more informed decisions about which crops to plant and which livestock to breed. Analyzing data from multiple sources makes it possible to identify patterns and relationships that are difficult to detect manually. For example, genetic data from livestock can be analyzed to predict which animals are likely to produce the best meat or milk.
Bioinformatics in Environmental Sciences

Biodiversity Studies

One significant application of bioinformatics in environmental sciences is the study of biodiversity, meaning the variety of life forms in a specific area or on the planet as a whole. Biodiversity is essential for the functioning of ecosystems because it underpins ecosystem services such as food production, water purification, and pollination. By analyzing DNA sequences from different organisms, researchers can identify and classify species, investigate their evolutionary relationships, and study the genetic diversity of populations, which provides insight into their adaptability and resilience to environmental change.

For example, DNA sequences from different bird species can be analyzed to study their phylogenetic relationships and their adaptation to different environments. The distribution of genetic diversity within and between populations can also be analyzed to gain insight into their evolutionary history and potential response to climate change.

Ecosystem Studies

Bioinformaticians can study ecosystems, including their structure, function, and interactions. Ecosystems are complex networks of living and non-living components that interact with each other to provide services essential for human well-being. By analyzing DNA sequences from the organisms in an ecosystem, researchers can identify the species present and study their interactions, which helps in understanding the role of each organism in the ecosystem and its contribution to ecosystem services.

For example, the DNA sequences of bacteria and fungi in soil samples can be analyzed to study their diversity and function, providing insight into the role of these microorganisms in soil fertility, nutrient cycling, and carbon sequestration.
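Comparing DNA sequences between species, as described above, can be sketched with the proportion of differing sites (often called the p-distance) between pre-aligned sequences. This is only an illustrative measure chosen for this sketch, not a method named in the slides; the species labels and sequences are hypothetical, and real studies use proper alignment tools and evolutionary models.

```python
from itertools import combinations

def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return mismatches / compared

# Hypothetical, already-aligned sequences for three species.
sequences = {
    "species_1": "ATGCATGCAT",
    "species_2": "ATGCATGCAA",
    "species_3": "TTGCTTGCAA",
}

# Pairwise distances, keyed by the pair of species names.
distances = {
    (name_a, name_b): p_distance(sequences[name_a], sequences[name_b])
    for name_a, name_b in combinations(sequences, 2)
}
closest_pair = min(distances, key=distances.get)
print(distances)
print("most similar pair:", closest_pair)
```

A smaller distance suggests a closer relationship, so a table of pairwise distances like this is one possible starting point for grouping species or building a tree.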
Climate Change Studies

Climate change refers to long-term changes in temperature, precipitation, and other weather patterns over several decades. Bioinformatics software helps study the impacts of climate change on biodiversity and ecosystems. By studying the genetic diversity of populations and the interactions between species, researchers can identify these impacts and develop strategies for conservation and adaptation.

For example, analyzing DNA sequences from different populations of a species to study their genetic diversity and their potential to adapt to changing environmental conditions can help identify the populations most vulnerable to climate change and guide strategies for their conservation.

Bioinformatics can also be used to study the impacts of climate change on ecosystem services such as carbon sequestration and water regulation. By analyzing the DNA sequences of the organisms in an ecosystem, researchers can identify the roles of various species in providing these services and develop strategies for their conservation.
Biological Databases

Biology: a data-rich science

Data: nucleotide sequences, protein sequences, and the 3D structural data produced by X-ray crystallography and macromolecular NMR.

Other important data types include metabolic pathways and molecular interactions, mutations and polymorphisms in molecular sequences and structures, organelle structures and tissue types, genetic maps, physicochemical data, gene expression profiles, two-dimensional DNA chip images of mRNA expression, and two-dimensional gel electrophoresis images of protein expression.

A biological database is a collection of data organized so that its contents can easily be accessed, managed, and updated.
Functions of biological databases

To make biological data available to scientists. As much as possible of a particular type of information should be available in one single place (a book, a site, or a database). Published data may be difficult to find or access, and collecting it from the literature is very time-consuming. Not all data is published explicitly in an article (genome sequences, for example).

To make biological data available in computer-readable form. Since analysis of biological data almost always involves computers, having the data in computer-readable form (rather than printed on paper) is a necessary first step.
Data Domains

Types of data generated by molecular biology research:
- Nucleotide sequences (DNA and mRNA)
- Protein sequences
- 3-D protein structures
- Complete genomes and maps

More recent data types:
- Gene expression
- Genetic variation (polymorphisms)
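As an illustration of what "computer-readable" sequence data looks like, here is a minimal sketch of reading nucleotide and protein sequences from FASTA text. FASTA is a widely used plain-text sequence format, but it is not mentioned in the slides; the record names, sequences, and the crude alphabet check below are hypothetical and for illustration only.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into {record_name: sequence}.

    A line starting with '>' opens a record; following lines hold its sequence.
    """
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].strip()
            # Use the first word of the header as the name; fall back to a counter.
            name = header.split()[0] if header else f"record_{len(records) + 1}"
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {rec_name: "".join(parts) for rec_name, parts in records.items()}

def guess_type(sequence):
    """Crude guess: only nucleotide letters means 'nucleotide', otherwise 'protein'."""
    return "nucleotide" if set(sequence.upper()) <= set("ACGTUN") else "protein"

# Hypothetical records: one DNA fragment split over two lines, one short peptide.
fasta_text = """>seq1 hypothetical DNA fragment
ATGC
GGTA
>seq2 hypothetical peptide
MKTAYIAK
"""

records = parse_fasta(fasta_text)
for rec_name, sequence in records.items():
    print(rec_name, len(sequence), guess_type(sequence))
```

The alphabet check is deliberately simplistic (a short peptide made only of A, C, G, and T residues would be misclassified); databases record the molecule type explicitly rather than guessing it.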
Biological databases can be broadly classified into sequence and structure databases. Sequence databases cover both nucleic acid and protein sequences, whereas structure databases mainly cover proteins.

The first database was created shortly after the insulin protein sequence became available in 1956. Insulin was the first protein to be sequenced; its sequence consists of just 51 residues (analogous to the letters in a sentence). Around the mid-1960s, the first nucleic acid sequence, a yeast tRNA of 77 bases (the individual units of nucleic acids), was determined. During this period, three-dimensional structures of proteins were being studied, and the well-known Protein Data Bank was established in the early 1970s as the first protein structure database, with only about 10 entries. It has since grown into a large database with more than 200,000 entries.

The development of SWISS-PROT, a consolidated and formally curated protein sequence database, began in 1986. It has since grown to hundreds of thousands of protein sequences from thousands of organisms, still a small fraction of all known organisms.
Primary, Secondary and Composite databases

A primary database contains information on the sequence or structure alone. Examples include Swiss-Prot and PIR for protein sequences, GenBank and DDBJ for nucleotide sequences, and the Protein Data Bank for protein structures.

A secondary database contains information derived from a primary database. A secondary sequence database contains information such as the conserved sequences, signature sequences, and active-site residues of protein families, obtained by multiple sequence alignment of a set of related proteins. A secondary structure database contains the entries of the PDB in an organized way: entries are classified according to their structure (all-alpha proteins, all-beta proteins, and so on) and include information on the conserved secondary structure motifs of a particular protein.

Secondary databases created and hosted by researchers at their individual laboratories include SCOP, developed at Cambridge University; CATH, developed at University College London; PROSITE, of the Swiss Institute of Bioinformatics; and eMOTIF, at Stanford.
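The way a secondary sequence database derives conserved positions from a multiple sequence alignment can be sketched as follows. This is a simplified illustration under the assumption that the alignment has already been computed; the three sequences are hypothetical, and real resources such as PROSITE use far richer patterns and profiles.

```python
from collections import Counter

def column_profile(alignment):
    """Return (consensus, conserved_positions) for equal-length aligned sequences.

    The consensus takes the most common residue in each column. A position is
    counted as conserved only if every sequence has the same non-gap residue.
    """
    lengths = {len(seq) for seq in alignment}
    if len(lengths) != 1:
        raise ValueError("alignment must be non-empty with equal-length sequences")
    consensus = []
    conserved = []
    for index, column in enumerate(zip(*alignment)):
        residue, count = Counter(column).most_common(1)[0]
        consensus.append(residue)
        if count == len(alignment) and residue != "-":
            conserved.append(index)
    return "".join(consensus), conserved

# Hypothetical aligned fragments from three related proteins.
alignment = ["MKVLAG", "MKILAG", "MRVLSG"]

consensus, conserved = column_profile(alignment)
print("consensus:", consensus)
print("fully conserved positions (0-based):", conserved)
```

For this toy alignment the consensus is MKVLAG and positions 0, 3, and 5 are fully conserved; positions like these are the kind of derived information a secondary database stores alongside a protein family.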
A composite database amalgamates a variety of primary database sources, which removes the need to search multiple resources. Different composite databases use different primary databases and different criteria in their search algorithms, and they offer a range of search options.

The National Center for Biotechnology Information (NCBI) hosts these nucleotide and protein databases on a large, highly available, redundant array of servers and provides free access to researchers. It also links to OMIM (Online Mendelian Inheritance in Man), which contains information about human genes and genetic disorders.
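The idea of amalgamating several primary sources can be sketched as a merge that removes redundant sequences while keeping a cross-reference to every source record. The source names, accession numbers, and the choice of exact sequence identity as the redundancy criterion are assumptions of this sketch; as noted above, each real composite database applies its own criteria.

```python
def build_composite(sources):
    """Merge {source: {accession: sequence}} into {sequence: [(source, accession), ...]}.

    Sequences are compared case-insensitively, so identical sequences from
    different sources collapse into one entry that lists all of its origins.
    """
    composite = {}
    for source_name, records in sources.items():
        for accession, sequence in records.items():
            key = sequence.upper()
            composite.setdefault(key, []).append((source_name, accession))
    return composite

# Hypothetical primary sources with one overlapping sequence.
sources = {
    "source_A": {"A001": "MKVLAG", "A002": "MSTNPK"},
    "source_B": {"B104": "mkvlag", "B105": "MGGHLL"},
}

composite = build_composite(sources)
for sequence, origins in composite.items():
    print(sequence, origins)
```

Four input records collapse into three entries here, and a single lookup on the shared sequence returns both source accessions, which is the practical benefit of searching one composite resource.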
Primary Nucleotide Sequence Repositories: GenBank, EMBL, DDBJ

These are the three chief databases that store and make available raw nucleic acid sequences. GenBank is physically located in the USA and is accessible over the internet through the NCBI portal. The EMBL (European Molecular Biology Laboratory) database is in the UK, and DDBJ (DNA Data Bank of Japan) is in Japan. The three use similar (but not identical) data formats and exchange data on a daily basis.