Introduction

Bioinformatics

Crossover between Computer Science and Molecular Biology.

Biological information can be studied with approaches typical of information theory (Computer Science)
Understanding Biology requires Computer Science (huge amounts of data)
Computer Science is useful to Biology

Traditional DNA sequencing was very precise, but it was a slow and extremely costly process. The advent of genome sequencing brought a change, although it was still costly. Since then, sequencing has grown faster than Moore's law (100 genome project). Applications include animal genome sequencing (mammals or anything useful for food, etc.) and bacterial genome sequencing (to understand resistances, etc.).

UniProtKB (knowledge database)
- Swiss-Prot (manually annotated and reviewed)
- TrEMBL (Translation of EMBL) (automatically annotated and not reviewed => may contain sequencing errors)

The number of entries in the database grows exponentially.

The basic paradigm of molecular biology is the following:

DNA => RNA => Protein

Nucleotides

DNA and RNA can be seen as a "4 letter alphabet" = (A, C, G, and T in DNA or U in RNA), used for information storage

Nucleotide = Phosphate + Sugar + Nitrogenous base

Biological Sequences

How to turn a "4 letter alphabet" into a "20 letter alphabet" of the proteins?

You need triplets of nucleotides to encode a letter of the "protein alphabet"
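
The need for triplets follows from a counting argument: with 4 nucleotides, words of length k can spell 4^k different combinations, and at least 20 are needed. A minimal sketch of that arithmetic (the function name is illustrative and not part of the course material):

```python
def min_codon_length(alphabet_size: int = 4, symbols_needed: int = 20) -> int:
    """Smallest word length k such that alphabet_size**k >= symbols_needed."""
    k = 1
    while alphabet_size ** k < symbols_needed:
        k += 1
    return k


if __name__ == "__main__":
    for k in (1, 2, 3):
        # 4 and 16 combinations are too few for 20 amino acids; 64 is enough
        print(k, 4 ** k)
    print(min_codon_length())
```

Since triplets give 64 combinations for only 20 amino acids (plus the stop signals), several different triplets can encode the same amino acid.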

RNA copies only a part of the whole DNA: the part needed to produce the required protein
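
As a rough illustration of this copying step, the sketch below extracts a region of a DNA string and rewrites it in the RNA alphabet. It assumes the input is the coding strand, so the copy only needs T replaced by U; the function name, the coordinates, and the example sequence are made up for illustration.

```python
def transcribe(dna: str, start: int, end: int) -> str:
    """Return the RNA copy of dna[start:end], assuming dna is the coding strand."""
    region = dna.upper()[start:end]
    if set(region) - set("ACGT"):
        raise ValueError("DNA may only contain A, C, G and T")
    # On the coding strand the RNA has the same letters, with U in place of T
    return region.replace("T", "U")


if __name__ == "__main__":
    genome = "GGCATGGCCTAAGG"  # made-up example sequence
    print(transcribe(genome, 3, 12))
```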

A protein-coding RNA sequence is terminated by one of the three "stop" codons
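
The reading of triplets up to a stop codon can be sketched as follows. This is a simplified illustration, not a full model of translation: it uses the standard genetic code with one-letter amino acid symbols, and it assumes that reading starts at the first letter of the given RNA string (the helper names are hypothetical).

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; "*" marks a stop codon
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(rna: str) -> str:
    """Translate an RNA string codon by codon until a stop codon is reached."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        aa = CODON_TABLE[rna[i:i + 3]]
        if aa == "*":  # one of the three stop codons: UAA, UAG, UGA
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("AUGGCCUAA"))
```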

Proteins are where "things" become three-dimensional

Amino Acids: Proteins

Carbon in the center
Hydrogen to the left
Acidic carboxyl group to the right
R group to the bottom
Amino group to the top

Proteins

Sequence => Structure => Function (!?)

Where:

Sequence = list of "letters"
Structure = Three-dimensional structure
Function = the cavity of the structure (generated by the way it folds) is called the "active site", the place where the "chemical magic" happens (for example, adding or breaking a chemical bond) (enzymes)

In silico methods

Term for computational methods (as opposed to in vitro, etc.)

Prediction on structures

Bioinformatics to understand disease-associated aspects of protein structure and function

Alignment and identification of homologous sequences
Characterization of the primary protein architecture
Prediction of secondary and tertiary structure
Exploration of protein interaction networks
Analysis of binding sites for proteins and ligands
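
To give a feeling for the first item in the list above, here is a minimal sketch of a global alignment score computed by dynamic programming (the Needleman-Wunsch scheme). The scoring values (match = 1, mismatch = -1, gap = -1) are assumptions chosen for simplicity; real tools typically use substitution matrices and more refined gap penalties.

```python
def global_alignment_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -1) -> int:
    """Best global alignment score of a and b under a simple scoring scheme."""
    # prev[j] holds the best score for aligning the processed prefix of a with b[:j]
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] with an empty prefix costs i gaps
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diagonal, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "AGT"))
```

A high score between two sequences is only a hint of homology; deciding whether they are truly homologous requires more than this toy score.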

Alternative methods

Alternative viewpoints on proteins

Starting from a known protein sequence

Evolutionary model
- descriptive
- knowledge-based
Protein-folding model
- predictive
- Optimization ("Ab initio")

Arriving at the prediction of a structure

A lot of methods are knowledge-based (they require background knowledge of the field of study, with training sets)
Alternatively, it is possible to build predictions based on simple observations and sets of rules. These methods are called optimization methods and are able to generate new, not yet observed solutions.

Elixir

Most major databases are projected to double in size every 12 months
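
To see what a 12-month doubling time implies, the sketch below projects a database size forward in time. The function name and the starting size are illustrative assumptions, not figures from the source.

```python
def projected_size(current_size: float, months: float,
                   doubling_months: float = 12.0) -> float:
    """Size after `months`, assuming the size doubles every `doubling_months`."""
    return current_size * 2 ** (months / doubling_months)


if __name__ == "__main__":
    # A database of 1 unit (e.g. 1 PB, hypothetical) grows 8-fold in 3 years
    print(projected_size(1.0, 36))
    # and roughly 1000-fold in 10 years
    print(projected_size(1.0, 120))
```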

The funding devoted to storing the data is less than 1% of what is spent to generate it

Elixir coordinates, integrates and sustains bioinformatics resources across its member states and enables users in academia and industry to access services that are vital for their research.

Sites:

Elixir Hub (located in Hinxton, near Cambridge, UK):

Technical coordination across Nodes
Standards

Elixir Node (e.g. Padua)

Research and development of bioinformatics services
Management of core resources

Vision

Establish a distributed infrastructure that scales with the challenge
Secure and deliver the core data resources underpinning life-science research
Provide discoverable tools, services and connectors to drive data access and exploitation
Provide robust technical platforms and clouds for secure data access and compute
Develop and maintain standards for data management, reuse and integration
Partner with user communities in a sustainable manner to ensure high and lasting impact
Close the computational biology skills gap through a comprehensive training programme for professionals
Support innovation in "big-data biology"
