Group leader: B. Habermann
The Computational Biology group addresses biological problems using computational methods. We have two major research directions:
- we are interested in large-scale data integration, focusing on mitochondrial function in development and disease
- we are developing methods to work with remote sequence similarities, with the focus on de novo prediction of short, functional motifs in proteins.
Computer-driven analysis has complemented biological research for a long time. With the beginning of sequencing and deciphering the genetic code, methods were developed that help analyze this data. The sequencing of parts of or entire genomes has for instance enabled us to establish a fine-tuned view on the evolution of species.
In recent years, large-scale screens and next-generation sequencing have been generating tremendous amounts of data, which cannot be analyzed or understood without the help of computational techniques. Our lab works on the computer-assisted analysis of biological data.
The Computational Biology group is actively involved in two main research directions: the integration of large-scale data and working with remote protein sequence similarities.
Research interest 1: Integration of large-scale data
We work on the integration of large-scale data from different sources to extract meaningful biological information. We mostly use NGS data, integrating differential expression with ChIP-seq or interactome data to provide biologists with testable hypotheses for further experimental studies. To this end, we develop data integration methods that are easy to use for non-experts.
To show feasibility of our methods, we have chosen the mitochondrial system, as it represents the central organelle for metabolic functions and energy production in the cell. It is experimentally very well characterized in terms of protein content and enzymatic pathways. Therefore, it enables us to look at changes in mitochondrial function in differing cellular conditions.
MitoXplorer: understanding mitochondrial function in health and disease
We are developing the MitoXplorer platform, a web tool that integrates large-scale expression and mutation data with the mitochondrial interactome and mitochondrial processes. Using specialized pipelines for NGS data analysis, we extract mutation and expression data for all proteins that localize to mitochondria and have a mitochondrial function, irrespective of whether they are encoded in the mitochondrial or the nuclear genome. We integrate expression and mutation data with a manually assembled and curated mitochondrial interactome and visualize the changes observed in different experimental or disease conditions. This enables us to compare different datasets rapidly and visually with respect to their mitochondrial functions.
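As a minimal sketch of this kind of integration (not the MitoXplorer implementation), the Python snippet below groups genes by mitochondrial process and summarizes expression changes and mutations per process. The function name, the gene names, the process assignments and all values are made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def summarise_by_process(gene_to_process, log2_fold_change, mutated_genes):
    """Summarize expression change and mutation count for each process."""
    grouped = defaultdict(list)
    for gene, process in gene_to_process.items():
        grouped[process].append(gene)

    summary = {}
    for process, genes in grouped.items():
        # Only genes with an expression measurement enter the mean.
        measured = [log2_fold_change[g] for g in genes if g in log2_fold_change]
        summary[process] = {
            "mean_log2fc": mean(measured) if measured else None,
            "n_mutated": sum(g in mutated_genes for g in genes),
        }
    return summary


# Illustrative input only: placeholder genes and invented values.
gene_to_process = {
    "geneA": "oxidative phosphorylation",
    "geneB": "oxidative phosphorylation",
    "geneC": "fission and fusion",
}
log2_fold_change = {"geneA": -1.0, "geneB": -2.0, "geneC": 0.5}
mutated_genes = {"geneB"}

summary = summarise_by_process(gene_to_process, log2_fold_change, mutated_genes)
```

Summarizing per process rather than per gene is one simple way to make two conditions comparable at a glance; a real analysis would also carry the interactome edges along.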
This project is supported by the DFG grant ‘Systems biological analysis of cancer genomes using deductive databases’ and the ANR grant ‘MITO-DYNAMICS’.
Biological networks for data analysis, integration and visualization
Biological networks such as protein-protein interaction networks or gene regulatory networks are integral to understanding biological systems. We use such networks to interpret and integrate large-scale data coming from expression studies. We have developed several algorithms for network analysis and visualization:
1) miMerge and miScore for the generation of non-redundant protein interaction networks (Villaveces, et al., Database, 2015);
2) KEGGViewer (Villaveces, et al., F1000Res 3:43, 2014) for the visualization and integration of pathway data, and PsicquicGraph (Villaveces, et al., F1000Res 3:44, 2014) for the visualization of molecular interactions, both available via the BioJS platform;
3) the Cytoscape plugins viPEr, for generating focus networks based on -omics data, and PEANuT, for pathway enrichment of focus networks (Garmhausen et al., BMC Genomics 16:790, 2015).
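To illustrate what a non-redundant interaction network means in practice, here is a minimal Python sketch that collapses duplicate interaction records from several sources. It does not reproduce miMerge or miScore; the function name, the record layout and the identifiers are assumptions made for this example.

```python
def merge_interactions(records):
    """Collapse duplicate records into one entry per unordered protein pair.

    records: iterable of (protein_a, protein_b, source) tuples.
    Returns a dict mapping frozenset({a, b}) to the set of supporting sources.
    """
    merged = {}
    for protein_a, protein_b, source in records:
        # A-B and B-A describe the same undirected interaction.
        pair = frozenset((protein_a, protein_b))
        merged.setdefault(pair, set()).add(source)
    return merged


# Illustrative records only: placeholder proteins and source names.
records = [
    ("P1", "P2", "sourceX"),
    ("P2", "P1", "sourceY"),  # same pair, reversed order, different source
    ("P1", "P3", "sourceX"),
    ("P1", "P2", "sourceX"),  # exact duplicate
]
network = merge_interactions(records)
```

The number of distinct sources per pair could serve as a crude confidence value here; this is a stand-in for illustration and not the scoring scheme of the published tools.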
Research interest 2: Working with remote sequence similarity – motif de novo prediction and orthology detection in the midnight zone of sequence similarity
The Darwinian view of evolution holds that evolution results from random changes in the genetic code combined with natural selection. Many small changes over a long period of time have a major evolutionary impact. As a result, even true orthologs can share only low sequence similarity, which we refer to as conservation in the twilight or midnight zone.
Our group is interested in detecting sequence relationships in the twilight and midnight zone.
HH-MOTiF: de novo detection of functional short linear motifs in proteins
Protein motifs are defined as self-sufficient functional units. They are typically only between 3 and 23 amino acids long and have various functions in proteins. They can serve as cleavage sites, are required for proteasomal degradation, are involved in docking and ligand binding, serve as signals for post-translational modification or are signals for subcellular localization.
Their shortness and the fact that they typically lack substantial sequence conservation make them very difficult to find de novo, i.e. without prior information on the localization or nature of the motif. We are using evolutionarily restricted Hidden Markov Model (HMM) comparison in combination with a hierarchical model of motif trees to identify short functional motifs in proteins de novo (Prytuliak, et al., NAR 45 (W1):W470-W477, 2017). In collaboration with wet-lab researchers, we experimentally test our predicted motifs.
If you are interested, please visit the link.
morFeus: remote orthology detection
We are interested in discovering remote orthologs. Identifying orthologous proteins is one of the key tasks in computational biology: we need to know a protein’s orthologs to understand its evolution. Orthologs also tell us whether the process a protein is involved in is conserved beyond model species and across kingdoms.
Orthologs are equally important for wet-lab research: we transfer functional information across orthologous proteins and can therefore provide testable hypotheses for the function of uncharacterized proteins.
However, the level of sequence conservation even between orthologs is sometimes below the detection limit of standard software and settings.
We have addressed this problem and developed a web-based method, morFeus (Wagner, et al., BMC Bioinformatics 15 (1), 263, 2014), for the detection of orthologs in the twilight and midnight zone of sequence similarity.
We compare weighted, binary representations of sequence alignments from a relaxed BLAST search and cluster hits based on their similarity to the query. Iterative reciprocal BLAST searches are carried out to verify orthology. Not only the query, but also other verified orthologs can establish orthology and include further hits for back-BLASTs. In a final step, a network of orthology (see figure) is created and a score independent of the BLAST E-value is calculated for putative orthologs using centrality scoring. We have tested morFeus against the state-of-the-art resources HomoloGene and Inparanoid and achieved significantly higher sensitivity with equal specificity.
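The last two steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the morFeus code: the search results are made up, the helper names are hypothetical, and plain degree centrality is used as one possible centrality measure, since the text above does not specify which one morFeus applies.

```python
from collections import defaultdict
from itertools import combinations


def reciprocal_edges(hits):
    """Link two proteins only if each one retrieves the other in its own search."""
    edges = set()
    for a, b in combinations(sorted(hits), 2):
        if b in hits[a] and a in hits[b]:
            edges.add((a, b))
    return edges


def degree_centrality(edges):
    """Fraction of the other network members each protein is linked to."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(linked) / (n - 1) for node, linked in neighbours.items()}


# Made-up search results: protein -> proteins retrieved by its own search.
hits = {
    "query": {"hitA", "hitB", "hitD"},
    "hitA": {"query", "hitB"},
    "hitB": {"query", "hitA", "hitC"},
    "hitC": {"hitB"},
    "hitD": set(),  # never finds the query again, so it stays unconnected
}
edges = reciprocal_edges(hits)
scores = degree_centrality(edges)
```

In this toy network, hitC is not confirmed by the query itself but by hitB, which mirrors the idea that verified orthologs can bring in further hits. The resulting score depends only on the network topology and not on any E-value.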
May 30th, 2018
A transcriptomics resource reveals a transcriptional transition during ordered sarcomere morphogenesis in flight muscle.
April 24th, 2018
Integrative analysis and machine learning on cancer genomics data using the Cancer Systems Biology Database (CancerSysDB).
March 14th, 2018
The deregulated microRNAome contributes to the cellular response to aneuploidy.
January 28th, 2018
SLALOM, a flexible method for the identification and statistical analysis of overlapping continuous sequence elements in sequence- and time-series data
January 24th, 2018
The axolotl genome and the evolution of key tissue formation regulators.
January 5th, 2018
The complete and fully assembled genome sequence of Aeromonas salmonicida subsp. pectinolytica and its comparative analysis with other Aeromonas species: investigation of the mobilome in environmental and pathogenic strains.
April 29th, 2017
HH-MOTiF: de novo detection of short linear motifs in proteins by Hidden Markov Model comparisons
March 27th, 2017
Revision and reannotation of the Halomonas elongata DSM 2581T genome.
March 9th, 2017
A Guide to Computational Methods for Predicting Mitochondrial Localization.
September 21st, 2016
Oh Brother, Where Art Thou? Finding Orthologs in the Twilight and Midnight Zones of Sequence Similarity
October 14th, 2015
Virtual pathway explorer (viPEr) and pathway enrichment analysis tool (PEANuT): creating and analyzing focus networks to identify cross-talk between molecules and pathways.
June 4th, 2015
Tools for visualization and analysis of molecular networks, pathways, and -omics data.
February 4th, 2015
Merging and scoring molecular interactions utilising existing community standards: tools, use-cases and a case study.
August 6th, 2014
morFeus: a web-based program to detect remotely conserved orthologs using symmetrical best hits and orthology network scoring.
February 13th, 2014
KEGGViewer, a BioJS component to visualize KEGG Pathways.
February 13th, 2014
PsicquicGraph, a BioJS component to visualize molecular interactions from PSICQUIC servers.
August 29th, 2012
Designing efficient and specific endoribonuclease-prepared siRNAs.
March 10th, 2011
HMMerThread: detecting remote, functional conserved domains in entire genomes by combining relaxed sequence-database searches with fold recognition.
June 5th, 2010
SeLOX--a locus of recombination site search tool for the detection and directed evolution of site-specific recombination systems.
March 11th, 2007
Genome-wide resources of endoribonuclease-prepared short interfering RNAs for specific loss-of-function studies.
October 23rd, 2006
ProFAT: a web-based tool for the functional annotation of protein sequences.
August 13th, 2004
An Ambystoma mexicanum EST sequencing project: analysis of 17,352 expressed sequence tags from embryonic and regenerating blastema cDNA libraries.
July 1st, 2004
DEQOR: a web-based tool for the design and quality control of siRNAs.
March 5th, 2004
The BAR-domain family of proteins: a case of bending and binding?
March 1st, 2004
The power and the limitations of cross-species protein identification by mass spectrometry-driven sequence similarity searches.
August 8th, 2018
Hypermethylation of gene body CpG islands predicts high dosage of functional oncogenes in liver cancer
January 15th, 2018
High-resolution TADs reveal DNA sequences underlying genome organization in flies.
July 7th, 2016
Structure of a Cytoplasmic 11-Subunit RNA Exosome Complex.
May 9th, 2016
Secretory cargo sorting by Ca2+-dependent Cab45 oligomerization at the trans-Golgi network.
December 18th, 2015
Human Holliday junction resolvase GEN1 uses a chromodomain for efficient DNA recognition and cleavage.
February 16th, 2015
The RNA-binding protein Arrest (Bruno) regulates alternative splicing to enable myofibril maturation in Drosophila flight muscle.
April 2nd, 2013