

The beta version of the Phylogeny Cafe package is ready now. If you would like to test/use it, please send an email to

The Phylogeny Cafe is a Java-based software package for Bayesian phylogenetic analysis and for structure and function prediction. The aim is to develop models and corresponding tools that handle mutation events other than substitutions, for example insertions and deletions, gene content evolution and genome rearrangements. Phylogeny Cafe therefore does not aim to compete with popular phylogenetic software packages that base their analysis on substitutions only, but to complement them.
Below is a list of the modules we are currently developing. The first full release of Phylogeny Cafe is due in April 2007.

The modules

StatAlign

StatAlign is an extensible software package for Bayesian analysis of protein, DNA and RNA sequences. Multiple alignments, phylogenetic trees and evolutionary parameters are co-estimated in a Markov chain Monte Carlo framework, which allows the accuracy of the results to be measured reliably.
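
As a rough illustration of the Markov chain Monte Carlo idea, the following Java sketch runs a random-walk Metropolis sampler on a toy one-dimensional target and summarizes the samples by their mean and spread. It is not StatAlign code: the class name `ToyMetropolis`, the standard normal target and the proposal width are assumptions made for this example, and the state space that StatAlign samples (alignments, trees and parameters) is far richer.

```java
import java.util.Random;

// Toy random-walk Metropolis sampler; illustrative only, not StatAlign code.
final class ToyMetropolis {

    // Hypothetical target: log-density of a standard normal, up to a constant.
    static double logPosterior(double x) {
        return -0.5 * x * x;
    }

    // Metropolis-Hastings acceptance probability, computed from log values.
    static double acceptanceProbability(double logCurrent, double logProposed,
                                        double logHastingsRatio) {
        return Math.min(1.0, Math.exp(logProposed - logCurrent + logHastingsRatio));
    }

    // Runs the chain and returns {sample mean, sample standard deviation}.
    static double[] run(int steps, long seed) {
        Random rng = new Random(seed);
        double x = 0.0;
        double sum = 0.0;
        double sumSq = 0.0;
        for (int i = 0; i < steps; i++) {
            // Symmetric uniform proposal, so the log Hastings ratio is zero.
            double proposal = x + (rng.nextDouble() * 2.0 - 1.0);
            double accept = acceptanceProbability(
                    logPosterior(x), logPosterior(proposal), 0.0);
            if (rng.nextDouble() < accept) {
                x = proposal;
            }
            sum += x;
            sumSq += x * x;
        }
        double mean = sum / steps;
        double variance = sumSq / steps - mean * mean;
        return new double[] {mean, Math.sqrt(Math.max(variance, 0.0))};
    }

    public static void main(String[] args) {
        double[] summary = run(100_000, 42L);
        System.out.printf("posterior mean %.3f, spread %.3f%n", summary[0], summary[1]);
    }
}
```

The spread of the sampled values is what gives a direct measure of uncertainty: the same summary can in principle be computed for any quantity recorded along the chain.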

This approach eliminates common artifacts that traditional methods suffer from, at the cost of increased computational time. These artifacts include the dependency of the constructed phylogeny on a single (probably suboptimal) alignment and bias towards the guide tree upon which the alignment relies.

The models behind the analysis permit the comparison of evolutionarily distant sequences: the TKF92 insertion-deletion model can be coupled to an arbitrary substitution model. A broad range of models for nucleotide and amino acid data is included in the package and the plug-in management system ensures that new models can be easily added.
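
To make the plug-in idea concrete, here is a minimal Java sketch of how a substitution model could be expressed behind a small interface. The interface `SubstitutionModel` and the class names are hypothetical and are not taken from the StatAlign code base; the Jukes-Cantor model is used only because its transition probabilities have a simple closed form.

```java
// Small driver showing how a model could be used through the interface.
final class ModelDemo {

    // Sums one row of the transition matrix; this is 1 for any valid model.
    static double rowSum(SubstitutionModel model, int from, double t) {
        double sum = 0.0;
        for (int to = 0; to < model.alphabetSize(); to++) {
            sum += model.transitionProbability(from, to, t);
        }
        return sum;
    }

    public static void main(String[] args) {
        SubstitutionModel model = new JukesCantor();
        System.out.println("P(same state, t = 0.1) = "
                + model.transitionProbability(0, 0, 0.1));
        System.out.println("row sum = " + rowSum(model, 0, 0.1));
    }
}

// Hypothetical plug-in contract; the real StatAlign interface may differ.
interface SubstitutionModel {
    int alphabetSize();

    // Probability that state `from` has become state `to` after time t.
    double transitionProbability(int from, int to, double t);
}

// Jukes-Cantor (JC69) model for nucleotides, scaled to one expected
// substitution per site per unit time.
final class JukesCantor implements SubstitutionModel {
    @Override
    public int alphabetSize() {
        return 4;
    }

    @Override
    public double transitionProbability(int from, int to, double t) {
        double decay = Math.exp(-4.0 * t / 3.0);
        return from == to ? 0.25 + 0.75 * decay : 0.25 - 0.25 * decay;
    }
}
```

Under this kind of design, adding a new model means adding one more class that implements the interface; code that consumes the model does not need to change.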

Click here to download StatAlign

MCMC beans

MCMC beans are Bayesian Markov chain Monte Carlo methods for inferring genome rearrangements, multiple alignments and gene content evolution.

Genome Rearrangement

The beta version can analyze unichromosomal genomes with and without a constraint on maintaining the symmetry of the replication bubble. The log-likelihood trace and the distribution of mutation types and their lengths can be monitored on the fly during the analysis. The program creates a log file with detailed information about the run, which can be used for further analysis.
The first full release will be able to analyze both unichromosomal and multichromosomal genomes, and a consensus network of the sampled phylogenies can be monitored on the fly.
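
For readers unfamiliar with rearrangement events, the following Java sketch shows one typical mutation type, an inversion, applied to a genome written as a signed permutation (each number is a gene, the sign is its orientation). The representation and the class name `SignedInversion` are assumptions made for illustration; the sketch does not reproduce the module's own data structures and does not model the symmetry constraint.

```java
import java.util.Arrays;

// Inversion on a signed permutation; illustrative only.
final class SignedInversion {

    // Returns a copy of `genome` with the segment [start, end] (inclusive)
    // inverted: the gene order is reversed and every orientation is flipped.
    static int[] invert(int[] genome, int start, int end) {
        if (start < 0 || end >= genome.length || start > end) {
            throw new IllegalArgumentException("invalid segment");
        }
        int[] result = genome.clone();
        for (int i = start, j = end; i <= end; i++, j--) {
            result[i] = -genome[j];
        }
        return result;
    }

    // Number of genes affected by an inversion of [start, end].
    static int length(int start, int end) {
        return end - start + 1;
    }

    public static void main(String[] args) {
        int[] genome = {1, 2, 3, 4, 5};
        int[] mutated = invert(genome, 1, 3);
        System.out.println(Arrays.toString(mutated) + ", length " + length(1, 3));
    }
}
```

Applying the same inversion twice restores the original genome, which is a convenient sanity check for any implementation of this move.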

Statistical Alignment/Structure Projector

The Statistical Alignment module samples from the posterior distribution of multiple sequence alignments, given a prior that models the insertion-deletion process of sequences. Several posterior analyses are available for this module. The most notable is the Structure Projector, which maps the secondary structure of a protein sequence onto other protein sequences via the posterior distribution of multiple alignments, thus predicting their secondary structures. Posterior probabilities clearly show which regions are reliable and which are less reliable, and in this way Structure Projector is a useful tool for homology modeling.
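
The projection step described above can be sketched in Java as follows. The sketch assumes that each sampled alignment has already been reduced to an array telling which position of the annotated sequence each target position is aligned to (or -1 for a gap); that encoding, the class name `StructureProjection` and the one-letter structure labels are assumptions of the example, not the module's actual interface.

```java
// Projects a structure annotation through sampled alignments; illustrative only.
final class StructureProjection {

    // samples[s][i] is the index in the annotated sequence that target
    // position i is aligned to in sample s, or -1 if it is aligned to a gap.
    // Returns, per target position, the fraction of samples in which the
    // aligned partner carries the given structure label.
    static double[] labelPosterior(int[][] samples, String annotatedStructure,
                                   char label, int targetLength) {
        double[] posterior = new double[targetLength];
        for (int[] sample : samples) {
            for (int i = 0; i < targetLength; i++) {
                int partner = sample[i];
                if (partner >= 0 && annotatedStructure.charAt(partner) == label) {
                    posterior[i] += 1.0;
                }
            }
        }
        for (int i = 0; i < targetLength; i++) {
            posterior[i] /= samples.length;
        }
        return posterior;
    }

    public static void main(String[] args) {
        // Three sampled alignments of a two-residue target to a sequence
        // annotated as helix (H) followed by strand (E).
        int[][] samples = {{0, 1}, {0, 1}, {0, -1}};
        double[] strand = labelPosterior(samples, "HE", 'E', 2);
        System.out.println("P(strand) at position 1 = " + strand[1]);
    }
}
```

Positions whose label frequency is close to 1 are supported by nearly all sampled alignments, while intermediate values flag regions where the projected structure depends on which alignment is chosen.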

Gene content evolution

With this module, one can analyze the gene content evolution of a set of genes using a three-parameter model that allows gene duplication, loss and horizontal transfer.
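
One way to read such a three-parameter model is as a continuous-time process on the copy number of a gene family, in which duplication and loss act on each existing copy and horizontal transfer adds a copy at a constant rate. The Java sketch below computes the event rates under that reading. The parameterization, in particular the assumption that the transfer rate does not depend on the copy number, and the class name `GeneContentRates` are assumptions of the example and may differ from the model the module implements.

```java
// Event rates for a toy duplication-loss-transfer model; illustrative only.
final class GeneContentRates {

    // Returns {duplication, loss, transfer} rates for a family with `copies`
    // members. Duplication and loss scale with the copy number; transfer is
    // assumed to occur at a constant rate.
    static double[] eventRates(int copies, double duplication, double loss,
                               double transfer) {
        return new double[] {copies * duplication, copies * loss, transfer};
    }

    // Probability that the next event is a duplication, loss or transfer.
    static double[] nextEventProbabilities(int copies, double duplication,
                                           double loss, double transfer) {
        double[] rates = eventRates(copies, duplication, loss, transfer);
        double total = rates[0] + rates[1] + rates[2];
        if (total <= 0.0) {
            throw new IllegalArgumentException("no event is possible");
        }
        return new double[] {rates[0] / total, rates[1] / total, rates[2] / total};
    }

    public static void main(String[] args) {
        double[] p = nextEventProbabilities(2, 0.5, 1.0, 1.0);
        System.out.println("P(duplication) = " + p[0]
                + ", P(loss) = " + p[1] + ", P(transfer) = " + p[2]);
    }
}
```

Under these assumptions an empty family can only be re-founded by transfer, which is what distinguishes this kind of model from a pure duplication-loss process.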

Instant coffee

The Instant coffee modules are for those who do not want to wait for a long MCMC run but would like a fast (though probably less precise) analysis. We are currently developing two modules, and others are planned.

Reticular Alignment

The Reticular Alignment module is an iterative alignment tool, but instead of single alignments it aligns sets of optimal and suboptimal alignments to sets of alignments. The sets of alignments are represented as networks, hence the module's name.

Structure Decoder

The Structure Decoder module is the Instant coffee version of Structure Projector. It makes only pairwise alignments using the TKF92 model, and the average posterior decoding probabilities are plotted along the sequences. Low values indicate the non-structural elements of the protein, while high values indicate conserved parts.
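
The averaging step can be sketched in Java as below. The sketch assumes that each pairwise comparison has already produced one posterior probability per position of the query sequence; how those probabilities are obtained from the TKF92 model is not shown, and the class name `AveragePosterior` and the threshold are hypothetical.

```java
// Averages per-position posterior probabilities over pairwise comparisons;
// illustrative only.
final class AveragePosterior {

    // perPartner[k][i] is the posterior probability for query position i
    // in the pairwise comparison with partner k. All rows are assumed to
    // have the same length.
    static double[] average(double[][] perPartner) {
        int length = perPartner[0].length;
        double[] mean = new double[length];
        for (double[] row : perPartner) {
            for (int i = 0; i < length; i++) {
                mean[i] += row[i];
            }
        }
        for (int i = 0; i < length; i++) {
            mean[i] /= perPartner.length;
        }
        return mean;
    }

    // Marks positions whose average reaches a user-chosen threshold.
    static boolean[] atLeast(double[] values, double threshold) {
        boolean[] flags = new boolean[values.length];
        for (int i = 0; i < values.length; i++) {
            flags[i] = values[i] >= threshold;
        }
        return flags;
    }

    public static void main(String[] args) {
        double[][] perPartner = {{0.25, 1.0}, {0.75, 0.5}};
        double[] mean = average(perPartner);
        System.out.println("averages: " + mean[0] + ", " + mean[1]);
    }
}
```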

The team

   Miklós Csürös
MSc at the Technical University, Budapest, PhD in bioinformatics at Yale University, Assistant Professor at the University of Montreal. Working on the Gene content evolution and Reticular Alignment projects.

   Gerton Lunter
MRC fellow, University of Oxford. Working on the Reticular Alignment project.

   István Miklós
MSc in biology, chemistry and mathematics, Eötvös University, Budapest, PhD in theoretical biology, Eötvös University, Budapest, postdoc at the Department of Statistics, University of Oxford, now affiliated with the eScience Regional Knowledge Center, Eötvös University, Budapest. Project coordinator.

   Timothy Brooks Paige
BSc student at Amherst College. Worked on the Genome Rearrangement project in summer 2005 at the Rényi Institute, where he was a summer student supervised by István Miklós.

   Ádám Novák
Graduated in 2007 with an MSc degree in Computer Science. He also has a strong background in biology, mathematics and bioinformatics, as well as extensive experience in the IT industry. He joined the Statistical Alignment project of the Department of Statistics at Oxford University in 2007 and is working on the Statistical Alignment/Structure Projector module.

Documents

Posters and presentation at the RECOMB2006 conference

Contact