Informatics
1 Mar 2025

Agreement Forests

Background

In Mathematics and Informatics, the comparison of hierarchical or sequentially branching structures is a central problem with applications ranging from data organization to evolutionary analysis. In particular, within phylogenetics and evolutionary state explorations, trees are commonly used to represent relationships between species, genes or states. However, trees inferred from different datasets, methods or observations often disagree due to noise, incomplete data, or differing assumptions.

To systematically quantify and reconcile such differences, the concept of an Agreement Forest has been introduced. Abstractly speaking, an Agreement Forest for two or more trees on the same set of leaves (final states or species), is a partition of these leaves such that all trees "look" the same if only focused on these different parts of leaves. Computing such an Agreement Forest is computationally challenging without further assumptions or insight into the underlying data. To make this even harder, one often wants to find a Maximum Agreement Forest, i.e. an Agreement Forest that has the least number of components which is the same as displaying the maximum agreement between the input trees.

As an optimization problem formulated on well known graph structures, this problem has been the subject of interest in Informatics, especially in the realm of algorithm design and fixed parameter tractability (FPT).

Project goal

The goal of this project is to investigate the theory, algorithms, and applications of Maximum Agreement Forests for comparing and reconciling tree-structured data.

In particular, we explore algorithmic strategies for computing MAFs efficiently, including exact, heuristic and parametrized approaches. Special attention will be given to identifying structural properties of real world instances to speed up the computation via insights into their effect on the graphs.

Finally, the project will consider implementation aspects, including data structures and software design, with the goal of producing efficient and reusable tools.

Further reading

  1. Li, Zhijiang, and Norbert Zeh. “Computing Maximum Agreement Forests without Cluster Partitioning Is Folly.” LIPIcs, Volume 87, ESA 2017 87 (2017): 56:1-56:14. Application/pdf, 14 pages, 706268 bytes. https://doi.org/10.4230/LIPICS.ESA.2017.56.

  2. Mestel, David, Steven Chaplick, Steven Kelk, and Ruben Meuwese. “Split-or-decompose: Improved FPT branching algorithms for maximum agreement forests” arXiv:2409.18634. Preprint, arXiv, September 27, 2024. https://doi.org/10.48550/arXiv.2409.18634.

Hamburger icon
Menu
Advanced Concepts Team