This paper introduces phylogenetic algorithms for analyzing pathogen genomic data at pandemic scale, with a focus on the evolution of SARS-CoV-2. To address the challenges of computational demand and data inaccuracy, the authors implement new models in the MAPLE software that identify variation in mutation rates and recurrent sequencing errors. By distinguishing real evolutionary changes from technical artifacts, the framework improves the accuracy of tree reconstruction for datasets containing millions of sequences. The work culminates in a reliable global phylogeny of over two million genomes, providing a high-resolution view of the virus's spread. These methodological improvements strengthen genomic epidemiology and offer tools for monitoring current and future infectious disease outbreaks.
References:
- De Maio N, Willemsen M, Martin S, et al. Rate variation and recurrent sequence errors in pandemic-scale phylogenetics. Nature Methods, 2026: 1-9.

