A Poisson model of sequence comparison and its application to coronavirus phylogeny

Xiaoqi Zheng; Yufang Qin; Jun Wang

doi:10.1016/j.mbs.2008.11.006

Math Biosci. 2009 Feb;217(2):159-66. doi: 10.1016/j.mbs.2008.11.006. Epub 2008 Dec 6.

Authors

Xiaoqi Zheng¹, Yufang Qin, Jun Wang

Affiliation

¹ Department of Applied Mathematics, Dalian University of Technology, Dalian 116024, People's Republic of China.

Abstract

In this paper, we propose two metrics for comparing DNA and protein sequences, both based on a Poisson model of word occurrences. Instead of comparing the frequencies of all fixed-length words in two sequences, we consider (1) the probability of 'generating' one sequence under the Poisson model estimated from the other, and (2) the differences in the expression levels of words between the two sequences. Phylogenetic trees of 25 viruses, including SARS-CoVs, are constructed to illustrate our approach.
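
The abstract does not give the formulas behind the two metrics, so the following is only a minimal sketch of how the first idea (scoring one sequence under a Poisson model estimated from the other) might look in practice. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the word length `k=2`, the pseudocount `pseudo`, the length scaling of the rates, and the symmetrized, length-normalized score.

```python
import math
from collections import Counter
from itertools import product


def word_counts(seq, k):
    """Count overlapping words of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def poisson_log_likelihood(source, target, k=2, alphabet="ACGT", pseudo=0.5):
    """Log-probability of the target's word counts under independent
    Poisson rates estimated from the source (illustrative assumption)."""
    src = word_counts(source, k)
    tgt = word_counts(target, k)
    n_src = max(len(source) - k + 1, 1)   # number of word positions in source
    n_tgt = max(len(target) - k + 1, 1)   # number of word positions in target
    n_words = len(alphabet) ** k
    total = 0.0
    for letters in product(alphabet, repeat=k):
        w = "".join(letters)
        # Smoothed per-position rate from the source, scaled to target length.
        lam = (src[w] + pseudo) / (n_src + pseudo * n_words) * n_tgt
        x = tgt[w]
        # log of the Poisson pmf: x*log(lam) - lam - log(x!)
        total += x * math.log(lam) - lam - math.lgamma(x + 1)
    return total


def poisson_dissimilarity(a, b, k=2, alphabet="ACGT"):
    """Symmetric, length-normalized score; smaller means more similar."""
    ll_ab = poisson_log_likelihood(a, b, k, alphabet) / max(len(b) - k + 1, 1)
    ll_ba = poisson_log_likelihood(b, a, k, alphabet) / max(len(a) - k + 1, 1)
    return -(ll_ab + ll_ba) / 2


if __name__ == "__main__":
    a = "ACGTACGTACGTACGT"
    b = "ACGTACGTACGTACGA"   # one substitution relative to a
    c = "GGGGGGCCCCCCGGGG"   # very different word composition
    print(poisson_dissimilarity(a, b), poisson_dissimilarity(a, c))
```

A matrix of such pairwise scores could in principle be passed to a standard distance-based tree-building method, but the abstract does not say which procedure the authors used to construct their trees.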

Publication types

Comparative Study
Research Support, Non-U.S. Gov't

MeSH terms

Animals
Base Sequence
Coronaviridae / genetics*
DNA, Mitochondrial / genetics
DNA, Viral / genetics
Humans
Models, Genetic*
Phylogeny
Poisson Distribution*

Substances

DNA, Mitochondrial
DNA, Viral