Asymptotic properties of words in semi-Markov sequences (Academic Article in Scopus)

abstract

  • © 2022 World Scientific Publishing Company. It is well known that some enzymes, proteins, and amino acids, among other biological molecules, can be coded in DNA in more than one way. That is, some biological molecules are identified by a set of sequences rather than by a single one. For instance, the enzyme SmaI can be recognized by the words CCCGGG and GGGCCC. In this paper, we count the number of times that a biological sequence occurs along the DNA in any of its configurations, i.e., we provide the strong law of large numbers for a word sequence. To this end, we model DNA as an ergodic semi-Markov chain. We also present the corresponding Central Limit Theorem. Additionally, we compute the first hitting position of a set of words.
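
The two quantities named in the abstract, the number of occurrences of a set of words and its first hitting position, can be illustrated on a fixed string. The following is a minimal sketch, not the paper's method: the function names and the sample sequence are hypothetical, no semi-Markov chain is simulated, and the first hitting position is assumed here to mean the 1-based position at which the earliest occurrence of any word in the set is completed.

```python
def count_occurrences(sequence, words):
    """Count occurrences (overlaps allowed) of every word in `words`."""
    return sum(
        1
        for start in range(len(sequence))
        for word in words
        if word and sequence.startswith(word, start)
    )


def first_hitting_position(sequence, words):
    """1-based position where the earliest occurrence of any word ends.

    Returns None if no word of the set occurs in the sequence.
    This definition is an assumption made for illustration.
    """
    ends = [sequence.find(w) + len(w) for w in words if w and w in sequence]
    return min(ends) if ends else None


# Hypothetical sample sequence; the word set is the one quoted in the abstract.
sample = "ACCCGGGTTGGGCCCA"
smai_words = ["CCCGGG", "GGGCCC"]

n_hits = count_occurrences(sample, smai_words)
print(n_hits, n_hits / len(sample), first_hitting_position(sample, smai_words))
```

The ratio `n_hits / len(sample)` is the empirical frequency whose long-run behavior a strong law of large numbers would describe; on a sequence this short it is only illustrative.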

publication date

  • January 1, 2022