INFORMATION DECOMPOSITION OF SYMBOLICAL SEQUENCES

Data Bases of DNA and Protein Sequences with Latent Periodicity

 

Bioinformatic group of the Center of Bioengineering of the Russian Academy of Sciences

 

 

Bioinformatic Education in Moscow Physical Engineering Institute

 

We have developed Information Decomposition (ID), a method for analyzing the content of any symbolical sequence. The ID method does not change the statistical properties of the sequence and calculates an information autocorrelation function. It is based on the calculation of the Shannon mutual information between the analyzed sequence and artificial symbolical sequences, and it can reveal latent periodicity that earlier mathematical methods fail to detect. Using this method on supercomputer clusters, we analyzed GenBank, completely sequenced genomes, and the SWISS-PROT data bank and found thousands of genes and proteins with different types of latent periodicity.
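The core idea can be illustrated with a minimal Python sketch. It assumes that the artificial sequence for a trial period n is the repeating index sequence 0, 1, ..., n-1, 0, 1, ..., and it computes only the raw mutual information in bits; the statistical significance estimates used in the publications listed below are not reproduced. The function names `mutual_information` and `information_spectrum` are hypothetical and chosen for this illustration only.

```python
import math
from collections import Counter


def mutual_information(seq, period):
    """Mutual information (bits) between seq and an artificial periodic
    sequence 0, 1, ..., period-1, 0, 1, ... (an assumed construction)."""
    n = len(seq)
    # Joint counts of (position within the period, symbol).
    joint = Counter((i % period, s) for i, s in enumerate(seq))
    # Marginal counts for period positions and for symbols.
    columns = Counter(i % period for i in range(n))
    symbols = Counter(seq)
    mi = 0.0
    for (c, s), count in joint.items():
        mi += (count / n) * math.log2(count * n / (columns[c] * symbols[s]))
    return mi


def information_spectrum(seq, max_period):
    """Mutual information for every trial period from 2 to max_period."""
    return {p: mutual_information(seq, p) for p in range(2, max_period + 1)}


if __name__ == "__main__":
    spectrum = information_spectrum("ACGT" * 25, 8)
    for period, value in spectrum.items():
        print(period, round(value, 3))
```

For a perfectly periodic toy sequence such as "ACGT" repeated, the spectrum peaks at the true period and its multiples, while unrelated trial periods give values close to zero. On real sequences the raw values also depend on sequence length and alphabet size, so they would need a significance estimate before any conclusion about periodicity is drawn.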

We show that the ID method remains stable when many random letter changes are introduced into the analyzed symbolical sequence. We demonstrate the efficiency of the method on poems as well as on DNA and protein sequences. In poems by A. Pushkin and W. Shakespeare we found latent periodicity of different lengths, which may reflect periodicity in the sound of the verse. Among DNA and protein sequences we found many with latent periodicity of different types and lengths. In the Swiss-Prot data bank we found latent periodicity in 93% of tyrosine and serine protein kinases and in 86% of proteins that contain an NAD+ site.

 

For more information about the DNA similarity search methods we have developed, please see:

 

 

1.        M B Chaley, E V Korotkov, D A Phoenix Relationships among isoacceptor tRNAs seems to support the coevolution theory of the origin of the genetic code. Journal of molecular evolution. 03/1999; 48(2):168-77

2.        E.V.Korotkov "New family wide spread mirror-reflected MB1 repeats in human genome", Molec.Biol. (USSR), V.25, P.250-263, 1991

3.        E.V.Korotkov "MB1 family repeats in genomes of many mammals", Izvestia of Akad. Sci. of USA, Seria Biology, No.4, 546-557, 1992.
4.        Korotkov E.V. "Fast method of homology and purine pyrimidine mutual relations between DNA sequences search" DNA Sequence V.4, 413-415, 1994
5.        Korotkov E.V., Korotkova M.A. "Enlarged similarity of nucleic acids sequences", DNA Research, v.3, N.3, 157-164, 1996
6.        Korotkov E.V. and Korotkova M.A. "MIRs: family repeats that is common for many vertebrates". Mol.Biol. v.34, 348-353, 2000.
7.        Korotkov E.V. and Korotkova M.A. "Study of the presence MIRs in the human 22 chromosome". Mol.Biol., v.34, 376-382, 2001
8.        Chaley M.B., Korotkov E.V. "Evolution of MIR elements located in the coding regions of human genome", Mol. Biol. v.35, 874-882, 2001.
9.        Data banks
10.     Chaley M.B., Frenkel F.E., Korotkov E.V. Skryabin K.G. Revealing and Functional Analysis of tRNA-like Sequences in Various Genomes. Gene, 335, 57-71, 2004


 

More details on the mathematical method of Information Decomposition (ID) and on the data bases can be found in the following publications:

1.        http://bioinf.narod.ru\Pub/korotkov.pdf

2.        Korotkov E.V. and Korotkova M.A. "DNA regions with latent periodicity in some human clones", DNA Sequence, V.5, pp.353-358, 1995.

3.        Korotkov E.V., Korotkova M.A., Tulko J.S. "Latent sequence periodicity of some oncogenes and DNA-binding protein genes", CABIOS, v.13, pp.37-44, 1997

4.        Korotkova M.A., Korotkov E.V. and Rudenko V.M. "Latent periodicity of protein sequences", Journal of Molecular Modelling, v.5, pp.103-115, 1999

5.        Korotkov E.V., Korotkova M.A., Rudenko V.M. and Skryabin K.G., "Latent periodicity of the protein sequences" Molecularnya Biologya (Russian), v.33, pp.611-617, 1999.

6.        Chaley M.B., Korotkov E.V. and Skryabin K.G. "Method revealing latent periodicity of the nucleotide sequences modified for a case of small samples" DNA Research, 6, 153-163, 1999.

7.        Chaley MB, Korotkov EV, Kudryashov NA "Latent periodicity of 21 bases typical for MCP II gene is widely present in various genes" DNA Sequence, 14, 33-52, 2003

8.        Korotkov EV, Korotkova MA, Kudryashov NA Information decomposition of symbolical texts. Los-Alamos Arxiv, 0302195

9.        Korotkov EV, Korotkova MA, Kudryashov NA Information decomposition method for analysis of symbolical sequences. Physics Letters A, v.312, 198-210, 2003.

10.     Korotkov EV, Korotkova MA, Kudryashov NA "Information approach for determination of periodicity of genetic texts" Molec. Biology (Russian) v.37, N3, pp.436-451, 2003.

11.     Laskin AA, Chaley MB, Korotkov EV and Kudryashov NA "Identification of NAD+ regions in the amino acid sequences of different proteins" Molec. Biology (Russian) v.37, N4, pp.663-673, 2003.

12.     A.A. Laskin, E.V. Korotkov, N.A. Kudryashov "Latent periodicity of many domains in protein sequences reflects their structure, function and evolution". pp. 135-144, in "BIOINFORMATICS OF GENOME REGULATION AND STRUCTURE", N.Kolchanov and R.Hofestaedt eds., Kluwer press, 2004

13.     Korotkov E.V. Enzyme as a thermal resonance pump.

14.     Laskin AA, Kudryashov NA, Skryabin KG, Korotkov EV. Latent periodicity of serine-threonine and tyrosine protein kinases and other protein families. Comput Biol Chem. 2005 29(3):229-243

15.     Turutina VP, Laskin AA, Skryabin K.G., Kudryashov N.A. and Korotkov EV, "Latent periodicity of many protein families", Biochemistry, 2006, 71, 18-31.

16.     Shelenkov A, Skryabin K, Korotkov EV. Search and classification of potential minisatellite sequences from bacterial genomes. DNA Res. v. 13(3):89-102. 2006.

17.     Turutina VP, Laskin AA, Skryabin K.G., Kudryashov N.A. and Korotkov EV, "Latent periodicity of 94 protein families", J. Comput. Biol. 2006, v.13:946-964.

18.     Laskin AA, Skryabin KG, Korotkov EV Latent periodicity of protein families, identified with the indel-aware algorithm. J Proteome Res. 2007 v.6, 862-868.

19.     Shelenkov A, Korotkov A, Korotkov E. MMsat-a database of potential micro- and minisatellites. 409, 53-60, Gene. 2008.

20.     Shelenkov AA, Skryabin KG, Korotkov EV Search and classification of potential minisatellite sequences from plants genomes, Genetika (Rus), v.44, pp.120-136, 2008.

21.     Shelenkov AA, Korotkov EV The search of regular sequences in promoters from different species with help of run test. Mathematical Biology and Bioinformatics, v.3, N.1, pp.1-15, 2008.

22.     Frenkel FE, Korotkov EV. Classification analysis of triplet periodicity in protein-coding regions of genes. Gene. 2008; 421(1-2):52-60.

23.     F. E. Frenkel', E. V. Korotkov. Classification of triplet periodicity of gene sequences from KEGG data bank. Molekulyarnaya biologiya, v.42, No.4, pp. 707-720. 2008

24.     Korotkov E.V., Rudenko V.M. "Phase shift of triplet periodicity in gene sequences". Mathematical Biology and Bioinformatics (Russian), 2009, v.4, No.2, pp.66-80.

25.     Shelenkov A, Korotkov E. Search of regular sequences in promoters from eukaryotic genomes. Comput Biol Chem. 2009;33:196-204

26.     Frenkel FE, Korotkov EV. Using triplet periodicity of nucleotide sequences for finding potential reading frame shifts in genes. DNA Res. 2009;16:105-114.

27.     EV Korotkov, MA Korotkova "Bioinformatics and search of shifts of reading frame in genes" Information technologies and computation systems (Russian), No.1, pp.1-23, 2010.

28.     EV Korotkov, MA Korotkova "Study of the triplet periodicity phase shifts in genes", Journal of Integrative Bioinformatics, v.7, 131-141, 2010

29.     Y.M. Suvorova, Korotkov E.V. Splicing of the triplet periodicity in genes from different species. In Proceedings of the 6th International Symposium of Health Informatics and Bioinformatics, Izmir, Turkey, 2-5 May 2011 (http://hibit.iyte.edu.tr), pp.246-250

30.     Rudenko V.M. and Korotkov E.V. "Search of latent periodicity in the financial time series by the cyclic decomposition method", Applied Informatics (in Russian), No.3, 2011

31.     Korotkov E.V., Rudenko V.M. and Suvorova Yu.G. "Using of triplet periodicity of DNA sequences for search of spliced genes", in press.

 

 

 

 

 

Contacts: genekorotkov@gmail.com

 

 
