BMC Evol Biol. 2006 Jun 22;6:51. PubMed; doi 10.1186/1471-2148-6-51

PHOG-BLAST - a new generation tool for fast similarity search of protein families.

Merkeev, I. V.; Mironov, A. A.

BACKGROUND: The need to compare protein profiles arises frequently in protein research: comparison of protein families, domain searches, and resolution of orthology and paralogy. Existing fast algorithms can only compare a protein sequence with a protein sequence, or a profile with a sequence. Algorithms that compare profiles with each other rely on dynamic programming and complex scoring functions.

RESULTS: We developed a new algorithm, PHOG-BLAST, for fast similarity search of profiles. The algorithm discretizes a profile, converting it into a sequence over a finite alphabet, and uses hashing for fast search. To determine the optimal alphabet, we analyzed columns in reliable multiple alignments and obtained column clusters in the 20-dimensional profile space by applying a special clustering procedure. We show that the clustering procedure works best when its parameters are chosen so that 20 profile clusters are obtained, which can be interpreted as ancestral amino acid residues. With these clusters, fewer than 2% of columns in multiple alignments fall outside all clusters. We tested the performance of PHOG-BLAST against PSI-BLAST on three well-known databases of multiple alignments: COG, PFAM and BALIBASE. On the COG database both algorithms showed the same performance; on PFAM and BALIBASE, PHOG-BLAST was much superior to PSI-BLAST. PHOG-BLAST required 10-20 times less computer memory and computation time than PSI-BLAST.

CONCLUSION: Since PHOG-BLAST can compare multiple alignments of protein families, it can be used in different areas of comparative proteomics and protein evolution. For example, PHOG-BLAST helped to build the PHOG database of phylogenetic orthologous groups. An essential step in building this database was comparing the protein complements of different species and the orthologous groups of different taxa on a personal computer in reasonable time. When applied to detect weak similarity between protein families, PHOG-BLAST is less precise than rigorous profile-profile comparison methods, but it runs much faster and can be used as a hit pre-selection tool.
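
The two ideas named in the abstract, profile discretization and hashing, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the Euclidean nearest-centroid rule, the `max_dist` cutoff, the word length `k`, and the toy 3-letter alphabet (standing in for the 20 clusters over 20-dimensional columns) are all assumptions made for illustration.

```python
from collections import defaultdict
from math import dist


def discretize(profile, centroids, max_dist=None):
    """Map each profile column to the index of its nearest centroid.

    Columns farther than max_dist from every centroid get the letter -1,
    mimicking columns that fall outside all clusters (an assumption).
    """
    letters = []
    for column in profile:
        best = min(range(len(centroids)), key=lambda i: dist(column, centroids[i]))
        if max_dist is not None and dist(column, centroids[best]) > max_dist:
            best = -1
        letters.append(best)
    return letters


def build_index(database, k):
    """Hash every length-k word of every discretized profile to its positions."""
    index = defaultdict(list)
    for name, letters in database.items():
        for pos in range(len(letters) - k + 1):
            word = tuple(letters[pos:pos + k])
            if -1 in word:
                continue  # skip words containing an unclustered column
            index[word].append((name, pos))
    return index


def find_seeds(query, index, k):
    """Return (profile name, query position, target position) for shared words."""
    hits = []
    for qpos in range(len(query) - k + 1):
        word = tuple(query[qpos:qpos + k])
        for name, pos in index.get(word, ()):
            hits.append((name, qpos, pos))
    return hits


# Toy data: 3-dimensional columns and 3 hypothetical cluster centroids.
centroids = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
family_a = [(0.9, 0.1, 0.0), (0.1, 0.8, 0.1), (0.0, 0.2, 0.8), (0.7, 0.2, 0.1)]
query_profile = [(0.2, 0.7, 0.1), (0.1, 0.1, 0.8), (0.8, 0.1, 0.1)]

database = {"famA": discretize(family_a, centroids)}
index = build_index(database, k=2)
seeds = find_seeds(discretize(query_profile, centroids), index, k=2)
print(seeds)
```

Once profiles are reduced to strings over a small alphabet, exact word lookup in a hash table replaces column-by-column scoring. This is the general reason such a search can be fast. A real tool would still need to extend and score the seed hits, which this sketch omits.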



 

