bitscore colors: <40, 40-50, 50-80, 80-200, >200
BLASTP 2.2.30+

Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A. Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402.

Reference for composition-based statistics: Alejandro A. Schaffer, L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001), "Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements", Nucleic Acids Res. 29:2994-3005.

Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF excluding environmental samples from WGS projects
           49,011,213 sequences; 17,563,301,199 total letters

Query= Contig-31_CDS_annotation_glimmer3.pl_2_2
Length=322

                                                                    Score     E
Sequences producing significant alignments:                         (Bits)  Value

gi|649557305|gb|KDS63784.1|  capsid family protein                   83.6   1e-15
gi|492501782|ref|WP_005867318.1|  hypothetical protein               84.7   6e-15
gi|649569140|gb|KDS75238.1|  capsid family protein                   83.6   8e-15
gi|649555287|gb|KDS61824.1|  capsid family protein                   83.2   2e-14
gi|547920049|ref|WP_022322420.1|  capsid protein VP1                 82.0   4e-14
gi|494610271|ref|WP_007368517.1|  capsid protein                     81.6   6e-14
gi|494308783|ref|WP_007173938.1|  hypothetical protein               75.5   7e-12
gi|494306153|ref|WP_007173049.1|  hypothetical protein               74.7   1e-11
gi|609718276|emb|CDN73650.1|  conserved hypothetical protein         74.3   2e-11
gi|496521299|ref|WP_009229582.1|  capsid protein                     71.6   1e-10

>gi|649557305|gb|KDS63784.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
 gi|649559156|gb|KDS65543.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 6]
Length=245

Score = 83.6 bits (205), Expect = 1e-15, Method: Compositional matrix adjust.
Identities = 71/246 (29%), Positives = 113/246 (46%), Gaps = 20/246 (8%)

Query  79   LNRIAVSGGSYRDWQEAVFGVRVSRA-AESPIYVGGYASEIVFDEVVSTAAFESGETGQE  137
            R A SG Y + + FGVR S A + P ++GG + I EV+ T++ +S
Sbjct  18   FERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSEVLQTSSTDS----TS  73

Query  138  PLGSLAGRGRETSKRGGKNIKIRCEEPSLIMILGSIVPRVDYSQG-NKWWTRVDTMNDFH  196
            P ++AG G G EE IM + SI PR Y QG K + + D M DF+
Sbjct  74   PQANMAGHGISAGVNHG--FTRYFEEHGYIMGIMSIRPRTGYQQGVPKDFRKFDNM-DFY  130

Query  197  KPNLDQIGFQELLSQEMHGRAWRVNANYKTTDFSVGKQPAWTEYTTTVNETYGDFAAGEP  256
            P +G QE+ ++E++ +N + + + G P + EY + NE +GDF
Sbjct  131  FPEFAHLGEQEIKNEELY-----LNESDAANEGTFGYTPRYAEYKYSQNEVHGDFRGN--  183

Query  257  LEYMAFNRVYDVANDPKLKDATTYIDPQIFNKAFANSNLDAKNFWIQIGFDVIGRRVKSA  316
            + + NR++ P L TT+++ N+ FA + +W+QI D+ R+
Sbjct  184  MAFWHLNRIF--KEKPNLN--TTFVECNPSNRVFATAETSDDKYWVQIYQDIKALRLMPK  239

Query  317  REIPNL  322
            P L
Sbjct  240  YGTPML  245

>gi|492501782|ref|WP_005867318.1| hypothetical protein [Parabacteroides distasonis]
 gi|409230408|gb|EKN23272.1| hypothetical protein HMPREF1059_03257 [Parabacteroides distasonis CL09T03C24]
Length=538

Score = 84.7 bits (208), Expect = 6e-15, Method: Compositional matrix adjust.
Identities = 72/246 (29%), Positives = 115/246 (47%), Gaps = 20/246 (8%)

Query  79   LNRIAVSGGSYRDWQEAVFGVRVSRA-AESPIYVGGYASEIVFDEVVSTAAFESGETGQE  137
            R A SG Y + + FGVR S A + P ++GG + I EV+ T+A +S
Sbjct  311  FERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSEVLQTSATDS----TS  366

Query  138  PLGSLAGRGRETSKRGGKNIKIRCEEPSLIMILGSIVPRVDYSQG-NKWWTRVDTMNDFH  196
            P ++AG G G K EE I+ + SI PR Y QG K + + D M DF+
Sbjct  367  PQANMAGHGISAGVNHG--FKRYFEEHGYIIGIMSIRPRTGYQQGVPKDFRKFDNM-DFY  423

Query  197  KPNLDQIGFQELLSQEMHGRAWRVNANYKTTDFSVGKQPAWTEYTTTVNETYGDFAAGEP  256
            P +G QE+ ++E++ + + + + G P + EY ++NE +GDF
Sbjct  424  FPEFAHLGEQEIKNEEVY-----LQQTPASNNGTFGYTPRYAEYKYSMNEVHGDFRGN--  476

Query  257  LEYMAFNRVYDVANDPKLKDATTYIDPQIFNKAFANSNLDAKNFWIQIGFDVIGRRVKSA  316
            + + NR++ + P L TT+++ N+ FA + +WIQ+ DV R+
Sbjct  477  MAFWHLNRIF--SESPNLN--TTFVECNPSNRVFATAETSDDKYWIQLYQDVKALRLMPK  532

Query  317  REIPNL  322
            P L
Sbjct  533  YGTPML  538

>gi|649569140|gb|KDS75238.1| capsid family protein, partial [Parabacteroides distasonis str. 3999B T(B) 6]
Length=390

Score = 83.6 bits (205), Expect = 8e-15, Method: Compositional matrix adjust.
Identities = 71/246 (29%), Positives = 113/246 (46%), Gaps = 20/246 (8%)

Query  79   LNRIAVSGGSYRDWQEAVFGVRVSRA-AESPIYVGGYASEIVFDEVVSTAAFESGETGQE  137
            R A SG Y + + FGVR S A + P ++GG + I EV+ T++ +S
Sbjct  163  FERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSEVLQTSSTDS----TS  218

Query  138  PLGSLAGRGRETSKRGGKNIKIRCEEPSLIMILGSIVPRVDYSQG-NKWWTRVDTMNDFH  196
            P ++AG G G EE IM + SI PR Y QG K + + D M DF+
Sbjct  219  PQANMAGHGISAGVNHG--FTRYFEEHGYIMGIMSIRPRTGYQQGVPKDFRKFDNM-DFY  275

Query  197  KPNLDQIGFQELLSQEMHGRAWRVNANYKTTDFSVGKQPAWTEYTTTVNETYGDFAAGEP  256
            P +G QE+ ++E++ +N + + + G P + EY + NE +GDF
Sbjct  276  FPEFAHLGEQEIKNEELY-----LNESDAANEGTFGYTPRYAEYKYSQNEVHGDFRGN--  328

Query  257  LEYMAFNRVYDVANDPKLKDATTYIDPQIFNKAFANSNLDAKNFWIQIGFDVIGRRVKSA  316
            + + NR++ P L TT+++ N+ FA + +W+QI D+ R+
Sbjct  329  MAFWHLNRIFK--EKPNLN--TTFVECNPSNRVFATAETSDDKYWVQIYQDIKALRLMPK  384

Query  317  REIPNL  322
            P L
Sbjct  385  YGTPML  390

>gi|649555287|gb|KDS61824.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
 gi|649560568|gb|KDS66876.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
 gi|649561020|gb|KDS67307.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
 gi|649562724|gb|KDS68908.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 6]
Length=541

Score = 83.2 bits (204), Expect = 2e-14, Method: Compositional matrix adjust.
Identities = 71/246 (29%), Positives = 113/246 (46%), Gaps = 20/246 (8%)

Query  79   LNRIAVSGGSYRDWQEAVFGVRVSRA-AESPIYVGGYASEIVFDEVVSTAAFESGETGQE  137
            R A SG Y + + FGVR S A + P ++GG + I EV+ T++ +S
Sbjct  314  FERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSEVLQTSSTDS----TS  369

Query  138  PLGSLAGRGRETSKRGGKNIKIRCEEPSLIMILGSIVPRVDYSQG-NKWWTRVDTMNDFH  196
            P ++AG G G EE IM + SI PR Y QG K + + D M DF+
Sbjct  370  PQANMAGHGISAGVNHG--FTRYFEEHGYIMGIMSIRPRTGYQQGVPKDFRKFDNM-DFY  426

Query  197  KPNLDQIGFQELLSQEMHGRAWRVNANYKTTDFSVGKQPAWTEYTTTVNETYGDFAAGEP  256
            P +G QE+ ++E++ +N + + + G P + EY + NE +GDF
Sbjct  427  FPEFAHLGEQEIKNEELY-----LNESDAANEGTFGYTPRYAEYKYSQNEVHGDFRGN--  479

Query  257  LEYMAFNRVYDVANDPKLKDATTYIDPQIFNKAFANSNLDAKNFWIQIGFDVIGRRVKSA  316
            + + NR++ P L TT+++ N+ FA + +W+QI D+ R+
Sbjct  480  MAFWHLNRIFK--EKPNLN--TTFVECNPSNRVFATAETSDDKYWVQIYQDIKALRLMPK  535

Query  317  REIPNL  322
            P L
Sbjct  536  YGTPML  541

>gi|547920049|ref|WP_022322420.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
 gi|524592961|emb|CDD13573.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
Length=553

Score = 82.0 bits (201), Expect = 4e-14, Method: Compositional matrix adjust.
Identities = 73/246 (30%), Positives = 115/246 (47%), Gaps = 20/246 (8%)

Query  79   LNRIAVSGGSYRDWQEAVFGVRVSRA-AESPIYVGGYASEIVFDEVVSTAAFESGETGQE  137
            R A G Y + + FGVR S A + P ++GG I EV+ T++ + ET
Sbjct  326  FERNARGGSRYIEQILSHFGVRSSDARLQRPQFLGGGRMPISVSEVLQTSS--TDETS--  381

Query  138  PLGSLAGRGRETSKRGGKNIKIRCEEPSLIMILGSIVPRVDYSQG-NKWWTRVDTMNDFH  196
            P ++AG G G K EE I+ + SI PR Y QG + +T+ D M DF+
Sbjct  382  PQANMAGHGISAGINNG--FKHYFEEHGYIIGIMSITPRSGYQQGVPRDFTKFDNM-DFY  438

Query  197  KPNLDQIGFQELLSQEMHGRAWRVNANYKTTDFSVGKQPAWTEYTTTVNETYGDFAAGEP  256
            P + QE+ +QE+ V+ + + + G P + EY +E +GDF
Sbjct  439  FPEFAHLSEQEIKNQELF-----VSEDAAYNNGTFGYTPRYAEYKYHPSEAHGDFRGN--  491

Query  257  LEYMAFNRVYDVANDPKLKDATTYIDPQIFNKAFANSNLDAKNFWIQIGFDVIGRRVKSA  316
            L + NR+++ + P L TT+++ + N+ FA S + FW+Q+ DV R+
Sbjct  492  LSFWHLNRIFE--DKPNLN--TTFVECKPSNRVFATSETEDDKFWVQMYQDVKALRLMPK  547

Query  317  REIPNL  322
            P L
Sbjct  548  YGTPML  553

>gi|494610271|ref|WP_007368517.1| capsid protein [Prevotella multiformis]
 gi|324988543|gb|EGC20506.1| putative capsid protein (F protein) [Prevotella multiformis DSM 16608]
Length=531

Score = 81.6 bits (200), Expect = 6e-14, Method: Compositional matrix adjust.
Identities = 74/272 (27%), Positives = 117/272 (43%), Gaps = 44/272 (16%)

Query  86   GGSYRDWQEAVFGVRV--SRAAESPIYVGGYASEIVFDEVVSTAAFESGETGQEPLGSLA  143
            G Y EA FG RV SRA ++ ++GG+ + +V EVV+ + F+ G LG L
Sbjct  269  GLDYSSQIEAHFGFRVPESRAGDA-RFIGGFDNPVVISEVVNQSEFDRGADESPCLGDLG  327

Query  144  GRGRETSKRGGKNIKIRCEEPSLIMILGSIVPRVDYSQGNKWWTRVDTMN------DFHK  197
            G+G +I +E +IM + S+VP+ +Y+ T D N DF +
Sbjct  328  GKG--VGSLNSSSIDFDVKEHGIIMCIYSVVPQTEYNG-----TYFDPFNRKLRREDFFQ  380

Query  198  PNLDQIGFQELLSQEMHG------------RAWRVNANYKTTDFS-----VGKQPAWTEY  240
            P +G+Q +++ ++ + R+ A Y + +G Q + EY
Sbjct  381  PEFADLGYQPVVTSDLISTYLDNPVPDGPEKQKRLAAGYPLSSIEANNRLLGWQVRYNEY  440

Query  241  TTTVNETYGDFAAGEPLEYMAFNRVYDVANDPKLKD----------ATTYIDPQIFNKAF  290
            T+ + +G+F +G L Y R YD D K D A Y++P I N F
Sbjct  441  KTSRDLVFGEFESGLSLSYWCSPR-YDFGFDGKAGDKKLVNSPWSPAHFYVNPSILNTIF  499

Query  291  ANSNLDAKNFWIQIGFDVIGRRVKSAREIPNL  322
            S + A +F + FDV R S + L
Sbjct  500  LVSAVKADHFLVNSFFDVKAVRPMSVSGLAGL  531

>gi|494308783|ref|WP_007173938.1| hypothetical protein [Prevotella bergensis]
 gi|270333035|gb|EFA43821.1| putative capsid protein (F protein) [Prevotella bergensis DSM 17361]
Length=553

Score = 75.5 bits (184), Expect = 7e-12, Method: Compositional matrix adjust.
Identities = 67/259 (26%), Positives = 120/259 (46%), Gaps = 30/259 (12%)

Query  46   DGTN--GINEITSVDVTSGLLTMDALILQKKVYDMLNRIAVSGGSYRDWQEAVFGVRVSR  103
            DG+N +N D + G ++ +L V +L+ +G +++D A +GV +
Sbjct  276  DGSNFTRVNFGVDTDSSEGDFSVSSLRAAFAVDKLLSVTMRAGKTFQDQMRAHYGVEIPD  335

Query  104  AAESPI-YVGGYASEIVFDEVVSTAAFESGETGQEP--LGSLAGRGRETSKRGGKNIKIR  160
            + + + Y+GG+ S++ +V T+ + E E LG +AG+G + G I
Sbjct  336  SRDGRVNYLGGFDSDMQVSDVTQTSGTTATEYKPEAGYLGRVAGKG---TGSGRGRIVFD  392

Query  161  CEEPSLIMILGSIVPRVDYSQGNKWWTRVDTM------NDFHKPNLDQIGFQELLSQEMH  214
            +E ++M + S+VP++ Y TR+D M D+ P + +G Q L S +
Sbjct  393  AKEHGVLMCIYSLVPQIQYD-----CTRLDPMVDKLDRFDYFTPEFENLGMQPLNSSYI-  446

Query  215  GRAWRVNANYKTTDFS---VGKQPAWTEYTTTVNETYGDFAAGEPLEYMAFNRVYDVAND  271
            +++ TTD +G QP ++EY T ++ +G FA + L + +R
Sbjct  447  -------SSFCTTDPKNPVLGYQPRYSEYKTALDVNHGQFAQSDALSSWSVSRFRRWTTF  499

Query  272  PKLKDATTYIDPQIFNKAF  290
            P+L+ A IDP N F
Sbjct  500  PQLEIADFKIDPGCLNSIF  518

>gi|494306153|ref|WP_007173049.1| hypothetical protein [Prevotella bergensis]
 gi|270333881|gb|EFA44667.1| putative capsid protein (F protein) [Prevotella bergensis DSM 17361]
Length=519

Score = 74.7 bits (182), Expect = 1e-11, Method: Compositional matrix adjust.
Identities = 71/285 (25%), Positives = 126/285 (44%), Gaps = 32/285 (11%)

Query  22   SQCGLGIRTYLSDRFNNWLNTEWIDGTNG----INEITSVDVTSGLLTMDALILQKKVYD  77
            SQ I + D N+ ++ D + +N VD G ++ +L V
Sbjct  216  SQLFTFIPEFSDDEHLNFDRDQYADQSKSNFTQLNFPVDVDNNLGYFSVSSLRSAFAVDK  275

Query  78   MLNRIAVSGGSYRDWQEAVFGVRVSRAAESPI-YVGGYASEIVFDEVVSTAAFESGETGQ  136
            +L+ +G +++D A +GV + + + + Y+GG+ S++ +V T+ + E
Sbjct  276  LLSVTMRAGKTFQDQMRAHYGVEIPDSRDGRVNYLGGFDSDLQVSDVTQTSGTTATEYKP  335

Query  137  EP--LGSLAGRGRETSKRGGKNIKIRCEEPSLIMILGSIVPRVDYSQGNKWWTRVDTM--  192
            E LG +AG+G + G I +E ++M + S+VP++ Y TR+D M
Sbjct  336  EAGYLGRIAGKG---TGSGRGRIVFDAKEHGVLMCIYSLVPQIQYD-----CTRLDPMVD  387

Query  193  ----NDFHKPNLDQIGFQELLSQEMHGRAWRVNANYKTTDFS---VGKQPAWTEYTTTVN  245
            DF P + +G Q L S + +++ T D +G QP ++EY T ++
Sbjct  388  KLDRFDFFTPEFENLGMQPLNSSYI--------SSFCTPDPKNPVLGYQPRYSEYKTALD  439

Query  246  ETYGDFAAGEPLEYMAFNRVYDVANDPKLKDATTYIDPQIFNKAF  290
            +G FA + L + +R P+L+ A IDP N F
Sbjct  440  INHGQFAQNDALSSWSVSRFRRWTTFPQLEIADFKIDPGCLNSVF  484

>gi|609718276|emb|CDN73650.1| conserved hypothetical protein [Elizabethkingia anophelis]
Length=537

Score = 74.3 bits (181), Expect = 2e-11, Method: Compositional matrix adjust.
Identities = 62/244 (25%), Positives = 114/244 (47%), Gaps = 16/244 (7%)

Query  74   KVYDMLNRIAVSGGSYRDWQEAVFGVRVSRA-AESPIYVGGYASEIVFDEVVSTAAFESG  132
            K+ + L + A +G Y + + FGV+ S + P ++GG S I+ EV+ +A +S
Sbjct  299  KLQEWLEKNARAGSRYAESILSFFGVKTSDGRLQRPEFLGGNKSPIMISEVLQQSATDS-  357

Query  133  ETGQEPLGSLAGRGRETSKRGGKNIKIRCEEPSLIMILGSIVPRVDYSQG-NKWWTRVDT  191
            P G++AG G K GG EE ++ L S++P+ YSQG + +++ D
Sbjct  358  ---TTPQGNMAGHGIGIGKDGG--FSRFFEEHGYVIGLMSVIPKTSYSQGIPRHFSKSDK  412

Query  192  MNDFHKPNLDQIGFQELLSQEMHGRAWRVNANYKTTDFSVGKQPAWTEYTTTVNETYGDF  251
            D+ P + IG Q + ++E+ + N + ++ G P ++EY + + +GDF
Sbjct  413  F-DYFWPQFEHIGEQPVYNKEIFAK----NIDAFDSEAVFGYLPRYSEYKFSPSTVHGDF  467

Query  252  AAGEPLEYMAFNRVYDVANDPKLKDATTYIDPQIFNKAFANSNLDAKNFWIQIGFDVIGR  311
            + L + R++D P L + D ++ FA + D F+ + + +
Sbjct  468  K--DDLYFWHLGRIFDTDKPPVLNQSFIECDKNALSRIFAVED-DTDKFYCHLYQKITAK  524

Query  312  RVKS  315
            R S
Sbjct  525  RKMS  528

>gi|496521299|ref|WP_009229582.1| capsid protein [Prevotella sp. oral taxon 317]
 gi|288330570|gb|EFC69154.1| putative capsid protein (F protein) [Prevotella sp. oral taxon 317 str. F0108]
Length=541

Score = 71.6 bits (174), Expect = 1e-10, Method: Compositional matrix adjust.
Identities = 70/257 (27%), Positives = 110/257 (43%), Gaps = 28/257 (11%)

Query  46   DGTNGINEITSVDVTSGLLTMDALILQKKVYDMLNRIAVSGGSYRDWQEAVFGVRVSRAA  105
            DG + + S DV + A L K +L+ +G +Y + EA FGV VS
Sbjct  268  DGNSAKLNMASPDVLNVSAIRSAFALDK----LLSISMRAGKTYAEQIEAHFGVTVSEGR  323

Query  106  ESPIY-VGGYASEIVFDEVVSTAAFESGETGQEPLGSLAGR-GRETSKRGGK---NIKIR  160
            + +Y +GG+ S + +V T+ + + LAG G+ T K G I+
Sbjct  324  DGQVYYLGGFDSNVQVGDVTQTSGTTNPNVSEVGNAKLAGYLGKITGKGTGSGYGEIQFD  383

Query  161  CEEPSLIMILGSIVPRVDYSQGNKWWTRVD------TMNDFHKPNLDQIGFQELLSQEMH  214
            +EP ++M + S+VP + Y R+D T D+ P + +G Q ++
Sbjct  384  AKEPGVLMCIYSVVPAMQYD-----CMRLDPFVAKQTRGDYFIPEFENLGMQPIVPA---  435

Query  215  GRAWRVNANYKTTDFSVGKQPAWTEYTTTVNETYGDFAAGEPLEYMAFNRVYDVANDPKL  274
            V+ N + D S G QP ++EY T + +G FA GEPL Y + R
Sbjct  436  ----FVSLN-RAKDNSYGWQPRYSEYKTAFDINHGQFANGEPLSYWSIARARGSDTLNTF  490

Query  275  KDATTYIDPQIFNKAFA  291
            A I+P + FA
Sbjct  491  NVAALKINPHWLDSVFA  507

Lambda      K        H        a         alpha
   0.317    0.134    0.404    0.792     4.96

Gapped
Lambda      K        H        a         alpha    sigma
   0.267   0.0410    0.140     1.90     42.6     43.6

Effective search space used: 1793877651450