Bit score color key: <40, 40-50, 50-80, 80-200, >200
BLASTP 2.2.30+

Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A. Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402.

Reference for composition-based statistics: Alejandro A. Schaffer, L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001), "Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements", Nucleic Acids Res. 29:2994-3005.

Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF excluding environmental samples from WGS projects
           49,011,213 sequences; 17,563,301,199 total letters

Query= Contig-32_CDS_annotation_glimmer3.pl_2_4

Length=497

                                                                     Score    E
Sequences producing significant alignments:                         (Bits)  Value

gi|649557305|gb|KDS63784.1| capsid family protein                     94.4  2e-18
gi|649569140|gb|KDS75238.1| capsid family protein                     94.4  1e-17
gi|649555287|gb|KDS61824.1| capsid family protein                     94.0  3e-17
gi|492501782|ref|WP_005867318.1| hypothetical protein                 94.0  4e-17
gi|547312923|ref|WP_022044635.1| putative uncharacterized protein     87.4  1e-15
gi|547920049|ref|WP_022322420.1| capsid protein VP1                   88.2  3e-15
gi|494610271|ref|WP_007368517.1| capsid protein                       72.0  3e-10
gi|647452987|ref|WP_025792807.1| hypothetical protein                 70.9  9e-10
gi|565841287|ref|WP_023924568.1| hypothetical protein                 67.0  2e-08
gi|639237429|ref|WP_024568106.1| hypothetical protein                 61.6  7e-07


>gi|649557305|gb|KDS63784.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
gi|649559156|gb|KDS65543.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 6]
Length=245

 Score = 94.4 bits (233), Expect = 2e-18, Method: Compositional matrix adjust.
 Identities = 76/246 (31%), Positives = 111/246 (45%), Gaps = 19/246 (8%)

Query  253  LNRIAVSGGSYRDWLETVYASGQYIERCETPTFEGGTSQEIVFQEVVSNSATEN-EPLGT  311
            R A SG Y + + + + R + P F GG I EV+ S+T++ P
Sbjct  18   FERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSEVLQTSSTDSTSPQAN  77

Query  312  LAGRGINAGKQKGGKIKIRATEPGYMMCITSITPRIDYSQGNDFDSDWISLDDMHKPALD  371
            +AG GI+AG G E GY+M I SI PR Y QG D D + P
Sbjct  78   MAGHGISAGVNHG--FTRYFEEHGYIMGIMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFA  135

Query  372  GIGYQDSVNSGRAWWDDVYINNTGKAAKRTAGKTVAWIDYMTNVNRTYGNFAAGMSEAFM  431
            +G Q+ N +++Y+N + A + T G T + +Y + N +G+F M AF
Sbjct  136  HLGEQEIKN------EELYLNESDAANEGTFGYTPRYAEYKYSQNEVHGDFRGNM--AFW  187

Query  432  VLNRNYEMKYEAGTNPKISDLTTYIDPVKYNYIFADTSIDAMNFWVQIKFDITARRLMSA  491
            LNR ++ K P ++ TT+++ N +FA +WVQI DI A RLM
Sbjct  188  HLNRIFKEK------PNLN--TTFVECNPSNRVFATAETSDDKYWVQIYQDIKALRLMPK  239

Query  492  KQIPNL  497
            P L
Sbjct  240  YGTPML  245


>gi|649569140|gb|KDS75238.1| capsid family protein, partial [Parabacteroides distasonis str. 3999B T(B) 6]
Length=390

 Score = 94.4 bits (233), Expect = 1e-17, Method: Compositional matrix adjust.
 Identities = 76/246 (31%), Positives = 111/246 (45%), Gaps = 19/246 (8%)

Query  253  LNRIAVSGGSYRDWLETVYASGQYIERCETPTFEGGTSQEIVFQEVVSNSATEN-EPLGT  311
            R A SG Y + + + + R + P F GG I EV+ S+T++ P
Sbjct  163  FERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSEVLQTSSTDSTSPQAN  222

Query  312  LAGRGINAGKQKGGKIKIRATEPGYMMCITSITPRIDYSQGNDFDSDWISLDDMHKPALD  371
            +AG GI+AG G E GY+M I SI PR Y QG D D + P
Sbjct  223  MAGHGISAGVNHG--FTRYFEEHGYIMGIMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFA  280

Query  372  GIGYQDSVNSGRAWWDDVYINNTGKAAKRTAGKTVAWIDYMTNVNRTYGNFAAGMSEAFM  431
            +G Q+ N +++Y+N + A + T G T + +Y + N +G+F M AF
Sbjct  281  HLGEQEIKN------EELYLNESDAANEGTFGYTPRYAEYKYSQNEVHGDFRGNM--AFW  332

Query  432  VLNRNYEMKYEAGTNPKISDLTTYIDPVKYNYIFADTSIDAMNFWVQIKFDITARRLMSA  491
            LNR ++ K P ++ TT+++ N +FA +WVQI DI A RLM
Sbjct  333  HLNRIFKEK------PNLN--TTFVECNPSNRVFATAETSDDKYWVQIYQDIKALRLMPK  384

Query  492  KQIPNL  497
            P L
Sbjct  385  YGTPML  390


>gi|649555287|gb|KDS61824.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
gi|649560568|gb|KDS66876.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
gi|649561020|gb|KDS67307.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
gi|649562724|gb|KDS68908.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 6]
Length=541

 Score = 94.0 bits (232), Expect = 3e-17, Method: Compositional matrix adjust.
 Identities = 76/246 (31%), Positives = 111/246 (45%), Gaps = 19/246 (8%)

Query  253  LNRIAVSGGSYRDWLETVYASGQYIERCETPTFEGGTSQEIVFQEVVSNSATEN-EPLGT  311
            R A SG Y + + + + R + P F GG I EV+ S+T++ P
Sbjct  314  FERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSEVLQTSSTDSTSPQAN  373

Query  312  LAGRGINAGKQKGGKIKIRATEPGYMMCITSITPRIDYSQGNDFDSDWISLDDMHKPALD  371
            +AG GI+AG G E GY+M I SI PR Y QG D D + P
Sbjct  374  MAGHGISAGVNHG--FTRYFEEHGYIMGIMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFA  431

Query  372  GIGYQDSVNSGRAWWDDVYINNTGKAAKRTAGKTVAWIDYMTNVNRTYGNFAAGMSEAFM  431
            +G Q+ N +++Y+N + A + T G T + +Y + N +G+F M AF
Sbjct  432  HLGEQEIKN------EELYLNESDAANEGTFGYTPRYAEYKYSQNEVHGDFRGNM--AFW  483

Query  432  VLNRNYEMKYEAGTNPKISDLTTYIDPVKYNYIFADTSIDAMNFWVQIKFDITARRLMSA  491
            LNR ++ K P ++ TT+++ N +FA +WVQI DI A RLM
Sbjct  484  HLNRIFKEK------PNLN--TTFVECNPSNRVFATAETSDDKYWVQIYQDIKALRLMPK  535

Query  492  KQIPNL  497
            P L
Sbjct  536  YGTPML  541


>gi|492501782|ref|WP_005867318.1| hypothetical protein [Parabacteroides distasonis]
gi|409230408|gb|EKN23272.1| hypothetical protein HMPREF1059_03257 [Parabacteroides distasonis CL09T03C24]
Length=538

 Score = 94.0 bits (232), Expect = 4e-17, Method: Compositional matrix adjust.
 Identities = 77/268 (29%), Positives = 120/268 (45%), Gaps = 19/268 (7%)

Query  231  VDVSDGTLSMDALNLQQKVYNMLNRIAVSGGSYRDWLETVYASGQYIERCETPTFEGGTS  290
            V+V + +S++ L + R A SG Y + + + + R + P F GG
Sbjct  289  VNVDELGVSINDLRTSNALQRWFERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGR  348

Query  291  QEIVFQEVVSNSATEN-EPLGTLAGRGINAGKQKGGKIKIRATEPGYMMCITSITPRIDY  349
            I EV+ SAT++ P +AG GI+AG G K E GY++ I SI PR Y
Sbjct  349  TPISVSEVLQTSATDSTSPQANMAGHGISAGVNHG--FKRYFEEHGYIIGIMSIRPRTGY  406

Query  350  SQGNDFDSDWISLDDMHKPALDGIGYQDSVNSGRAWWDDVYINNTGKAAKRTAGKTVAWI  409
            QG D D + P +G Q+ N ++VY+ T + T G T +
Sbjct  407  QQGVPKDFRKFDNMDFYFPEFAHLGEQEIKN------EEVYLQQTPASNNGTFGYTPRYA  460

Query  410  DYMTNVNRTYGNFAAGMSEAFMVLNRNYEMKYEAGTNPKISDLTTYIDPVKYNYIFADTS  469
            +Y ++N +G+F M AF LNR + +P ++ TT+++ N +FA
Sbjct  461  EYKYSMNEVHGDFRGNM--AFWHLNRIFS------ESPNLN--TTFVECNPSNRVFATAE  510

Query  470  IDAMNFWVQIKFDITARRLMSAKQIPNL  497
            +W+Q+ D+ A RLM P L
Sbjct  511  TSDDKYWIQLYQDVKALRLMPKYGTPML  538


>gi|547312923|ref|WP_022044635.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
gi|524208404|emb|CCZ76639.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
Length=338

 Score = 87.4 bits (215), Expect = 1e-15, Method: Compositional matrix adjust.
 Identities = 89/333 (27%), Positives = 138/333 (41%), Gaps = 41/333 (12%)

Query  199  GLLLKTYNSDLYQNWINTEWIDGANGINEITAVDVSDGT---LSMDALNLQQKVYNMLNR  255
            GLL Y+ DL+ N I + A I + A+D++ T +++ L L+ K+ N ++R
Sbjct  11   GLLSVPYSPDLFGNIIK-QGSSPAVEIEVMNALDLNISTGFSVAVPELRLRTKIQNWMDR  69

Query  256  IAVSGGSYRDWLETVYASGQYIERCETPTFEGGTSQEIVFQEVVSNSATENEPLGTLAGR  315
            + VSGG D T++ + P F G V+Q ++ S G+ +G
Sbjct  70   LFVSGGRVGDVFRTLWGTKSSAIYVNKPDFLG------VWQASINPSNVRAMANGSASGE  123

Query  316  GINAGKQKG-----------GKIKIRATEPGYMMCITSITPRIDYSQGNDFDSDWISLDD  364
            N G+ I A EPG M IT + P YSQG D IS D
Sbjct  124  DANLGQLAACVDRYCDFSGHSGIDYYAKEPGTFMLITMLVPEPAYSQGLHPDLASISFGD  183

Query  365  MHKPALDGIGYQDSVN------------SGRAWWDDVYINNTGKAA-----KRTAGKTVA  407
            P L+GIG+Q +G + +TG + G+ VA
Sbjct  184  DFNPELNGIGFQLVPRHRFSMMPRGFNFTGLDQEASPWFGHTGTGVLVDPNMVSVGEEVA  243

Query  408  WIDYMTNVNRTYGNFAAGMSEAFMVLNRNYEMKYEAGTNPKISD---LTTYIDPVKYNYI  464
            W T+ +R +G+FA + + VL R + + D TYI+P+ + Y+
Sbjct  244  WSWLRTDYSRLHGDFAQNGNYQYWVLTRRFTTYFPDDGTGFYQDGEYTGTYINPLDWQYV  303

Query  465  FADTSIDAMNFWVQIKFDITARRLMSAKQIPNL  497
            F D ++ A NF FD+ +SA +P L
Sbjct  304  FVDQTLMAGNFAYYGTFDLNVTSSLSANYMPYL  336


>gi|547920049|ref|WP_022322420.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
gi|524592961|emb|CDD13573.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
Length=553

 Score = 88.2 bits (217), Expect = 3e-15, Method: Compositional matrix adjust.
 Identities = 79/275 (29%), Positives = 121/275 (44%), Gaps = 23/275 (8%)

Query  226  NEITAVDVSDGTLSMDALNLQQKVYNMLNRIAVSGGSYRDWLETVYASGQYIERCETPTF  285
            N V+V + ++++ L + R A G Y + + + + R + P F
Sbjct  299  NGTLKVNVDEMGININDLRTSNALQRWFERNARGGSRYIEQILSHFGVRSSDARLQRPQF  358

Query  286  EGGTSQEIVFQEVVSNSAT-ENEPLGTLAGRGINAGKQKGGKIKIRATEPGYMMCITSIT  344
            GG I EV+ S+T E P +AG GI+AG G K E GY++ I SIT
Sbjct  359  LGGGRMPISVSEVLQTSSTDETSPQANMAGHGISAGINNG--FKHYFEEHGYIIGIMSIT  416

Query  345  PRIDYSQGNDFDSDWISLDDM--HKPALDGIGYQDSVNSGRAWWDDVYINNTGKAAKRTA  402
            PR Y QG D+ D+M + P + Q+ N +D NN T
Sbjct  417  PRSGYQQG--VPRDFTKFDNMDFYFPEFAHLSEQEIKNQELFVSEDAAYNNG------TF  468

Query  403  GKTVAWIDYMTNVNRTYGNFAAGMSEAFMVLNRNYEMKYEAGTNPKISDLTTYIDPVKYN  462
            G T + +Y + + +G+F +S F LNR +E K P ++ TT+++ N
Sbjct  469  GYTPRYAEYKYHPSEAHGDFRGNLS--FWHLNRIFEDK------PNLN--TTFVECKPSN  518

Query  463  YIFADTSIDAMNFWVQIKFDITARRLMSAKQIPNL  497
            +FA + + FWVQ+ D+ A RLM P L
Sbjct  519  RVFATSETEDDKFWVQMYQDVKALRLMPKYGTPML  553


>gi|494610271|ref|WP_007368517.1| capsid protein [Prevotella multiformis]
gi|324988543|gb|EGC20506.1| putative capsid protein (F protein) [Prevotella multiformis DSM 16608]
Length=531

 Score = 72.0 bits (175), Expect = 3e-10, Method: Compositional matrix adjust.
 Identities = 77/305 (25%), Positives = 126/305 (41%), Gaps = 32/305 (10%)

Query  222  ANGINEITAVDVSDGTLSMDALNLQQKVYNMLNRIAVSGG-SYRDWLETVYASGQYIERC  280
            +NG++ + LS++ L + ML + G Y +E + R
Sbjct  230  SNGVSGASTFINGVSVLSVNDLRAAFALDKMLEATRRANGLDYSSQIEAHFGFRVPESRA  289

Query  281  ETPTFEGGTSQEIVFQEVVSNS-----ATENEPLGTLAGRGINAGKQKGGKIKIRATEPG  335
            F GG +V EVV+ S A E+ LG L G+G+ G I E G
Sbjct  290  GDARFIGGFDNPVVISEVVNQSEFDRGADESPCLGDLGGKGV--GSLNSSSIDFDVKEHG  347

Query  336  YMMCITSITPRIDYSQGNDFD--SDWISLDDMHKPALDGIGYQDSVNSG--RAWWDDVYI  391
            +MCI S+ P+ +Y G FD + + +D +P +GYQ V S + D+
Sbjct  348  IIMCIYSVVPQTEY-NGTYFDPFNRKLRREDFFQPEFADLGYQPVVTSDLISTYLDNPVP  406

Query  392  NNTGKAAKRTAGKTVAWID--------------YMTNVNRTYGNFAAGMSEAFMVLNR-N  436
            + K + AG ++ I+ Y T+ + +G F +G+S ++ R +
Sbjct  407  DGPEKQKRLAAGYPLSSIEANNRLLGWQVRYNEYKTSRDLVFGEFESGLSLSYWCSPRYD  466

Query  437  YEMKYEAG----TNPKISDLTTYIDPVKYNYIFADTSIDAMNFWVQIKFDITARRLMSAK  492
            + +AG N S Y++P N IF +++ A +F V FD+ A R MS
Sbjct  467  FGFDGKAGDKKLVNSPWSPAHFYVNPSILNTIFLVSAVKADHFLVNSFFDVKAVRPMSVS  526

Query  493  QIPNL  497
            + L
Sbjct  527  GLAGL  531


>gi|647452987|ref|WP_025792807.1| hypothetical protein [Prevotella histicola]
Length=584

 Score = 70.9 bits (172), Expect = 9e-10, Method: Compositional matrix adjust.
 Identities = 76/302 (25%), Positives = 127/302 (42%), Gaps = 47/302 (16%)

Query  233  VSDGTLSMDALNLQQKVYNMLNRIAVSGG-SYRDWLETVYASGQYIERCETPTFEGGTSQ  291
            VS + S++ L + ML + G Y +E + R F GG
Sbjct  293  VSPSSFSVNDLRAAFALDKMLEATRRANGLDYASQIEAHFGFKVPESRANDARFLGGFDN  352

Query  292  EIVFQEVVS---NSATE--NEPLGTLAGRGINAGKQKGGKIKIRATEPGYMMCITSITPR  346
            IV EVVS N+A++ + +G L G+GI G G I+ +TE G +MCI S+ P+
Sbjct  353  SIVVSEVVSTNGNAASDGSHASIGDLGGKGI--GSMSSGTIEFDSTEHGIIMCIYSVAPQ  410

Query  347  IDY--SQGNDFDSDWISLDDMHKPALDGIGYQD-----------SVNSGRAWWDDVYINN  393
            +Y S + F+ ++ + ++P +GYQ +N +A + D+ +NN
Sbjct  411  SEYNASYLDPFNRK-LTREQFYQPEFADLGYQALIGSDLICSTLGMNEKQAGFSDIELNN  469

Query  394  TGKAAKRTAGKTVAWIDYMTNVNRTYGNFAAGMSEAFMVLNRNYEMKY------------  441
            G V + +Y T + +G+F +G S ++ R ++ Y
Sbjct  470  N------LLGYQVRYNEYKTARDLVFGDFESGKSLSYWCTPR-FDFGYGDTEKKIAPENK  522

Query  442  ------EAGTNPKISDLTTYIDPVKYNYIFADTSIDAMNFWVQIKFDITARRLMSAKQIP  495
            + G S YI+P N IF +++ A +F V D+ A R MS +
Sbjct  523  GGADYRKKGNRSHWSSRNFYINPNLVNPIFLTSAVQADHFIVNSFLDVKAVRPMSVTGLS  582

Query  496  NL  497
            +L
Sbjct  583  SL  584


>gi|565841287|ref|WP_023924568.1| hypothetical protein [Prevotella nigrescens]
gi|564729907|gb|ETD29851.1| hypothetical protein HMPREF1173_00033 [Prevotella nigrescens CC14M]
Length=656

 Score = 67.0 bits (162), Expect = 2e-08, Method: Compositional matrix adjust.
 Identities = 72/283 (25%), Positives = 120/283 (42%), Gaps = 32/283 (11%)

Query  221  GANGINEITAVDVSDGTLSMDALNLQQKVYNMLNRI-AVSGGSYRDWLETVYASGQYIER  279
            G+N I+ I+ D+ +M AL ML R A +G Y + + + R
Sbjct  384  GSNNISYISPNDIR----AMFALE------KMLERTRAANGLDYSNQIAAHFGFKVPESR  433

Query  280  CETPTFEGGTSQEIVFQEVVSNS-------ATENEPLGTLAGRGINAGKQKGGKIKIRAT  332
            +F GG +I EVV+ S A+ +G + G+GI G G I
Sbjct  434  KNCASFIGGFDNQISISEVVTTSNGSVDGTASTGSVVGQVFGKGI--GAMNSGHISYDVK  491

Query  333  EPGYMMCITSITPRIDY--SQGNDFDSDWISLDDMHKPALDGIGYQDSVNSGRAWWDDVY  390
            E G +MCI SI P++DY + + F+ + S +D +P + +G Q + S +
Sbjct  492  EHGLIMCIYSIAPQVDYDARELDPFNRKF-SREDYFQPEFENLGMQPVIQSDLCLCINSA  550

Query  391  INNTGKAAKRTAGKTVAWIDYMTNVNRTYGNFAAGMS-EAFMVLNRNYEMKYEAGTNPKI  449
            +++ G + +++Y T + +G F +G S A+ NY ++ + P +
Sbjct  551  KSDSSDQHNNVLGYSARYLEYKTARDIIFGEFMSGGSLSAWATPKNNYTFEFGKLSLPDL  610

Query  450  SDLTTYIDPVKYNYIFA---DTSIDAMNFWVQIKFDITARRLM  489
            +DP IFA + S+ F V FD+ A R M
Sbjct  611  -----LVDPKVLEPIFAVKYNGSMSTDQFLVNSYFDVKAIRPM  648


>gi|639237429|ref|WP_024568106.1| hypothetical protein [Elizabethkingia anophelis]
Length=546

 Score = 61.6 bits (148), Expect = 7e-07, Method: Compositional matrix adjust.
 Identities = 62/245 (25%), Positives = 108/245 (44%), Gaps = 17/245 (7%)

Query  248  KVYNMLNRIAVSGGSYRDWLETVYASGQYIERCETPTFEGGTSQEIVFQEVVSNSATEN-  306
            K+ L + A +G Y + + + + R + P F GG I+ EV+ S+T++
Sbjct  308  KLQEWLEKNARAGSRYAESILSFFGVKTSDGRLQRPEFLGGNKTPILISEVLQQSSTDST  367

Query  307  EPLGTLAGRGINAGKQKGGKIKIRATEPGYMMCITSITPRIDYSQGNDFDSDWISLDDMH  366
            P G +AG GI+ GK+ GG K E GY++ + S+ P+ YSQG D
Sbjct  368  TPQGNMAGHGISVGKE-GGFSKF-FEEHGYVIGLMSVIPKTSYSQGIPRHFSKFDKFDYF  425

Query  367  KPALDGIGYQDSVNSGRAWWDDVYINNTGKA-AKRTAGKTVAWIDYMTNVNRTYGNFAAG  425
            P + IG Q N +++ N G + G + +Y + + +G+F
Sbjct  426  WPQFEHIGEQPVYNK------EIFAKNVGDYDSGGVFGYVPRYSEYKYSPSTIHGDFKDT  479

Query  426  MSEAFMVLNRNYEMKYEAGTNPKISDLTTYIDPVKYNYIFADTSIDAMNFWVQIKFDITA  485
            + F L R +++ PK++ ++ + IFA ++ F+ + ITA
Sbjct  480  L--YFWHLGR----IFDSSAPPKLNRDFIEVNKSGLSRIFA-VEDNSDKFYCHLYQKITA  532

Query  486  RRLMS  490
            +R MS
Sbjct  533  KRKMS  537


Lambda      K        H        a         alpha
   0.315    0.132    0.394    0.792     4.96

Gapped
Lambda      K        H        a         alpha    sigma
   0.267   0.0410    0.140     1.90     42.6     43.6

Effective search space used: 3489190903935
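
The per-alignment statistics lines in this report (for example `Identities = 76/246 (31%), Positives = 111/246 (45%), Gaps = 19/246 (8%)`) follow a regular pattern, so they can be extracted with a short script. The following is a minimal sketch, not part of BLAST itself: the function name `parse_alignment_stats` and the dictionary keys are hypothetical, and the regular expression assumes the exact wording shown in this report.

```python
import re

# Matches the per-alignment statistics line of a BLAST text report, e.g.
# "Identities = 76/246 (31%), Positives = 111/246 (45%), Gaps = 19/246 (8%)".
STATS_RE = re.compile(
    r"Identities = (\d+)/(\d+) \((\d+)%\), "
    r"Positives = (\d+)/(\d+) \((\d+)%\), "
    r"Gaps = (\d+)/(\d+) \((\d+)%\)"
)


def parse_alignment_stats(text):
    """Return one dict per statistics line found in `text`.

    Hypothetical helper for illustration; not part of BLAST.
    """
    hits = []
    for match in STATS_RE.finditer(text):
        ident, length, ident_pct, pos, _, pos_pct, gaps, _, gap_pct = map(
            int, match.groups()
        )
        hits.append(
            {
                "identities": ident,
                "positives": pos,
                "gaps": gaps,
                "alignment_length": length,
                # Percentages exactly as printed in the report (whole numbers).
                "reported_pct": (ident_pct, pos_pct, gap_pct),
                # Unrounded identity fraction recomputed from the counts.
                "identity_fraction": ident / length,
            }
        )
    return hits


sample = "Identities = 76/246 (31%), Positives = 111/246 (45%), Gaps = 19/246 (8%)"
for stats in parse_alignment_stats(sample):
    print(stats["identities"], stats["alignment_length"],
          f"{stats['identity_fraction']:.3f}")
```

Because the pattern is applied with `finditer`, the same function can be run over the whole report text to collect the statistics of every alignment in order of appearance.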