BLASTP 2.2.30+

Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A. Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402.

Reference for composition-based statistics: Alejandro A. Schaffer, L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001), "Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements", Nucleic Acids Res. 29:2994-3005.

Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF excluding environmental samples from WGS projects
           49,011,213 sequences; 17,563,301,199 total letters

Query= Contig-34_CDS_annotation_glimmer3.pl_2_2

Length=673

                                                                   Score     E
Sequences producing significant alignments:                       (Bits)   Value

gi|649557305|gb|KDS63784.1| capsid family protein                  82.4    4e-14
gi|649569140|gb|KDS75238.1| capsid family protein                  82.4    2e-13
gi|649555287|gb|KDS61824.1| capsid family protein                  82.4    3e-13
gi|492501782|ref|WP_005867318.1| hypothetical protein              80.5    1e-12
gi|547920049|ref|WP_022322420.1| capsid protein VP1                78.2    7e-12
gi|547312923|ref|WP_022044635.1| putative uncharacterized protein  69.3    2e-09
gi|494308783|ref|WP_007173938.1| hypothetical protein              62.0    9e-07
gi|496521299|ref|WP_009229582.1| capsid protein                    61.6    1e-06
gi|517172762|ref|WP_018361580.1| hypothetical protein              59.3    6e-06
gi|494306153|ref|WP_007173049.1| hypothetical protein              58.9    7e-06

>gi|649557305|gb|KDS63784.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
 gi|649559156|gb|KDS65543.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 6]
Length=245

 Score = 82.4 bits (202), Expect = 4e-14, Method: Compositional matrix adjust.
 Identities = 62/208 (30%), Positives = 99/208 (48%), Gaps = 13/208 (6%)

Query  472  VETPIYLGGSSMEIEFQEVVNNSGTED-QPLGSLAGRGVTDNHKGGVIKYKPDEPGYIFC  530
            ++ P +LGG I EV+ S T+ P ++AG G++ G +Y +E GYI
Sbjct  45   LQRPQFLGGGRTPISVSEVLQTSSTDSTSPQANMAGHGISAGVNHGFTRYF-EEHGYIMG  103

Query  531  ITSITPRVDYYQGNDWDLEIETLDDLHKPQLDGIGFQD----RLYKNINSSAKREDLIKS  586
            I SI PR Y QG D D + P+ +G Q+ LY N + +A +
Sbjct  104  IMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEELYLNESDAANE----GT  159

Query  587  IGKQPAWLDYMTSFNRNYGNFALIENEGWMCLNRVFGDIDTY-TTYIQPHLYNNIFADTD  645
            G P + +Y S N +G+F N + LNR+F + TT+++ + N +FA +
Sbjct  160  FGYTPRYAEYKYSQNEVHGDFR--GNMAFWHLNRIFKEKPNLNTTFVECNPSNRVFATAE  217

Query  646  VAAQNFWVQIAFNVEARRVMSAKVIPNL  673
            + +WVQI +++A R+M P L
Sbjct  218  TSDDKYWVQIYQDIKALRLMPKYGTPML  245

>gi|649569140|gb|KDS75238.1| capsid family protein, partial [Parabacteroides distasonis str. 3999B T(B) 6]
Length=390

 Score = 82.4 bits (202), Expect = 2e-13, Method: Compositional matrix adjust.
 Identities = 62/208 (30%), Positives = 99/208 (48%), Gaps = 13/208 (6%)

Query  472  VETPIYLGGSSMEIEFQEVVNNSGTED-QPLGSLAGRGVTDNHKGGVIKYKPDEPGYIFC  530
            ++ P +LGG I EV+ S T+ P ++AG G++ G +Y +E GYI
Sbjct  190  LQRPQFLGGGRTPISVSEVLQTSSTDSTSPQANMAGHGISAGVNHGFTRYF-EEHGYIMG  248

Query  531  ITSITPRVDYYQGNDWDLEIETLDDLHKPQLDGIGFQD----RLYKNINSSAKREDLIKS  586
            I SI PR Y QG D D + P+ +G Q+ LY N + +A +
Sbjct  249  IMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEELYLNESDAANE----GT  304

Query  587  IGKQPAWLDYMTSFNRNYGNFALIENEGWMCLNRVFGDIDTY-TTYIQPHLYNNIFADTD  645
            G P + +Y S N +G+F N + LNR+F + TT+++ + N +FA +
Sbjct  305  FGYTPRYAEYKYSQNEVHGDFR--GNMAFWHLNRIFKEKPNLNTTFVECNPSNRVFATAE  362

Query  646  VAAQNFWVQIAFNVEARRVMSAKVIPNL  673
            + +WVQI +++A R+M P L
Sbjct  363  TSDDKYWVQIYQDIKALRLMPKYGTPML  390

>gi|649555287|gb|KDS61824.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
 gi|649560568|gb|KDS66876.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
 gi|649561020|gb|KDS67307.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
 gi|649562724|gb|KDS68908.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 6]
Length=541

 Score = 82.4 bits (202), Expect = 3e-13, Method: Compositional matrix adjust.
 Identities = 62/208 (30%), Positives = 99/208 (48%), Gaps = 13/208 (6%)

Query  472  VETPIYLGGSSMEIEFQEVVNNSGTED-QPLGSLAGRGVTDNHKGGVIKYKPDEPGYIFC  530
            ++ P +LGG I EV+ S T+ P ++AG G++ G +Y +E GYI
Sbjct  341  LQRPQFLGGGRTPISVSEVLQTSSTDSTSPQANMAGHGISAGVNHGFTRYF-EEHGYIMG  399

Query  531  ITSITPRVDYYQGNDWDLEIETLDDLHKPQLDGIGFQD----RLYKNINSSAKREDLIKS  586
            I SI PR Y QG D D + P+ +G Q+ LY N + +A +
Sbjct  400  IMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEELYLNESDAANE----GT  455

Query  587  IGKQPAWLDYMTSFNRNYGNFALIENEGWMCLNRVFGDIDTY-TTYIQPHLYNNIFADTD  645
            G P + +Y S N +G+F N + LNR+F + TT+++ + N +FA +
Sbjct  456  FGYTPRYAEYKYSQNEVHGDFR--GNMAFWHLNRIFKEKPNLNTTFVECNPSNRVFATAE  513

Query  646  VAAQNFWVQIAFNVEARRVMSAKVIPNL  673
            + +WVQI +++A R+M P L
Sbjct  514  TSDDKYWVQIYQDIKALRLMPKYGTPML  541

 Score = 43.5 bits (101), Expect = 0.58, Method: Compositional matrix adjust.
 Identities = 22/74 (30%), Positives = 33/74 (45%), Gaps = 0/74 (0%)

Query  18   TRLNNYNRSTHDLSRVIRTTAAPGTLIPTFKEKALPGDTFNIKIRSHILTHPTVGPLFGS  77
            +L R+ +LS + T G LIP + +PGD F + + P V P+
Sbjct  8    VKLKRPRRNVFNLSYENKLTVNAGELIPIMCKPVVPGDKFRVNTEMLVRLAPLVAPMMHR  67

Query  78   FKVQNDFFFCPDRL  91
            V +FF P+RL
Sbjct  68   VDVFTHYFFVPNRL  81

>gi|492501782|ref|WP_005867318.1| hypothetical protein [Parabacteroides distasonis]
 gi|409230408|gb|EKN23272.1| hypothetical protein HMPREF1059_03257 [Parabacteroides distasonis CL09T03C24]
Length=538

 Score = 80.5 bits (197), Expect = 1e-12, Method: Compositional matrix adjust.
 Identities = 78/297 (26%), Positives = 130/297 (44%), Gaps = 24/297 (8%)

Query  383  RPSCSYPLVGLALKTYQSDINTNWVNTEWLDGDSGINSITAIDTSGGSFTLDTLNLAKKV  442
            RP+ + LVG AL +D +L+ D N +D G S ++ L + +
Sbjct  260  RPAGAMQLVGGALIAGGTD-------GAYLEPD---NFQVNVDELGVS--INDLRTSNAL  307

Query  443  YTMLNRIAISDGSYNAWIQTVYTSGGLN----HVETPIYLGGSSMEIEFQEVVNNSGTED  498
            R A S Y I+ + + G+ ++ P +LGG I EV+ S T+
Sbjct  308  QRWFERNARSGSRY---IEQILSHFGVRSSDARLQRPQFLGGGRTPISVSEVLQTSATDS  364

Query  499  -QPLGSLAGRGVTDNHKGGVIKYKPDEPGYIFCITSITPRVDYYQGNDWDLEIETLDDLH  557
            P ++AG G++ G +Y +E GYI I SI PR Y QG D D +
Sbjct  365  TSPQANMAGHGISAGVNHGFKRYF-EEHGYIIGIMSIRPRTGYQQGVPKDFRKFDNMDFY  423

Query  558  KPQLDGIGFQDRLYKNINSSAKREDLIKSIGKQPAWLDYMTSFNRNYGNFALIENEGWMC  617
            P+ +G Q+ + + + G P + +Y S N +G+F N +
Sbjct  424  FPEFAHLGEQEIKNEEVYLQQTPASNNGTFGYTPRYAEYKYSMNEVHGDFR--GNMAFWH  481

Query  618  LNRVFGDIDTY-TTYIQPHLYNNIFADTDVAAQNFWVQIAFNVEARRVMSAKVIPNL  673
            LNR+F + TT+++ + N +FA + + +W+Q+ +V+A R+M P L
Sbjct  482  LNRIFSESPNLNTTFVECNPSNRVFATAETSDDKYWIQLYQDVKALRLMPKYGTPML  538

 Score = 44.3 bits (103), Expect = 0.29, Method: Compositional matrix adjust.
 Identities = 22/74 (30%), Positives = 34/74 (46%), Gaps = 0/74 (0%)

Query  18   TRLNNYNRSTHDLSRVIRTTAAPGTLIPTFKEKALPGDTFNIKIRSHILTHPTVGPLFGS  77
            +L R+ +LS + TA G L+P + +PGD F + + P V P+
Sbjct  8    VKLKRPRRNVFNLSYENKLTANAGELVPIMCKPVVPGDKFRVNTEMLVRLAPLVAPMMHR  67

Query  78   FKVQNDFFFCPDRL  91
            V +FF P+RL
Sbjct  68   VDVFTHYFFVPNRL  81

>gi|547920049|ref|WP_022322420.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
 gi|524592961|emb|CDD13573.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
Length=553

 Score = 78.2 bits (191), Expect = 7e-12, Method: Compositional matrix adjust.
 Identities = 61/204 (30%), Positives = 95/204 (47%), Gaps = 5/204 (2%)

Query  472  VETPIYLGGSSMEIEFQEVVNNSGT-EDQPLGSLAGRGVTDNHKGGVIKYKPDEPGYIFC  530
            ++ P +LGG M I EV+ S T E P ++AG G++ G K+ +E GYI
Sbjct  353  LQRPQFLGGGRMPISVSEVLQTSSTDETSPQANMAGHGISAGINNG-FKHYFEEHGYIIG  411

Query  531  ITSITPRVDYYQGNDWDLEIETLDDLHKPQLDGIGFQDRLYKNINSSAKREDLIKSIGKQ  590
            I SITPR Y QG D D + P+ + Q+ + + S + G
Sbjct  412  IMSITPRSGYQQGVPRDFTKFDNMDFYFPEFAHLSEQEIKNQELFVSEDAAYNNGTFGYT  471

Query  591  PAWLDYMTSFNRNYGNFALIENEGWMCLNRVFGDIDTY-TTYIQPHLYNNIFADTDVAAQ  649
            P + +Y + +G+F N + LNR+F D TT+++ N +FA ++
Sbjct  472  PRYAEYKYHPSEAHGDFR--GNLSFWHLNRIFEDKPNLNTTFVECKPSNRVFATSETEDD  529

Query  650  NFWVQIAFNVEARRVMSAKVIPNL  673
            FWVQ+ +V+A R+M P L
Sbjct  530  KFWVQMYQDVKALRLMPKYGTPML  553

 Score = 49.3 bits (116), Expect = 0.009, Method: Compositional matrix adjust.
 Identities = 50/191 (26%), Positives = 79/191 (41%), Gaps = 19/191 (10%)

Query  19   RLNNYNRSTHDLSRVIRTTAAPGTLIPTFKEKALPGDTFNIKIRSHILTHPTVGPLFGSF  78
            R+ R+ +LS + T G L+P + GD F +K S + P V P+
Sbjct  9    RMKRPRRNAFNLSYESKLTLNMGELVPIMCMPVVSGDKFRVKTESLVRLAPLVAPMMHRV  68

Query  79   KVQNDFFFCPDRLYIAALHNNALNIGLNMKFIKYPIIGGIRKTLWDDSTTMEGSNGTLKR  138
            V +FF P+RL + + + G++ + P+ I+ + + + S +K
Sbjct  69   NVFTHYFFVPNRL-VWNEWEDFITKGVDGE--DMPMFPKIQI---NQDSHLVSSASLIKE  122

Query  139  EVNPSSLPAYLGYRALS----------NEIETPV-LTVKAIPFFGYFDIFKNYY--ANKQ  185
            SSL YLG LS N ++ P V A+PF Y I+ YY N
Sbjct  123  YFGDSSLWDYLGLPTLSACGNKSYDVVNGVKVPSGFQVSALPFRAYQLIYNEYYRDQNLT  182

Query  186  EEYFYTIGGVT  196
            E +T+G T
Sbjct  183  EPIDFTLGSGT  193

>gi|547312923|ref|WP_022044635.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
 gi|524208404|emb|CCZ76639.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
Length=338

 Score = 69.3 bits (168), Expect = 2e-09, Method: Compositional matrix adjust.
 Identities = 79/305 (26%), Positives = 123/305 (40%), Gaps = 53/305 (17%)

Query  418  INSITAID---TSGGSFTLDTLNLAKKVYTMLNRIAISDGSYNAWIQTVY-TSGGLNHVE  473
            I + A+D ++G S + L L K+ ++R+ +S G +T++ T +V
Sbjct  36   IEVMNALDLNISTGFSVAVPELRLRTKIQNWMDRLFVSGGRVGDVFRTLWGTKSSAIYVN  95

Query  474  TPIYLGGSSMEIE---FQEVVNNSGT-EDQPLGSLAGRGVTD------NHKGGVIKYKPD  523
            P +LG I + + N S + ED LG LA D H G I Y
Sbjct  96   KPDFLGVWQASINPSNVRAMANGSASGEDANLGQLAA--CVDRYCDFSGHSG--IDYYAK  151

Query  524  EPGYIFCITSITPRVDYYQGNDWDLEIETLDDLHKPQLDGIGFQ-------DRLYKNINS  576
            EPG IT + P Y QG DL + D P+L+GIGFQ + + N
Sbjct  152  EPGTFMLITMLVPEPAYSQGLHPDLASISFGDDFNPELNGIGFQLVPRHRFSMMPRGFNF  211

Query  577  SAKREDL----------------IKSIGKQPAWLDYMTSFNRNYGNFALIENEGWMCLNR  620
            + ++ + S+G++ AW T ++R +G+FA N + L R
Sbjct  212  TGLDQEASPWFGHTGTGVLVDPNMVSVGEEVAWSWLRTDYSRLHGDFAQNGNYQYWVLTR  271

Query  621  VF------------GDIDTYTTYIQPHLYNNIFADTDVAAQNFWVQIAFNVEARRVMSAK  668
            F D + TYI P + +F D + A NF F++ +SA
Sbjct  272  RFTTYFPDDGTGFYQDGEYTGTYINPLDWQYVFVDQTLMAGNFAYYGTFDLNVTSSLSAN  331

Query  669  VIPNL  673
            +P L
Sbjct  332  YMPYL  336

>gi|494308783|ref|WP_007173938.1| hypothetical protein [Prevotella bergensis]
 gi|270333035|gb|EFA43821.1| putative capsid protein (F protein) [Prevotella bergensis DSM 17361]
Length=553

 Score = 62.0 bits (149), Expect = 9e-07, Method: Compositional matrix adjust.
 Identities = 61/238 (26%), Positives = 103/238 (43%), Gaps = 30/238 (13%)

Query  425  DTSGGSFTLDTLNLAKKVYTMLNRIAISDGSYNAWIQTVYTSGGLNHVETPI-------Y  477
            D+S G F++ +L A V +L+ + ++ ++ Y VE P Y
Sbjct  290  DSSEGDFSVSSLRAAFAVDKLLSVTMRAGKTFQDQMRAHYG------VEIPDSRDGRVNY  343

Query  478  LGGSSMEIEFQEVVNNSGT---EDQP----LGSLAGRGVTDNHKGGVIKYKPDEPGYIFC  530
            LGG +++ +V SGT E +P LG +AG+G G I + E G + C
Sbjct  344  LGGFDSDMQVSDVTQTSGTTATEYKPEAGYLGRVAGKGTGSGR--GRIVFDAKEHGVLMC  401

Query  531  ITSITPRVDYYQGNDWDLEIETLD--DLHKPQLDGIGFQDRLYKNINSSAKREDLIKSIG  588
            I S+ P++ Y D ++ LD D P+ + +G Q I+S + +G
Sbjct  402  IYSLVPQIQY-DCTRLDPMVDKLDRFDYFTPEFENLGMQPLNSSYISSFCTTDPKNPVLG  460

Query  589  KQPAWLDYMTSFNRNYGNFALIENEGWMCLNR-----VFGDIDTYTTYIQPHLYNNIF  641
            QP + +Y T+ + N+G FA + ++R F ++ I P N+IF
Sbjct  461  YQPRYSEYKTALDVNHGQFAQSDALSSWSVSRFRRWTTFPQLEIADFKIDPGCLNSIF  518

>gi|496521299|ref|WP_009229582.1| capsid protein [Prevotella sp. oral taxon 317]
 gi|288330570|gb|EFC69154.1| putative capsid protein (F protein) [Prevotella sp. oral taxon 317 str. F0108]
Length=541

 Score = 61.6 bits (148), Expect = 1e-06, Method: Compositional matrix adjust.
 Identities = 58/220 (26%), Positives = 98/220 (45%), Gaps = 30/220 (14%)

Query  477  YLGGSSMEIEFQEVVNNSGTEDQP------------LGSLAGRGVTDNHKGGVIKYKPDE  524
            YLGG ++ +V SGT + LG + G+G + G I++ E
Sbjct  329  YLGGFDSNVQVGDVTQTSGTTNPNVSEVGNAKLAGYLGKITGKGTGSGY--GEIQFDAKE  386

Query  525  PGYIFCITSITPRVDY-YQGNDWDLEIETLDDLHKPQLDGIGFQDRLYKNINSSAKREDL  583
            PG + CI S+ P + Y D + +T D P+ + +G Q + ++ + +++
Sbjct  387  PGVLMCIYSVVPAMQYDCMRLDPFVAKQTRGDYFIPEFENLGMQPIVPAFVSLNRAKDN-  445

Query  584  IKSIGKQPAWLDYMTSFNRNYGNFALIENEGWMCLNRVFGDIDTYTTY------IQPHLY  637
            S G QP + +Y T+F+ N+G FA E + + R G DT T+ I PH
Sbjct  446  --SYGWQPRYSEYKTAFDINHGQFANGEPLSYWSIARARGS-DTLNTFNVAALKINPHWL  502

Query  638  NNIFA----DTDVAAQNFWVQIAFNVEARRVMSAKVIPNL  673
            +++FA T+V F FN+E M+ +P +
Sbjct  503  DSVFAVNYNGTEVTDCMFGYA-HFNIEKVSDMTEDGMPRV  541

>gi|517172762|ref|WP_018361580.1| hypothetical protein [Prevotella nanceiensis]
Length=568

 Score = 59.3 bits (142), Expect = 6e-06, Method: Compositional matrix adjust.
 Identities = 49/185 (26%), Positives = 81/185 (44%), Gaps = 20/185 (11%)

Query  477  YLGGSSMEIEFQEVVNNSGT-----EDQPLGSLAGR--GVTDNHKGGVIKYKPDEPGYIF  529
            Y+GG I+ +V +SGT +D G GR G G I++ E G +
Sbjct  351  YIGGFDSNIQVGDVTQSSGTTVTGTKDTSFGGYLGRTTGKATGSGSGHIRFDAKEHGILM  410

Query  530  CITSITPRVDYYQGNDWDLEIETLD--DLHKPQLDGIGFQDRLYKNI------NSSAKRE  581
            CI S+ P V Y D ++ ++ D P+ + +G Q KNI N++ R
Sbjct  411  CIYSLVPDVQY-DSKRVDPFVQKIERGDFFVPEFENLGMQPLFAKNISYKYNNNTANSRI  469

Query  582  DLIKSIGKQPAWLDYMTSFNRNYGNFALIENEGWMCLNRVFGD----IDTYTTYIQPHLY  637
            + + G QP + +Y T+ + N+G F E + + R G+ + T I P
Sbjct  470  KNLGAFGWQPRYSEYKTALDINHGQFVHQEPLSYWTVARARGESMSNFNISTFKINPKWL  529

Query  638  NNIFA  642
            +++FA
Sbjct  530  DDVFA  534

>gi|494306153|ref|WP_007173049.1| hypothetical protein [Prevotella bergensis]
 gi|270333881|gb|EFA44667.1| putative capsid protein (F protein) [Prevotella bergensis DSM 17361]
Length=519

 Score = 58.9 bits (141), Expect = 7e-06, Method: Compositional matrix adjust.
 Identities = 54/207 (26%), Positives = 92/207 (44%), Gaps = 25/207 (12%)

Query  418  INSITAIDTSGGSFTLDTLNLAKKVYTMLNRIAISDGSYNAWIQTVYTSGGLNHVETPI-  476
            +N +D + G F++ +L A V +L+ + ++ ++ Y VE P
Sbjct  249  LNFPVDVDNNLGYFSVSSLRSAFAVDKLLSVTMRAGKTFQDQMRAHYG------VEIPDS  302

Query  477  ------YLGGSSMEIEFQEVVNNSGT---EDQP----LGSLAGRGVTDNHKGGVIKYKPD  523
            YLGG +++ +V SGT E +P LG +AG+G G I +
Sbjct  303  RDGRVNYLGGFDSDLQVSDVTQTSGTTATEYKPEAGYLGRIAGKGTGSGR--GRIVFDAK  360

Query  524  EPGYIFCITSITPRVDYYQGNDWDLEIETLD--DLHKPQLDGIGFQDRLYKNINSSAKRE  581
            E G + CI S+ P++ Y D ++ LD D P+ + +G Q I+S +
Sbjct  361  EHGVLMCIYSLVPQIQY-DCTRLDPMVDKLDRFDFFTPEFENLGMQPLNSSYISSFCTPD  419

Query  582  DLIKSIGKQPAWLDYMTSFNRNYGNFA  608
            +G QP + +Y T+ + N+G FA
Sbjct  420  PKNPVLGYQPRYSEYKTALDINHGQFA  446

Lambda      K        H        a         alpha
   0.318    0.136    0.405    0.792     4.96

Gapped
Lambda      K        H        a         alpha    sigma
   0.267   0.0410    0.140     1.90     42.6     43.6

Effective search space used: 5162679729312