BLASTP 2.2.30+

Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A. Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402.

Reference for composition-based statistics: Alejandro A. Schaffer, L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001), "Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements", Nucleic Acids Res. 29:2994-3005.

Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF excluding environmental samples from WGS projects
49,011,213 sequences; 17,563,301,199 total letters

Query= Contig-26_CDS_annotation_glimmer3.pl_2_1
Length=567

                                                                    Score     E
Sequences producing significant alignments:                        (Bits)   Value

gi|649557305|gb|KDS63784.1|  capsid family protein                  82.4    3e-14
gi|649569140|gb|KDS75238.1|  capsid family protein                  82.8    1e-13
gi|649555287|gb|KDS61824.1|  capsid family protein                  82.8    2e-13
gi|492501782|ref|WP_005867318.1|  hypothetical protein              80.5    1e-12
gi|547920049|ref|WP_022322420.1|  capsid protein VP1                78.6    4e-12
gi|547312923|ref|WP_022044635.1|  putative uncharacterized protein  68.6    2e-09
gi|494308783|ref|WP_007173938.1|  hypothetical protein              61.6    8e-07
gi|496521299|ref|WP_009229582.1|  capsid protein                    61.6    9e-07
gi|494306153|ref|WP_007173049.1|  hypothetical protein              59.3    4e-06
gi|517172762|ref|WP_018361580.1|  hypothetical protein              59.7    4e-06

>gi|649557305|gb|KDS63784.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
gi|649559156|gb|KDS65543.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 6]
Length=245

Score = 82.4 bits (202), Expect = 3e-14, Method: Compositional matrix adjust.
Identities = 62/208 (30%), Positives = 99/208 (48%), Gaps = 13/208 (6%)

Query  366  VETPIYLGGSSMEIEFQEVVNNSGTED-QPLGSLAGRGVTDNHKGGVIKYKPDEPGYIFC  424
++ P +LGG I EV+ S T+ P ++AG G++ G +Y +E GYI
Sbjct  45   LQRPQFLGGGRTPISVSEVLQTSSTDSTSPQANMAGHGISAGVNHGFTRYF-EEHGYIMG  103

Query  425  ITSITPRVDYYQGNDWDLEIETLDDLHKPQLDGIGFQD----RLYKNINSSAKREDLIKS  480
I SI PR Y QG D D + P+ +G Q+ LY N + +A +
Sbjct  104  IMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEELYLNESDAANE----GT  159

Query  481  IGKQPAWLDYMTSFNRNYGNFALIENEGWMCLNRVFGDIDTY-TTYIQPHLYNNIFADTD  539
G P + +Y S N +G+F N + LNR+F + TT+++ + N +FA +
Sbjct  160  FGYTPRYAEYKYSQNEVHGDFR--GNMAFWHLNRIFKEKPNLNTTFVECNPSNRVFATAE  217

Query  540  VAAQNFWVQIAFNVEARRVMSAKVIPNL  567
+ +WVQI +++A R+M P L
Sbjct  218  TSDDKYWVQIYQDIKALRLMPKYGTPML  245

>gi|649569140|gb|KDS75238.1| capsid family protein, partial [Parabacteroides distasonis str. 3999B T(B) 6]
Length=390

Score = 82.8 bits (203), Expect = 1e-13, Method: Compositional matrix adjust.
Identities = 62/208 (30%), Positives = 100/208 (48%), Gaps = 13/208 (6%)

Query  366  VETPIYLGGSSMEIEFQEVVNNSGTED-QPLGSLAGRGVTDNHKGGVIKYKPDEPGYIFC  424
++ P +LGG I EV+ S T+ P ++AG G++ G +Y +E GYI
Sbjct  190  LQRPQFLGGGRTPISVSEVLQTSSTDSTSPQANMAGHGISAGVNHGFTRYF-EEHGYIMG  248

Query  425  ITSITPRVDYYQGNDWDLEIETLDDLHKPQLDGIGFQD----RLYKNINSSAKREDLIKS  480
I SI PR Y QG D D + P+ +G Q+ LY N + +A +
Sbjct  249  IMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEELYLNESDAANE----GT  304

Query  481  IGKQPAWLDYMTSFNRNYGNFALIENEGWMCLNRVFGD-IDTYTTYIQPHLYNNIFADTD  539
G P + +Y S N +G+F N + LNR+F + + TT+++ + N +FA +
Sbjct  305  FGYTPRYAEYKYSQNEVHGDFR--GNMAFWHLNRIFKEKPNLNTTFVECNPSNRVFATAE  362

Query  540  VAAQNFWVQIAFNVEARRVMSAKVIPNL  567
+ +WVQI +++A R+M P L
Sbjct  363  TSDDKYWVQIYQDIKALRLMPKYGTPML  390

>gi|649555287|gb|KDS61824.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
gi|649560568|gb|KDS66876.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
gi|649561020|gb|KDS67307.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
gi|649562724|gb|KDS68908.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 6]
Length=541

Score = 82.8 bits (203), Expect = 2e-13, Method: Compositional matrix adjust.
Identities = 62/208 (30%), Positives = 99/208 (48%), Gaps = 13/208 (6%)

Query  366  VETPIYLGGSSMEIEFQEVVNNSGTED-QPLGSLAGRGVTDNHKGGVIKYKPDEPGYIFC  424
++ P +LGG I EV+ S T+ P ++AG G++ G +Y +E GYI
Sbjct  341  LQRPQFLGGGRTPISVSEVLQTSSTDSTSPQANMAGHGISAGVNHGFTRYF-EEHGYIMG  399

Query  425  ITSITPRVDYYQGNDWDLEIETLDDLHKPQLDGIGFQD----RLYKNINSSAKREDLIKS  480
I SI PR Y QG D D + P+ +G Q+ LY N + +A +
Sbjct  400  IMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEELYLNESDAANE----GT  455

Query  481  IGKQPAWLDYMTSFNRNYGNFALIENEGWMCLNRVFGDIDTY-TTYIQPHLYNNIFADTD  539
G P + +Y S N +G+F N + LNR+F + TT+++ + N +FA +
Sbjct  456  FGYTPRYAEYKYSQNEVHGDFR--GNMAFWHLNRIFKEKPNLNTTFVECNPSNRVFATAE  513

Query  540  VAAQNFWVQIAFNVEARRVMSAKVIPNL  567
+ +WVQI +++A R+M P L
Sbjct  514  TSDDKYWVQIYQDIKALRLMPKYGTPML  541

>gi|492501782|ref|WP_005867318.1| hypothetical protein [Parabacteroides distasonis]
gi|409230408|gb|EKN23272.1| hypothetical protein HMPREF1059_03257 [Parabacteroides distasonis CL09T03C24]
Length=538

Score = 80.5 bits (197), Expect = 1e-12, Method: Compositional matrix adjust.
Identities = 78/297 (26%), Positives = 130/297 (44%), Gaps = 24/297 (8%)

Query  277  RPSCSYPLVGLALKTYQSDINTNWVNTEWLDGDSGINSITAIDTSGGSFTLDTLNLAKKV  336
RP+ + LVG AL +D +L+ D N +D G S ++ L + +
Sbjct  260  RPAGAMQLVGGALIAGGTD-------GAYLEPD---NFQVNVDELGVS--INDLRTSNAL  307

Query  337  YTMLNRIAISDGSYNAWIQTVYTSGGLN----HVETPIYLGGSSMEIEFQEVVNNSGTED  392
R A S Y I+ + + G+ ++ P +LGG I EV+ S T+
Sbjct  308  QRWFERNARSGSRY---IEQILSHFGVRSSDARLQRPQFLGGGRTPISVSEVLQTSATDS  364

Query  393  -QPLGSLAGRGVTDNHKGGVIKYKPDEPGYIFCITSITPRVDYYQGNDWDLEIETLDDLH  451
P ++AG G++ G +Y +E GYI I SI PR Y QG D D +
Sbjct  365  TSPQANMAGHGISAGVNHGFKRYF-EEHGYIIGIMSIRPRTGYQQGVPKDFRKFDNMDFY  423

Query  452  KPQLDGIGFQDRLYKNINSSAKREDLIKSIGKQPAWLDYMTSFNRNYGNFALIENEGWMC  511
P+ +G Q+ + + + G P + +Y S N +G+F N +
Sbjct  424  FPEFAHLGEQEIKNEEVYLQQTPASNNGTFGYTPRYAEYKYSMNEVHGDFR--GNMAFWH  481

Query  512  LNRVFGDIDTY-TTYIQPHLYNNIFADTDVAAQNFWVQIAFNVEARRVMSAKVIPNL  567
LNR+F + TT+++ + N +FA + + +W+Q+ +V+A R+M P L
Sbjct  482  LNRIFSESPNLNTTFVECNPSNRVFATAETSDDKYWIQLYQDVKALRLMPKYGTPML  538

>gi|547920049|ref|WP_022322420.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
gi|524592961|emb|CDD13573.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
Length=553

Score = 78.6 bits (192), Expect = 4e-12, Method: Compositional matrix adjust.
Identities = 61/205 (30%), Positives = 95/205 (46%), Gaps = 5/205 (2%)

Query  365  HVETPIYLGGSSMEIEFQEVVNNSGT-EDQPLGSLAGRGVTDNHKGGVIKYKPDEPGYIF  423
++ P +LGG M I EV+ S T E P ++AG G++ G K+ +E GYI
Sbjct  352  RLQRPQFLGGGRMPISVSEVLQTSSTDETSPQANMAGHGISAGINNG-FKHYFEEHGYII  410

Query  424  CITSITPRVDYYQGNDWDLEIETLDDLHKPQLDGIGFQDRLYKNINSSAKREDLIKSIGK  483
I SITPR Y QG D D + P+ + Q+ + + S + G
Sbjct  411  GIMSITPRSGYQQGVPRDFTKFDNMDFYFPEFAHLSEQEIKNQELFVSEDAAYNNGTFGY  470

Query  484  QPAWLDYMTSFNRNYGNFALIENEGWMCLNRVFGDIDTY-TTYIQPHLYNNIFADTDVAA  542
P + +Y + +G+F N + LNR+F D TT+++ N +FA ++
Sbjct  471  TPRYAEYKYHPSEAHGDFR--GNLSFWHLNRIFEDKPNLNTTFVECKPSNRVFATSETED  528

Query  543  QNFWVQIAFNVEARRVMSAKVIPNL  567
FWVQ+ +V+A R+M P L
Sbjct  529  DKFWVQMYQDVKALRLMPKYGTPML  553

>gi|547312923|ref|WP_022044635.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
gi|524208404|emb|CCZ76639.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
Length=338

Score = 68.6 bits (166), Expect = 2e-09, Method: Compositional matrix adjust.
Identities = 79/305 (26%), Positives = 123/305 (40%), Gaps = 53/305 (17%)

Query  312  INSITAID---TSGGSFTLDTLNLAKKVYTMLNRIAISDGSYNAWIQTVY-TSGGLNHVE  367
I + A+D ++G S + L L K+ ++R+ +S G +T++ T +V
Sbjct  36   IEVMNALDLNISTGFSVAVPELRLRTKIQNWMDRLFVSGGRVGDVFRTLWGTKSSAIYVN  95

Query  368  TPIYLGGSSMEIE---FQEVVNNSGT-EDQPLGSLAGRGVTD------NHKGGVIKYKPD  417
P +LG I + + N S + ED LG LA D H G I Y
Sbjct  96   KPDFLGVWQASINPSNVRAMANGSASGEDANLGQLAA--CVDRYCDFSGHSG--IDYYAK  151

Query  418  EPGYIFCITSITPRVDYYQGNDWDLEIETLDDLHKPQLDGIGFQ-------DRLYKNINS  470
EPG IT + P Y QG DL + D P+L+GIGFQ + + N
Sbjct  152  EPGTFMLITMLVPEPAYSQGLHPDLASISFGDDFNPELNGIGFQLVPRHRFSMMPRGFNF  211

Query  471  SAKREDL----------------IKSIGKQPAWLDYMTSFNRNYGNFALIENEGWMCLNR  514
+ ++ + S+G++ AW T ++R +G+FA N + L R
Sbjct  212  TGLDQEASPWFGHTGTGVLVDPNMVSVGEEVAWSWLRTDYSRLHGDFAQNGNYQYWVLTR  271

Query  515  VF------------GDIDTYTTYIQPHLYNNIFADTDVAAQNFWVQIAFNVEARRVMSAK  562
F D + TYI P + +F D + A NF F++ +SA
Sbjct  272  RFTTYFPDDGTGFYQDGEYTGTYINPLDWQYVFVDQTLMAGNFAYYGTFDLNVTSSLSAN  331

Query  563  VIPNL  567
+P L
Sbjct  332  YMPYL  336

>gi|494308783|ref|WP_007173938.1| hypothetical protein [Prevotella bergensis]
gi|270333035|gb|EFA43821.1| putative capsid protein (F protein) [Prevotella bergensis DSM 17361]
Length=553

Score = 61.6 bits (148), Expect = 8e-07, Method: Compositional matrix adjust.
Identities = 61/238 (26%), Positives = 103/238 (43%), Gaps = 30/238 (13%)

Query  319  DTSGGSFTLDTLNLAKKVYTMLNRIAISDGSYNAWIQTVYTSGGLNHVETPI-------Y  371
D+S G F++ +L A V +L+ + ++ ++ Y VE P Y
Sbjct  290  DSSEGDFSVSSLRAAFAVDKLLSVTMRAGKTFQDQMRAHYG------VEIPDSRDGRVNY  343

Query  372  LGGSSMEIEFQEVVNNSGT---EDQP----LGSLAGRGVTDNHKGGVIKYKPDEPGYIFC  424
LGG +++ +V SGT E +P LG +AG+G G I + E G + C
Sbjct  344  LGGFDSDMQVSDVTQTSGTTATEYKPEAGYLGRVAGKGTGSGR--GRIVFDAKEHGVLMC  401

Query  425  ITSITPRVDYYQGNDWDLEIETLD--DLHKPQLDGIGFQDRLYKNINSSAKREDLIKSIG  482
I S+ P++ Y D ++ LD D P+ + +G Q I+S + +G
Sbjct  402  IYSLVPQIQY-DCTRLDPMVDKLDRFDYFTPEFENLGMQPLNSSYISSFCTTDPKNPVLG  460

Query  483  KQPAWLDYMTSFNRNYGNFALIENEGWMCLNR-----VFGDIDTYTTYIQPHLYNNIF  535
QP + +Y T+ + N+G FA + ++R F ++ I P N+IF
Sbjct  461  YQPRYSEYKTALDVNHGQFAQSDALSSWSVSRFRRWTTFPQLEIADFKIDPGCLNSIF  518

>gi|496521299|ref|WP_009229582.1| capsid protein [Prevotella sp. oral taxon 317]
gi|288330570|gb|EFC69154.1| putative capsid protein (F protein) [Prevotella sp. oral taxon 317 str. F0108]
Length=541

Score = 61.6 bits (148), Expect = 9e-07, Method: Compositional matrix adjust.
Identities = 58/220 (26%), Positives = 98/220 (45%), Gaps = 30/220 (14%)

Query  371  YLGGSSMEIEFQEVVNNSGTEDQP------------LGSLAGRGVTDNHKGGVIKYKPDE  418
YLGG ++ +V SGT + LG + G+G + G I++ E
Sbjct  329  YLGGFDSNVQVGDVTQTSGTTNPNVSEVGNAKLAGYLGKITGKGTGSGY--GEIQFDAKE  386

Query  419  PGYIFCITSITPRVDY-YQGNDWDLEIETLDDLHKPQLDGIGFQDRLYKNINSSAKREDL  477
PG + CI S+ P + Y D + +T D P+ + +G Q + ++ + +++
Sbjct  387  PGVLMCIYSVVPAMQYDCMRLDPFVAKQTRGDYFIPEFENLGMQPIVPAFVSLNRAKDN-  445

Query  478  IKSIGKQPAWLDYMTSFNRNYGNFALIENEGWMCLNRVFGDIDTYTTY------IQPHLY  531
S G QP + +Y T+F+ N+G FA E + + R G DT T+ I PH
Sbjct  446  --SYGWQPRYSEYKTAFDINHGQFANGEPLSYWSIARARGS-DTLNTFNVAALKINPHWL  502

Query  532  NNIFA----DTDVAAQNFWVQIAFNVEARRVMSAKVIPNL  567
+++FA T+V F FN+E M+ +P +
Sbjct  503  DSVFAVNYNGTEVTDCMFGYA-HFNIEKVSDMTEDGMPRV  541

>gi|494306153|ref|WP_007173049.1| hypothetical protein [Prevotella bergensis]
gi|270333881|gb|EFA44667.1| putative capsid protein (F protein) [Prevotella bergensis DSM 17361]
Length=519

Score = 59.3 bits (142), Expect = 4e-06, Method: Compositional matrix adjust.
Identities = 54/207 (26%), Positives = 92/207 (44%), Gaps = 25/207 (12%)

Query  312  INSITAIDTSGGSFTLDTLNLAKKVYTMLNRIAISDGSYNAWIQTVYTSGGLNHVETPI-  370
+N +D + G F++ +L A V +L+ + ++ ++ Y VE P
Sbjct  249  LNFPVDVDNNLGYFSVSSLRSAFAVDKLLSVTMRAGKTFQDQMRAHYG------VEIPDS  302

Query  371  ------YLGGSSMEIEFQEVVNNSGT---EDQP----LGSLAGRGVTDNHKGGVIKYKPD  417
YLGG +++ +V SGT E +P LG +AG+G G I +
Sbjct  303  RDGRVNYLGGFDSDLQVSDVTQTSGTTATEYKPEAGYLGRIAGKGTGSGR--GRIVFDAK  360

Query  418  EPGYIFCITSITPRVDYYQGNDWDLEIETLD--DLHKPQLDGIGFQDRLYKNINSSAKRE  475
E G + CI S+ P++ Y D ++ LD D P+ + +G Q I+S +
Sbjct  361  EHGVLMCIYSLVPQIQY-DCTRLDPMVDKLDRFDFFTPEFENLGMQPLNSSYISSFCTPD  419

Query  476  DLIKSIGKQPAWLDYMTSFNRNYGNFA  502
+G QP + +Y T+ + N+G FA
Sbjct  420  PKNPVLGYQPRYSEYKTALDINHGQFA  446

>gi|517172762|ref|WP_018361580.1| hypothetical protein [Prevotella nanceiensis]
Length=568

Score = 59.7 bits (143), Expect = 4e-06, Method: Compositional matrix adjust.
Identities = 49/185 (26%), Positives = 81/185 (44%), Gaps = 20/185 (11%)

Query  371  YLGGSSMEIEFQEVVNNSGT-----EDQPLGSLAGR--GVTDNHKGGVIKYKPDEPGYIF  423
Y+GG I+ +V +SGT +D G GR G G I++ E G +
Sbjct  351  YIGGFDSNIQVGDVTQSSGTTVTGTKDTSFGGYLGRTTGKATGSGSGHIRFDAKEHGILM  410

Query  424  CITSITPRVDYYQGNDWDLEIETLD--DLHKPQLDGIGFQDRLYKNI------NSSAKRE  475
CI S+ P V Y D ++ ++ D P+ + +G Q KNI N++ R
Sbjct  411  CIYSLVPDVQY-DSKRVDPFVQKIERGDFFVPEFENLGMQPLFAKNISYKYNNNTANSRI  469

Query  476  DLIKSIGKQPAWLDYMTSFNRNYGNFALIENEGWMCLNRVFGD----IDTYTTYIQPHLY  531
+ + G QP + +Y T+ + N+G F E + + R G+ + T I P
Sbjct  470  KNLGAFGWQPRYSEYKTALDINHGQFVHQEPLSYWTVARARGESMSNFNISTFKINPKWL  529

Query  532  NNIFA  536
+++FA
Sbjct  530  DDVFA  534

Lambda    K        H        a        alpha
0.317     0.136    0.403    0.792    4.96

Gapped
Lambda    K        H        a        alpha    sigma
0.267     0.0410   0.140    1.90     42.6     43.6

Effective search space used: 4166738442540