bitscore colors: <40, 40-50, 50-80, 80-200, >200




           BLASTP 2.2.30+


Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.


Reference for composition-based statistics: Alejandro A. Schaffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.



Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
           49,011,213 sequences; 17,563,301,199 total letters



Query= Contig-26_CDS_annotation_glimmer3.pl_2_1

Length=567
                                                                      Score     E
Sequences producing significant alignments:                          (Bits)  Value

gi|649557305|gb|KDS63784.1|  capsid family protein                    82.4    3e-14
gi|649569140|gb|KDS75238.1|  capsid family protein                    82.8    1e-13
gi|649555287|gb|KDS61824.1|  capsid family protein                    82.8    2e-13
gi|492501782|ref|WP_005867318.1|  hypothetical protein                80.5    1e-12
gi|547920049|ref|WP_022322420.1|  capsid protein VP1                  78.6    4e-12
gi|547312923|ref|WP_022044635.1|  putative uncharacterized protein    68.6    2e-09
gi|494308783|ref|WP_007173938.1|  hypothetical protein                61.6    8e-07
gi|496521299|ref|WP_009229582.1|  capsid protein                      61.6    9e-07
gi|494306153|ref|WP_007173049.1|  hypothetical protein                59.3    4e-06
gi|517172762|ref|WP_018361580.1|  hypothetical protein                59.7    4e-06


>gi|649557305|gb|KDS63784.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 4]
 gi|649559156|gb|KDS65543.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 6]
Length=245

 Score = 82.4 bits (202),  Expect = 3e-14, Method: Compositional matrix adjust.
 Identities = 62/208 (30%), Positives = 99/208 (48%), Gaps = 13/208 (6%)

Query  366  VETPIYLGGSSMEIEFQEVVNNSGTED-QPLGSLAGRGVTDNHKGGVIKYKPDEPGYIFC  424
            ++ P +LGG    I   EV+  S T+   P  ++AG G++     G  +Y  +E GYI  
Sbjct  45   LQRPQFLGGGRTPISVSEVLQTSSTDSTSPQANMAGHGISAGVNHGFTRYF-EEHGYIMG  103

Query  425  ITSITPRVDYYQGNDWDLEIETLDDLHKPQLDGIGFQD----RLYKNINSSAKREDLIKS  480
            I SI PR  Y QG   D       D + P+   +G Q+     LY N + +A       +
Sbjct  104  IMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEELYLNESDAANE----GT  159

Query  481  IGKQPAWLDYMTSFNRNYGNFALIENEGWMCLNRVFGDIDTY-TTYIQPHLYNNIFADTD  539
             G  P + +Y  S N  +G+F    N  +  LNR+F +     TT+++ +  N +FA  +
Sbjct  160  FGYTPRYAEYKYSQNEVHGDFR--GNMAFWHLNRIFKEKPNLNTTFVECNPSNRVFATAE  217

Query  540  VAAQNFWVQIAFNVEARRVMSAKVIPNL  567
             +   +WVQI  +++A R+M     P L
Sbjct  218  TSDDKYWVQIYQDIKALRLMPKYGTPML  245


>gi|649569140|gb|KDS75238.1| capsid family protein, partial [Parabacteroides distasonis str. 
3999B T(B) 6]
Length=390

 Score = 82.8 bits (203),  Expect = 1e-13, Method: Compositional matrix adjust.
 Identities = 62/208 (30%), Positives = 100/208 (48%), Gaps = 13/208 (6%)

Query  366  VETPIYLGGSSMEIEFQEVVNNSGTED-QPLGSLAGRGVTDNHKGGVIKYKPDEPGYIFC  424
            ++ P +LGG    I   EV+  S T+   P  ++AG G++     G  +Y  +E GYI  
Sbjct  190  LQRPQFLGGGRTPISVSEVLQTSSTDSTSPQANMAGHGISAGVNHGFTRYF-EEHGYIMG  248

Query  425  ITSITPRVDYYQGNDWDLEIETLDDLHKPQLDGIGFQD----RLYKNINSSAKREDLIKS  480
            I SI PR  Y QG   D       D + P+   +G Q+     LY N + +A       +
Sbjct  249  IMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEELYLNESDAANE----GT  304

Query  481  IGKQPAWLDYMTSFNRNYGNFALIENEGWMCLNRVFGD-IDTYTTYIQPHLYNNIFADTD  539
             G  P + +Y  S N  +G+F    N  +  LNR+F +  +  TT+++ +  N +FA  +
Sbjct  305  FGYTPRYAEYKYSQNEVHGDFR--GNMAFWHLNRIFKEKPNLNTTFVECNPSNRVFATAE  362

Query  540  VAAQNFWVQIAFNVEARRVMSAKVIPNL  567
             +   +WVQI  +++A R+M     P L
Sbjct  363  TSDDKYWVQIYQDIKALRLMPKYGTPML  390


>gi|649555287|gb|KDS61824.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 4]
 gi|649560568|gb|KDS66876.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 4]
 gi|649561020|gb|KDS67307.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 4]
 gi|649562724|gb|KDS68908.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 6]
Length=541

 Score = 82.8 bits (203),  Expect = 2e-13, Method: Compositional matrix adjust.
 Identities = 62/208 (30%), Positives = 99/208 (48%), Gaps = 13/208 (6%)

Query  366  VETPIYLGGSSMEIEFQEVVNNSGTED-QPLGSLAGRGVTDNHKGGVIKYKPDEPGYIFC  424
            ++ P +LGG    I   EV+  S T+   P  ++AG G++     G  +Y  +E GYI  
Sbjct  341  LQRPQFLGGGRTPISVSEVLQTSSTDSTSPQANMAGHGISAGVNHGFTRYF-EEHGYIMG  399

Query  425  ITSITPRVDYYQGNDWDLEIETLDDLHKPQLDGIGFQD----RLYKNINSSAKREDLIKS  480
            I SI PR  Y QG   D       D + P+   +G Q+     LY N + +A       +
Sbjct  400  IMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEELYLNESDAANE----GT  455

Query  481  IGKQPAWLDYMTSFNRNYGNFALIENEGWMCLNRVFGDIDTY-TTYIQPHLYNNIFADTD  539
             G  P + +Y  S N  +G+F    N  +  LNR+F +     TT+++ +  N +FA  +
Sbjct  456  FGYTPRYAEYKYSQNEVHGDFR--GNMAFWHLNRIFKEKPNLNTTFVECNPSNRVFATAE  513

Query  540  VAAQNFWVQIAFNVEARRVMSAKVIPNL  567
             +   +WVQI  +++A R+M     P L
Sbjct  514  TSDDKYWVQIYQDIKALRLMPKYGTPML  541


>gi|492501782|ref|WP_005867318.1| hypothetical protein [Parabacteroides distasonis]
 gi|409230408|gb|EKN23272.1| hypothetical protein HMPREF1059_03257 [Parabacteroides distasonis 
CL09T03C24]
Length=538

 Score = 80.5 bits (197),  Expect = 1e-12, Method: Compositional matrix adjust.
 Identities = 78/297 (26%), Positives = 130/297 (44%), Gaps = 24/297 (8%)

Query  277  RPSCSYPLVGLALKTYQSDINTNWVNTEWLDGDSGINSITAIDTSGGSFTLDTLNLAKKV  336
            RP+ +  LVG AL    +D         +L+ D   N    +D  G S  ++ L  +  +
Sbjct  260  RPAGAMQLVGGALIAGGTD-------GAYLEPD---NFQVNVDELGVS--INDLRTSNAL  307

Query  337  YTMLNRIAISDGSYNAWIQTVYTSGGLN----HVETPIYLGGSSMEIEFQEVVNNSGTED  392
                 R A S   Y   I+ + +  G+      ++ P +LGG    I   EV+  S T+ 
Sbjct  308  QRWFERNARSGSRY---IEQILSHFGVRSSDARLQRPQFLGGGRTPISVSEVLQTSATDS  364

Query  393  -QPLGSLAGRGVTDNHKGGVIKYKPDEPGYIFCITSITPRVDYYQGNDWDLEIETLDDLH  451
              P  ++AG G++     G  +Y  +E GYI  I SI PR  Y QG   D       D +
Sbjct  365  TSPQANMAGHGISAGVNHGFKRYF-EEHGYIIGIMSIRPRTGYQQGVPKDFRKFDNMDFY  423

Query  452  KPQLDGIGFQDRLYKNINSSAKREDLIKSIGKQPAWLDYMTSFNRNYGNFALIENEGWMC  511
             P+   +G Q+   + +           + G  P + +Y  S N  +G+F    N  +  
Sbjct  424  FPEFAHLGEQEIKNEEVYLQQTPASNNGTFGYTPRYAEYKYSMNEVHGDFR--GNMAFWH  481

Query  512  LNRVFGDIDTY-TTYIQPHLYNNIFADTDVAAQNFWVQIAFNVEARRVMSAKVIPNL  567
            LNR+F +     TT+++ +  N +FA  + +   +W+Q+  +V+A R+M     P L
Sbjct  482  LNRIFSESPNLNTTFVECNPSNRVFATAETSDDKYWIQLYQDVKALRLMPKYGTPML  538


>gi|547920049|ref|WP_022322420.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
 gi|524592961|emb|CDD13573.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
Length=553

 Score = 78.6 bits (192),  Expect = 4e-12, Method: Compositional matrix adjust.
 Identities = 61/205 (30%), Positives = 95/205 (46%), Gaps = 5/205 (2%)

Query  365  HVETPIYLGGSSMEIEFQEVVNNSGT-EDQPLGSLAGRGVTDNHKGGVIKYKPDEPGYIF  423
             ++ P +LGG  M I   EV+  S T E  P  ++AG G++     G  K+  +E GYI 
Sbjct  352  RLQRPQFLGGGRMPISVSEVLQTSSTDETSPQANMAGHGISAGINNG-FKHYFEEHGYII  410

Query  424  CITSITPRVDYYQGNDWDLEIETLDDLHKPQLDGIGFQDRLYKNINSSAKREDLIKSIGK  483
             I SITPR  Y QG   D       D + P+   +  Q+   + +  S        + G 
Sbjct  411  GIMSITPRSGYQQGVPRDFTKFDNMDFYFPEFAHLSEQEIKNQELFVSEDAAYNNGTFGY  470

Query  484  QPAWLDYMTSFNRNYGNFALIENEGWMCLNRVFGDIDTY-TTYIQPHLYNNIFADTDVAA  542
             P + +Y    +  +G+F    N  +  LNR+F D     TT+++    N +FA ++   
Sbjct  471  TPRYAEYKYHPSEAHGDFR--GNLSFWHLNRIFEDKPNLNTTFVECKPSNRVFATSETED  528

Query  543  QNFWVQIAFNVEARRVMSAKVIPNL  567
              FWVQ+  +V+A R+M     P L
Sbjct  529  DKFWVQMYQDVKALRLMPKYGTPML  553


>gi|547312923|ref|WP_022044635.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
 gi|524208404|emb|CCZ76639.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
Length=338

 Score = 68.6 bits (166),  Expect = 2e-09, Method: Compositional matrix adjust.
 Identities = 79/305 (26%), Positives = 123/305 (40%), Gaps = 53/305 (17%)

Query  312  INSITAID---TSGGSFTLDTLNLAKKVYTMLNRIAISDGSYNAWIQTVY-TSGGLNHVE  367
            I  + A+D   ++G S  +  L L  K+   ++R+ +S G      +T++ T     +V 
Sbjct  36   IEVMNALDLNISTGFSVAVPELRLRTKIQNWMDRLFVSGGRVGDVFRTLWGTKSSAIYVN  95

Query  368  TPIYLGGSSMEIE---FQEVVNNSGT-EDQPLGSLAGRGVTD------NHKGGVIKYKPD  417
             P +LG     I     + + N S + ED  LG LA     D       H G  I Y   
Sbjct  96   KPDFLGVWQASINPSNVRAMANGSASGEDANLGQLAA--CVDRYCDFSGHSG--IDYYAK  151

Query  418  EPGYIFCITSITPRVDYYQGNDWDLEIETLDDLHKPQLDGIGFQ-------DRLYKNINS  470
            EPG    IT + P   Y QG   DL   +  D   P+L+GIGFQ         + +  N 
Sbjct  152  EPGTFMLITMLVPEPAYSQGLHPDLASISFGDDFNPELNGIGFQLVPRHRFSMMPRGFNF  211

Query  471  SAKREDL----------------IKSIGKQPAWLDYMTSFNRNYGNFALIENEGWMCLNR  514
            +   ++                 + S+G++ AW    T ++R +G+FA   N  +  L R
Sbjct  212  TGLDQEASPWFGHTGTGVLVDPNMVSVGEEVAWSWLRTDYSRLHGDFAQNGNYQYWVLTR  271

Query  515  VF------------GDIDTYTTYIQPHLYNNIFADTDVAAQNFWVQIAFNVEARRVMSAK  562
             F             D +   TYI P  +  +F D  + A NF     F++     +SA 
Sbjct  272  RFTTYFPDDGTGFYQDGEYTGTYINPLDWQYVFVDQTLMAGNFAYYGTFDLNVTSSLSAN  331

Query  563  VIPNL  567
             +P L
Sbjct  332  YMPYL  336


>gi|494308783|ref|WP_007173938.1| hypothetical protein [Prevotella bergensis]
 gi|270333035|gb|EFA43821.1| putative capsid protein (F protein) [Prevotella bergensis DSM 
17361]
Length=553

 Score = 61.6 bits (148),  Expect = 8e-07, Method: Compositional matrix adjust.
 Identities = 61/238 (26%), Positives = 103/238 (43%), Gaps = 30/238 (13%)

Query  319  DTSGGSFTLDTLNLAKKVYTMLNRIAISDGSYNAWIQTVYTSGGLNHVETPI-------Y  371
            D+S G F++ +L  A  V  +L+    +  ++   ++  Y       VE P        Y
Sbjct  290  DSSEGDFSVSSLRAAFAVDKLLSVTMRAGKTFQDQMRAHYG------VEIPDSRDGRVNY  343

Query  372  LGGSSMEIEFQEVVNNSGT---EDQP----LGSLAGRGVTDNHKGGVIKYKPDEPGYIFC  424
            LGG   +++  +V   SGT   E +P    LG +AG+G       G I +   E G + C
Sbjct  344  LGGFDSDMQVSDVTQTSGTTATEYKPEAGYLGRVAGKGTGSGR--GRIVFDAKEHGVLMC  401

Query  425  ITSITPRVDYYQGNDWDLEIETLD--DLHKPQLDGIGFQDRLYKNINSSAKREDLIKSIG  482
            I S+ P++ Y      D  ++ LD  D   P+ + +G Q      I+S    +     +G
Sbjct  402  IYSLVPQIQY-DCTRLDPMVDKLDRFDYFTPEFENLGMQPLNSSYISSFCTTDPKNPVLG  460

Query  483  KQPAWLDYMTSFNRNYGNFALIENEGWMCLNR-----VFGDIDTYTTYIQPHLYNNIF  535
             QP + +Y T+ + N+G FA  +      ++R      F  ++     I P   N+IF
Sbjct  461  YQPRYSEYKTALDVNHGQFAQSDALSSWSVSRFRRWTTFPQLEIADFKIDPGCLNSIF  518


>gi|496521299|ref|WP_009229582.1| capsid protein [Prevotella sp. oral taxon 317]
 gi|288330570|gb|EFC69154.1| putative capsid protein (F protein) [Prevotella sp. oral taxon 
317 str. F0108]
Length=541

 Score = 61.6 bits (148),  Expect = 9e-07, Method: Compositional matrix adjust.
 Identities = 58/220 (26%), Positives = 98/220 (45%), Gaps = 30/220 (14%)

Query  371  YLGGSSMEIEFQEVVNNSGTEDQP------------LGSLAGRGVTDNHKGGVIKYKPDE  418
            YLGG    ++  +V   SGT +              LG + G+G    +  G I++   E
Sbjct  329  YLGGFDSNVQVGDVTQTSGTTNPNVSEVGNAKLAGYLGKITGKGTGSGY--GEIQFDAKE  386

Query  419  PGYIFCITSITPRVDY-YQGNDWDLEIETLDDLHKPQLDGIGFQDRLYKNINSSAKREDL  477
            PG + CI S+ P + Y     D  +  +T  D   P+ + +G Q  +   ++ +  +++ 
Sbjct  387  PGVLMCIYSVVPAMQYDCMRLDPFVAKQTRGDYFIPEFENLGMQPIVPAFVSLNRAKDN-  445

Query  478  IKSIGKQPAWLDYMTSFNRNYGNFALIENEGWMCLNRVFGDIDTYTTY------IQPHLY  531
              S G QP + +Y T+F+ N+G FA  E   +  + R  G  DT  T+      I PH  
Sbjct  446  --SYGWQPRYSEYKTAFDINHGQFANGEPLSYWSIARARGS-DTLNTFNVAALKINPHWL  502

Query  532  NNIFA----DTDVAAQNFWVQIAFNVEARRVMSAKVIPNL  567
            +++FA     T+V    F     FN+E    M+   +P +
Sbjct  503  DSVFAVNYNGTEVTDCMFGYA-HFNIEKVSDMTEDGMPRV  541


>gi|494306153|ref|WP_007173049.1| hypothetical protein [Prevotella bergensis]
 gi|270333881|gb|EFA44667.1| putative capsid protein (F protein) [Prevotella bergensis DSM 
17361]
Length=519

 Score = 59.3 bits (142),  Expect = 4e-06, Method: Compositional matrix adjust.
 Identities = 54/207 (26%), Positives = 92/207 (44%), Gaps = 25/207 (12%)

Query  312  INSITAIDTSGGSFTLDTLNLAKKVYTMLNRIAISDGSYNAWIQTVYTSGGLNHVETPI-  370
            +N    +D + G F++ +L  A  V  +L+    +  ++   ++  Y       VE P  
Sbjct  249  LNFPVDVDNNLGYFSVSSLRSAFAVDKLLSVTMRAGKTFQDQMRAHYG------VEIPDS  302

Query  371  ------YLGGSSMEIEFQEVVNNSGT---EDQP----LGSLAGRGVTDNHKGGVIKYKPD  417
                  YLGG   +++  +V   SGT   E +P    LG +AG+G       G I +   
Sbjct  303  RDGRVNYLGGFDSDLQVSDVTQTSGTTATEYKPEAGYLGRIAGKGTGSGR--GRIVFDAK  360

Query  418  EPGYIFCITSITPRVDYYQGNDWDLEIETLD--DLHKPQLDGIGFQDRLYKNINSSAKRE  475
            E G + CI S+ P++ Y      D  ++ LD  D   P+ + +G Q      I+S    +
Sbjct  361  EHGVLMCIYSLVPQIQY-DCTRLDPMVDKLDRFDFFTPEFENLGMQPLNSSYISSFCTPD  419

Query  476  DLIKSIGKQPAWLDYMTSFNRNYGNFA  502
                 +G QP + +Y T+ + N+G FA
Sbjct  420  PKNPVLGYQPRYSEYKTALDINHGQFA  446


>gi|517172762|ref|WP_018361580.1| hypothetical protein [Prevotella nanceiensis]
Length=568

 Score = 59.7 bits (143),  Expect = 4e-06, Method: Compositional matrix adjust.
 Identities = 49/185 (26%), Positives = 81/185 (44%), Gaps = 20/185 (11%)

Query  371  YLGGSSMEIEFQEVVNNSGT-----EDQPLGSLAGR--GVTDNHKGGVIKYKPDEPGYIF  423
            Y+GG    I+  +V  +SGT     +D   G   GR  G       G I++   E G + 
Sbjct  351  YIGGFDSNIQVGDVTQSSGTTVTGTKDTSFGGYLGRTTGKATGSGSGHIRFDAKEHGILM  410

Query  424  CITSITPRVDYYQGNDWDLEIETLD--DLHKPQLDGIGFQDRLYKNI------NSSAKRE  475
            CI S+ P V Y      D  ++ ++  D   P+ + +G Q    KNI      N++  R 
Sbjct  411  CIYSLVPDVQY-DSKRVDPFVQKIERGDFFVPEFENLGMQPLFAKNISYKYNNNTANSRI  469

Query  476  DLIKSIGKQPAWLDYMTSFNRNYGNFALIENEGWMCLNRVFGD----IDTYTTYIQPHLY  531
              + + G QP + +Y T+ + N+G F   E   +  + R  G+     +  T  I P   
Sbjct  470  KNLGAFGWQPRYSEYKTALDINHGQFVHQEPLSYWTVARARGESMSNFNISTFKINPKWL  529

Query  532  NNIFA  536
            +++FA
Sbjct  530  DDVFA  534



Lambda      K        H        a         alpha
   0.317    0.136    0.403    0.792     4.96 

Gapped
Lambda      K        H        a         alpha    sigma
   0.267   0.0410    0.140     1.90     42.6     43.6 

Effective search space used: 4166738442540
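
Note on the scores: the bit scores reported above appear to follow from the raw scores (the numbers in parentheses) through the standard Karlin-Altschul conversion S' = (Lambda * S - ln K) / ln 2, using the Lambda and K values from the "Gapped" block. The sketch below is only an illustration under that assumption; the function name bit_score and the constant names are hypothetical, and because Lambda and K are printed to only three significant figures, results may differ from the report in the last digit.

```python
import math

# Gapped Karlin-Altschul parameters copied from the statistics block of this
# report (rounded as printed, so results are approximate).
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410


def bit_score(raw_score, lam=GAPPED_LAMBDA, k=GAPPED_K):
    """Convert a raw alignment score to a bit score.

    Uses the standard conversion S' = (lambda * S - ln K) / ln 2.
    """
    return (lam * raw_score - math.log(k)) / math.log(2)


if __name__ == "__main__":
    # Raw scores taken from the "Score = ... bits (N)" lines of this report.
    for raw in (202, 203, 197, 192, 166, 148, 143, 142):
        print(f"raw score {raw:>3} -> {bit_score(raw):.1f} bits")
```

The E-values are not reproduced here: with "Compositional matrix adjust" the statistics are adjusted per subject sequence, so they cannot be recovered from the effective search space and the bit score alone.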