bitscore colors: <40, 40-50, 50-80, 80-200, >200




           BLASTP 2.2.30+


Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.


Reference for composition-based statistics: Alejandro A. Schaffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.



Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
           49,011,213 sequences; 17,563,301,199 total letters



Query= Contig-27_CDS_annotation_glimmer3.pl_2_1

Length=574
                                                                      Score     E
Sequences producing significant alignments:                          (Bits)  Value

gi|649557305|gb|KDS63784.1|  capsid family protein                    86.7    1e-15
gi|492501782|ref|WP_005867318.1|  hypothetical protein                89.0    2e-15
gi|649569140|gb|KDS75238.1|  capsid family protein                    86.3    6e-15
gi|494610271|ref|WP_007368517.1|  capsid protein                      87.4    7e-15
gi|649555287|gb|KDS61824.1|  capsid family protein                    86.7    1e-14
gi|547920049|ref|WP_022322420.1|  capsid protein VP1                  85.9    2e-14
gi|647452987|ref|WP_025792807.1|  hypothetical protein                70.9    1e-09
gi|565841287|ref|WP_023924568.1|  hypothetical protein                66.6    3e-08
gi|494308783|ref|WP_007173938.1|  hypothetical protein                61.6    8e-07
gi|496521299|ref|WP_009229582.1|  capsid protein                      59.7    4e-06


>gi|649557305|gb|KDS63784.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 4]
 gi|649559156|gb|KDS65543.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 6]
Length=245

 Score = 86.7 bits (213),  Expect = 1e-15, Method: Compositional matrix adjust.
 Identities = 66/226 (29%), Positives = 107/226 (47%), Gaps = 17/226 (8%)

Query  352  ATYGIRSATLP-ESPIFCGGMQSEIAFDEIVSNSATEE-EPLGTLAGRGVATMYKSGRGL  409
            + +G+RS+    + P F GG ++ I+  E++  S+T+   P   +AG G++       G 
Sbjct  34   SHFGVRSSDARLQRPQFLGGGRTPISVSEVLQTSSTDSTSPQANMAGHGISA--GVNHGF  91

Query  410  KIKCTEPSMIMALGSITPRIDYSQG-NKWWTRLQNMDDFHKPTLDAIGFQELIteeaaaw  468
                 E   IM + SI PR  Y QG  K + +  NMD F+ P    +G QE+  EE    
Sbjct  92   TRYFEEHGYIMGIMSIRPRTGYQQGVPKDFRKFDNMD-FYFPEFAHLGEQEIKNEELYLN  150

Query  469  ntettenYKHIYSSLGKQPSWIEYTTDVNETYGEFAAGMPLAFMCLNRVYEENTDHTIGN  528
             ++          + G  P + EY    NE +G+F   M  AF  LNR+++E  +     
Sbjct  151  ESDAANE-----GTFGYTPRYAEYKYSQNEVHGDFRGNM--AFWHLNRIFKEKPNLN---  200

Query  529  ASTYIDPTIYNSIFAESRLSSQNFWVQVAFDVTARRVMSAKQIPNL  574
             +T+++    N +FA +  S   +WVQ+  D+ A R+M     P L
Sbjct  201  -TTFVECNPSNRVFATAETSDDKYWVQIYQDIKALRLMPKYGTPML  245


>gi|492501782|ref|WP_005867318.1| hypothetical protein [Parabacteroides distasonis]
 gi|409230408|gb|EKN23272.1| hypothetical protein HMPREF1059_03257 [Parabacteroides distasonis 
CL09T03C24]
Length=538

 Score = 89.0 bits (219),  Expect = 2e-15, Method: Compositional matrix adjust.
 Identities = 75/264 (28%), Positives = 124/264 (47%), Gaps = 17/264 (6%)

Query  314  VDVTDGKLTMDALILQKKIFNMLNRVAITDGTYQAWREATYGIRSATLP-ESPIFCGGMQ  372
            V+V +  ++++ L     +     R A +   Y     + +G+RS+    + P F GG +
Sbjct  289  VNVDELGVSINDLRTSNALQRWFERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGR  348

Query  373  SEIAFDEIVSNSATEE-EPLGTLAGRGVATMYKSGRGLKIKCTEPSMIMALGSITPRIDY  431
            + I+  E++  SAT+   P   +AG G++       G K    E   I+ + SI PR  Y
Sbjct  349  TPISVSEVLQTSATDSTSPQANMAGHGISA--GVNHGFKRYFEEHGYIIGIMSIRPRTGY  406

Query  432  SQG-NKWWTRLQNMDDFHKPTLDAIGFQELIteeaaawntettenYKHIYSSLGKQPSWI  490
             QG  K + +  NMD F+ P    +G QE+  EE     T  + N      + G  P + 
Sbjct  407  QQGVPKDFRKFDNMD-FYFPEFAHLGEQEIKNEEVYLQQTPASNN-----GTFGYTPRYA  460

Query  491  EYTTDVNETYGEFAAGMPLAFMCLNRVYEENTDHTIGNASTYIDPTIYNSIFAESRLSSQ  550
            EY   +NE +G+F   M  AF  LNR++ E+ +      +T+++    N +FA +  S  
Sbjct  461  EYKYSMNEVHGDFRGNM--AFWHLNRIFSESPNLN----TTFVECNPSNRVFATAETSDD  514

Query  551  NFWVQVAFDVTARRVMSAKQIPNL  574
             +W+Q+  DV A R+M     P L
Sbjct  515  KYWIQLYQDVKALRLMPKYGTPML  538


>gi|649569140|gb|KDS75238.1| capsid family protein, partial [Parabacteroides distasonis str. 
3999B T(B) 6]
Length=390

 Score = 86.3 bits (212),  Expect = 6e-15, Method: Compositional matrix adjust.
 Identities = 66/226 (29%), Positives = 107/226 (47%), Gaps = 17/226 (8%)

Query  352  ATYGIRSATLP-ESPIFCGGMQSEIAFDEIVSNSATEE-EPLGTLAGRGVATMYKSGRGL  409
            + +G+RS+    + P F GG ++ I+  E++  S+T+   P   +AG G++       G 
Sbjct  179  SHFGVRSSDARLQRPQFLGGGRTPISVSEVLQTSSTDSTSPQANMAGHGISA--GVNHGF  236

Query  410  KIKCTEPSMIMALGSITPRIDYSQG-NKWWTRLQNMDDFHKPTLDAIGFQELIteeaaaw  468
                 E   IM + SI PR  Y QG  K + +  NMD F+ P    +G QE+  EE    
Sbjct  237  TRYFEEHGYIMGIMSIRPRTGYQQGVPKDFRKFDNMD-FYFPEFAHLGEQEIKNEELYLN  295

Query  469  ntettenYKHIYSSLGKQPSWIEYTTDVNETYGEFAAGMPLAFMCLNRVYEENTDHTIGN  528
             ++          + G  P + EY    NE +G+F   M  AF  LNR+++E  +     
Sbjct  296  ESDAANE-----GTFGYTPRYAEYKYSQNEVHGDFRGNM--AFWHLNRIFKEKPNLN---  345

Query  529  ASTYIDPTIYNSIFAESRLSSQNFWVQVAFDVTARRVMSAKQIPNL  574
             +T+++    N +FA +  S   +WVQ+  D+ A R+M     P L
Sbjct  346  -TTFVECNPSNRVFATAETSDDKYWVQIYQDIKALRLMPKYGTPML  390


>gi|494610271|ref|WP_007368517.1| capsid protein [Prevotella multiformis]
 gi|324988543|gb|EGC20506.1| putative capsid protein (F protein) [Prevotella multiformis DSM 
16608]
Length=531

 Score = 87.4 bits (215),  Expect = 7e-15, Method: Compositional matrix adjust.
 Identities = 83/314 (26%), Positives = 138/314 (44%), Gaps = 55/314 (18%)

Query  301  IDGTTGGINAITAVDVTD--GKLTMDALILQKKIFNMLNRVAITDGTYQAWREATYGIRS  358
            + G +  IN ++ + V D      +D ++   +  N L+        Y +  EA +G R 
Sbjct  233  VSGASTFINGVSVLSVNDLRAAFALDKMLEATRRANGLD--------YSSQIEAHFGFR-  283

Query  359  ATLPESPI----FCGGMQSEIAFDEIVSNS-----ATEEEPLGTLAGRGVATMYKSGRGL  409
              +PES      F GG  + +   E+V+ S     A E   LG L G+GV ++  S    
Sbjct  284  --VPESRAGDARFIGGFDNPVVISEVVNQSEFDRGADESPCLGDLGGKGVGSLNSSSIDF  341

Query  410  KIKCTEPSMIMALGSITPRIDYSQGNKW--WTRLQNMDDFHKPTLDAIGFQ-----ELIt  462
             +K  E  +IM + S+ P+ +Y+ G  +  + R    +DF +P    +G+Q     +LI+
Sbjct  342  DVK--EHGIIMCIYSVVPQTEYN-GTYFDPFNRKLRREDFFQPEFADLGYQPVVTSDLIS  398

Query  463  eeaaawntettenYKHIYSS------------LGKQPSWIEYTTDVNETYGEFAAGMPLA  510
                    +  E  K + +             LG Q  + EY T  +  +GEF +G+ L+
Sbjct  399  TYLDNPVPDGPEKQKRLAAGYPLSSIEANNRLLGWQVRYNEYKTSRDLVFGEFESGLSLS  458

Query  511  FMCLNRVYEENTDHTIGN----------ASTYIDPTIYNSIFAESRLSSQNFWVQVAFDV  560
            + C  R Y+   D   G+          A  Y++P+I N+IF  S + + +F V   FDV
Sbjct  459  YWCSPR-YDFGFDGKAGDKKLVNSPWSPAHFYVNPSILNTIFLVSAVKADHFLVNSFFDV  517

Query  561  TARRVMSAKQIPNL  574
             A R MS   +  L
Sbjct  518  KAVRPMSVSGLAGL  531


>gi|649555287|gb|KDS61824.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 4]
 gi|649560568|gb|KDS66876.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 4]
 gi|649561020|gb|KDS67307.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 4]
 gi|649562724|gb|KDS68908.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 6]
Length=541

 Score = 86.7 bits (213),  Expect = 1e-14, Method: Compositional matrix adjust.
 Identities = 66/226 (29%), Positives = 107/226 (47%), Gaps = 17/226 (8%)

Query  352  ATYGIRSATLP-ESPIFCGGMQSEIAFDEIVSNSATEE-EPLGTLAGRGVATMYKSGRGL  409
            + +G+RS+    + P F GG ++ I+  E++  S+T+   P   +AG G++       G 
Sbjct  330  SHFGVRSSDARLQRPQFLGGGRTPISVSEVLQTSSTDSTSPQANMAGHGISA--GVNHGF  387

Query  410  KIKCTEPSMIMALGSITPRIDYSQG-NKWWTRLQNMDDFHKPTLDAIGFQELIteeaaaw  468
                 E   IM + SI PR  Y QG  K + +  NMD F+ P    +G QE+  EE    
Sbjct  388  TRYFEEHGYIMGIMSIRPRTGYQQGVPKDFRKFDNMD-FYFPEFAHLGEQEIKNEELYLN  446

Query  469  ntettenYKHIYSSLGKQPSWIEYTTDVNETYGEFAAGMPLAFMCLNRVYEENTDHTIGN  528
             ++          + G  P + EY    NE +G+F   M  AF  LNR+++E  +     
Sbjct  447  ESDAANE-----GTFGYTPRYAEYKYSQNEVHGDFRGNM--AFWHLNRIFKEKPNLN---  496

Query  529  ASTYIDPTIYNSIFAESRLSSQNFWVQVAFDVTARRVMSAKQIPNL  574
             +T+++    N +FA +  S   +WVQ+  D+ A R+M     P L
Sbjct  497  -TTFVECNPSNRVFATAETSDDKYWVQIYQDIKALRLMPKYGTPML  541


>gi|547920049|ref|WP_022322420.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
 gi|524592961|emb|CDD13573.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
Length=553

 Score = 85.9 bits (211),  Expect = 2e-14, Method: Compositional matrix adjust.
 Identities = 73/269 (27%), Positives = 121/269 (45%), Gaps = 17/269 (6%)

Query  309  NAITAVDVTDGKLTMDALILQKKIFNMLNRVAITDGTYQAWREATYGIRSATLP-ESPIF  367
            N    V+V +  + ++ L     +     R A     Y     + +G+RS+    + P F
Sbjct  299  NGTLKVNVDEMGININDLRTSNALQRWFERNARGGSRYIEQILSHFGVRSSDARLQRPQF  358

Query  368  CGGMQSEIAFDEIVSNSATEE-EPLGTLAGRGVATMYKSGRGLKIKCTEPSMIMALGSIT  426
             GG +  I+  E++  S+T+E  P   +AG G++    +G   K    E   I+ + SIT
Sbjct  359  LGGGRMPISVSEVLQTSSTDETSPQANMAGHGISAGINNG--FKHYFEEHGYIIGIMSIT  416

Query  427  PRIDYSQG-NKWWTRLQNMDDFHKPTLDAIGFQELIteeaaawntettenYKHIYSSLGK  485
            PR  Y QG  + +T+  NMD F+ P    +  QE+   +           Y +   + G 
Sbjct  417  PRSGYQQGVPRDFTKFDNMD-FYFPEFAHLSEQEI---KNQELFVSEDAAYNN--GTFGY  470

Query  486  QPSWIEYTTDVNETYGEFAAGMPLAFMCLNRVYEENTDHTIGNASTYIDPTIYNSIFAES  545
             P + EY    +E +G+F     L+F  LNR++E+  +      +T+++    N +FA S
Sbjct  471  TPRYAEYKYHPSEAHGDFRGN--LSFWHLNRIFEDKPNLN----TTFVECKPSNRVFATS  524

Query  546  RLSSQNFWVQVAFDVTARRVMSAKQIPNL  574
                  FWVQ+  DV A R+M     P L
Sbjct  525  ETEDDKFWVQMYQDVKALRLMPKYGTPML  553


>gi|647452987|ref|WP_025792807.1| hypothetical protein [Prevotella histicola]
Length=584

 Score = 70.9 bits (172),  Expect = 1e-09, Method: Compositional matrix adjust.
 Identities = 74/270 (27%), Positives = 118/270 (44%), Gaps = 50/270 (19%)

Query  346  YQAWREATYGIRSATLPESPI----FCGGMQSEIAFDEIVS---NSATE--EEPLGTLAG  396
            Y +  EA +G +   +PES      F GG  + I   E+VS   N+A++     +G L G
Sbjct  324  YASQIEAHFGFK---VPESRANDARFLGGFDNSIVVSEVVSTNGNAASDGSHASIGDLGG  380

Query  397  RGVATMYKSGRGLKIKCTEPSMIMALGSITPRIDYSQG-----NKWWTRLQNMDDFHKPT  451
            +G+ +M  S   ++   TE  +IM + S+ P+ +Y+       N+  TR Q    F++P 
Sbjct  381  KGIGSM--SSGTIEFDSTEHGIIMCIYSVAPQSEYNASYLDPFNRKLTREQ----FYQPE  434

Query  452  LDAIGFQELIte---eaaawntettenYKHIYSS---LGKQPSWIEYTTDVNETYGEFAA  505
               +G+Q LI      +     E    +  I  +   LG Q  + EY T  +  +G+F +
Sbjct  435  FADLGYQALIGSDLICSTLGMNEKQAGFSDIELNNNLLGYQVRYNEYKTARDLVFGDFES  494

Query  506  GMPLAFMCLNR-----------VYEENTD----HTIGNAST------YIDPTIYNSIFAE  544
            G  L++ C  R           +  EN         GN S       YI+P + N IF  
Sbjct  495  GKSLSYWCTPRFDFGYGDTEKKIAPENKGGADYRKKGNRSHWSSRNFYINPNLVNPIFLT  554

Query  545  SRLSSQNFWVQVAFDVTARRVMSAKQIPNL  574
            S + + +F V    DV A R MS   + +L
Sbjct  555  SAVQADHFIVNSFLDVKAVRPMSVTGLSSL  584


>gi|565841287|ref|WP_023924568.1| hypothetical protein [Prevotella nigrescens]
 gi|564729907|gb|ETD29851.1| hypothetical protein HMPREF1173_00033 [Prevotella nigrescens 
CC14M]
Length=656

 Score = 66.6 bits (161),  Expect = 3e-08, Method: Compositional matrix adjust.
 Identities = 82/335 (24%), Positives = 141/335 (42%), Gaps = 43/335 (13%)

Query  257  GTIELPNYNRTKVYRSSNAWFSQAGLAVKTYLSDRFNNWLNTEWIDGTTGGINAITAVDV  316
            G  ELP+Y       + N  F  A   VK  + +   + L  + +D  + G N I+ +  
Sbjct  342  GIFELPDY------INGNTGF--ATTEVKRDVVNNRGSQLEIKSMDAGSLGSNNISYISP  393

Query  317  TDGKLTMDALILQKKIFNMLNRVAITDGT-YQAWREATYGIRSATLPESPIFC----GGM  371
             D    + A+   +K   ML R    +G  Y     A +G +   +PES   C    GG 
Sbjct  394  ND----IRAMFALEK---MLERTRAANGLDYSNQIAAHFGFK---VPESRKNCASFIGGF  443

Query  372  QSEIAFDEIVSNS-------ATEEEPLGTLAGRGVATMYKSGRGLKIKCTEPSMIMALGS  424
             ++I+  E+V+ S       A+    +G + G+G+  M        +K  E  +IM + S
Sbjct  444  DNQISISEVVTTSNGSVDGTASTGSVVGQVFGKGIGAMNSGHISYDVK--EHGLIMCIYS  501

Query  425  ITPRIDY-SQGNKWWTRLQNMDDFHKPTLDAIGFQELIteeaaawntettenYKHIYSS-  482
            I P++DY ++    + R  + +D+ +P  + +G Q +I  +          +    +++ 
Sbjct  502  IAPQVDYDARELDPFNRKFSREDYFQPEFENLGMQPVIQSDLCLCINSAKSDSSDQHNNV  561

Query  483  LGKQPSWIEYTTDVNETYGEFAAGMPLAFMCLNRVYEENTDHTIGNAS---TYIDPTIYN  539
            LG    ++EY T  +  +GEF +G  L+     +    N     G  S     +DP +  
Sbjct  562  LGYSARYLEYKTARDIIFGEFMSGGSLSAWATPK---NNYTFEFGKLSLPDLLVDPKVLE  618

Query  540  SIFA---ESRLSSQNFWVQVAFDVTARRVMSAKQI  571
             IFA      +S+  F V   FDV A R M    +
Sbjct  619  PIFAVKYNGSMSTDQFLVNSYFDVKAIRPMQVNDM  653


>gi|494308783|ref|WP_007173938.1| hypothetical protein [Prevotella bergensis]
 gi|270333035|gb|EFA43821.1| putative capsid protein (F protein) [Prevotella bergensis DSM 
17361]
Length=553

 Score = 61.6 bits (148),  Expect = 8e-07, Method: Compositional matrix adjust.
 Identities = 65/250 (26%), Positives = 109/250 (44%), Gaps = 29/250 (12%)

Query  308  INAITAVDVTDGKLTMDALILQKKIFNMLNRVAITDGTYQAWREATYGIRSATLPESPI-  366
            +N     D ++G  ++ +L     +  +L+       T+Q    A YG+      +  + 
Sbjct  283  VNFGVDTDSSEGDFSVSSLRAAFAVDKLLSVTMRAGKTFQDQMRAHYGVEIPDSRDGRVN  342

Query  367  FCGGMQSEIAFDEIVSNS---ATEEEP----LGTLAGRGVATMYKSGRG-LKIKCTEPSM  418
            + GG  S++   ++   S   ATE +P    LG +AG+G      SGRG +     E  +
Sbjct  343  YLGGFDSDMQVSDVTQTSGTTATEYKPEAGYLGRVAGKGTG----SGRGRIVFDAKEHGV  398

Query  419  IMALGSITPRIDYSQGNKWWTRLQNMDD------FHKPTLDAIGFQELIteeaaawntet  472
            +M + S+ P+I Y       TRL  M D      +  P  + +G Q L +   +++ T  
Sbjct  399  LMCIYSLVPQIQYD-----CTRLDPMVDKLDRFDYFTPEFENLGMQPLNSSYISSFCTTD  453

Query  473  tenYKHIYSSLGKQPSWIEYTTDVNETYGEFAAGMPLAFMCLNRVYEENTDHTIGNASTY  532
             +N       LG QP + EY T ++  +G+FA    L+   ++R     T   +  A   
Sbjct  454  PKN-----PVLGYQPRYSEYKTALDVNHGQFAQSDALSSWSVSRFRRWTTFPQLEIADFK  508

Query  533  IDPTIYNSIF  542
            IDP   NSIF
Sbjct  509  IDPGCLNSIF  518


>gi|496521299|ref|WP_009229582.1| capsid protein [Prevotella sp. oral taxon 317]
 gi|288330570|gb|EFC69154.1| putative capsid protein (F protein) [Prevotella sp. oral taxon 
317 str. F0108]
Length=541

 Score = 59.7 bits (143),  Expect = 4e-06, Method: Compositional matrix adjust.
 Identities = 58/257 (23%), Positives = 107/257 (42%), Gaps = 32/257 (12%)

Query  302  DGTTGGINAITAVDVTDGKLTMDALILQKKIFNMLNRVAITDG-TYQAWREATYGIRSAT  360
            DG +  +N + + DV +      A  L K     L  +++  G TY    EA +G+  + 
Sbjct  268  DGNSAKLN-MASPDVLNVSAIRSAFALDK-----LLSISMRAGKTYAEQIEAHFGVTVSE  321

Query  361  LPESPIF-CGGMQSEIAFDEIVSNSATEEEP------------LGTLAGRGVATMYKSGR  407
              +  ++  GG  S +   ++   S T                LG + G+G  + Y    
Sbjct  322  GRDGQVYYLGGFDSNVQVGDVTQTSGTTNPNVSEVGNAKLAGYLGKITGKGTGSGYGE--  379

Query  408  GLKIKCTEPSMIMALGSITPRIDYSQGN-KWWTRLQNMDDFHKPTLDAIGFQELIteeaa  466
             ++    EP ++M + S+ P + Y       +   Q   D+  P  + +G Q ++    +
Sbjct  380  -IQFDAKEPGVLMCIYSVVPAMQYDCMRLDPFVAKQTRGDYFIPEFENLGMQPIVPAFVS  438

Query  467  awntettenYKHIYSSLGKQPSWIEYTTDVNETYGEFAAGMPLAFMCLNRVYEENTDHTI  526
                +         +S G QP + EY T  +  +G+FA G PL++  + R    +T +T 
Sbjct  439  LNRAKD--------NSYGWQPRYSEYKTAFDINHGQFANGEPLSYWSIARARGSDTLNTF  490

Query  527  GNASTYIDPTIYNSIFA  543
              A+  I+P   +S+FA
Sbjct  491  NVAALKINPHWLDSVFA  507



Lambda      K        H        a         alpha
   0.317    0.133    0.399    0.792     4.96 

Gapped
Lambda      K        H        a         alpha    sigma
   0.267   0.0410    0.140     1.90     42.6     43.6 

Effective search space used: 4206541246740
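
Note on the Score column: each alignment header reports a bit score followed by the raw score in parentheses, e.g. "86.7 bits (213)". The two appear to be related by the standard Karlin-Altschul normalization, S' = (lambda * S - ln K) / ln 2, using the gapped Lambda and K printed in the statistics block above. The sketch below is an illustrative check, not part of the BLAST output. The function name bit_score and the REPORTED list are introduced here for illustration only; the numeric values are copied from this report. The E-values are not recomputed, because with "Compositional matrix adjust" they vary per subject (the same 86.7 bits is reported with both 1e-15 and 1e-14 above).

```python
import math

# Gapped Karlin-Altschul parameters copied from the statistics block of this report.
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410


def bit_score(raw_score, lam=GAPPED_LAMBDA, k=GAPPED_K):
    """Convert a raw alignment score S to bits: (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


# (raw score, reported bit score) pairs taken from the alignment headers above.
REPORTED = [
    (213, 86.7), (219, 89.0), (212, 86.3), (215, 87.4), (211, 85.9),
    (172, 70.9), (161, 66.6), (148, 61.6), (143, 59.7),
]

if __name__ == "__main__":
    for raw, reported_bits in REPORTED:
        # Print the recomputed value next to the value shown in the report.
        print(f"raw={raw:4d}  computed={bit_score(raw):6.2f}  reported={reported_bits}")
```

Because Lambda and K are printed to only three significant digits, the recomputed values should be expected to match the reported bit scores only to within rounding.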