bitscore colors: <40, 40-50, 50-80, 80-200, >200




           BLASTP 2.2.30+


Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.


Reference for composition-based statistics: Alejandro A. Schaffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.



Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
           49,011,213 sequences; 17,563,301,199 total letters





Query= Contig-31_CDS_annotation_glimmer3.pl_2_2

Length=322
                                                                      Score     E
Sequences producing significant alignments:                          (Bits)  Value

gi|649557305|gb|KDS63784.1|  capsid family protein                    83.6    1e-15
gi|492501782|ref|WP_005867318.1|  hypothetical protein                84.7    6e-15
gi|649569140|gb|KDS75238.1|  capsid family protein                    83.6    8e-15
gi|649555287|gb|KDS61824.1|  capsid family protein                    83.2    2e-14
gi|547920049|ref|WP_022322420.1|  capsid protein VP1                  82.0    4e-14
gi|494610271|ref|WP_007368517.1|  capsid protein                      81.6    6e-14
gi|494308783|ref|WP_007173938.1|  hypothetical protein                75.5    7e-12
gi|494306153|ref|WP_007173049.1|  hypothetical protein                74.7    1e-11
gi|609718276|emb|CDN73650.1|  conserved hypothetical protein          74.3    2e-11
gi|496521299|ref|WP_009229582.1|  capsid protein                      71.6    1e-10


>gi|649557305|gb|KDS63784.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 4]
 gi|649559156|gb|KDS65543.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 6]
Length=245

 Score = 83.6 bits (205),  Expect = 1e-15, Method: Compositional matrix adjust.
 Identities = 71/246 (29%), Positives = 113/246 (46%), Gaps = 20/246 (8%)

Query  79   LNRIAVSGGSYRDWQEAVFGVRVSRA-AESPIYVGGYASEIVFDEVVSTAAFESGETGQE  137
              R A SG  Y +   + FGVR S A  + P ++GG  + I   EV+ T++ +S      
Sbjct  18   FERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSEVLQTSSTDS----TS  73

Query  138  PLGSLAGRGRETSKRGGKNIKIRCEEPSLIMILGSIVPRVDYSQG-NKWWTRVDTMNDFH  196
            P  ++AG G       G       EE   IM + SI PR  Y QG  K + + D M DF+
Sbjct  74   PQANMAGHGISAGVNHG--FTRYFEEHGYIMGIMSIRPRTGYQQGVPKDFRKFDNM-DFY  130

Query  197  KPNLDQIGFQELLSQEMHGRAWRVNANYKTTDFSVGKQPAWTEYTTTVNETYGDFAAGEP  256
             P    +G QE+ ++E++     +N +    + + G  P + EY  + NE +GDF     
Sbjct  131  FPEFAHLGEQEIKNEELY-----LNESDAANEGTFGYTPRYAEYKYSQNEVHGDFRGN--  183

Query  257  LEYMAFNRVYDVANDPKLKDATTYIDPQIFNKAFANSNLDAKNFWIQIGFDVIGRRVKSA  316
            + +   NR++     P L   TT+++    N+ FA +      +W+QI  D+   R+   
Sbjct  184  MAFWHLNRIF--KEKPNLN--TTFVECNPSNRVFATAETSDDKYWVQIYQDIKALRLMPK  239

Query  317  REIPNL  322
               P L
Sbjct  240  YGTPML  245


>gi|492501782|ref|WP_005867318.1| hypothetical protein [Parabacteroides distasonis]
 gi|409230408|gb|EKN23272.1| hypothetical protein HMPREF1059_03257 [Parabacteroides distasonis 
CL09T03C24]
Length=538

 Score = 84.7 bits (208),  Expect = 6e-15, Method: Compositional matrix adjust.
 Identities = 72/246 (29%), Positives = 115/246 (47%), Gaps = 20/246 (8%)

Query  79   LNRIAVSGGSYRDWQEAVFGVRVSRA-AESPIYVGGYASEIVFDEVVSTAAFESGETGQE  137
              R A SG  Y +   + FGVR S A  + P ++GG  + I   EV+ T+A +S      
Sbjct  311  FERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSEVLQTSATDS----TS  366

Query  138  PLGSLAGRGRETSKRGGKNIKIRCEEPSLIMILGSIVPRVDYSQG-NKWWTRVDTMNDFH  196
            P  ++AG G       G   K   EE   I+ + SI PR  Y QG  K + + D M DF+
Sbjct  367  PQANMAGHGISAGVNHG--FKRYFEEHGYIIGIMSIRPRTGYQQGVPKDFRKFDNM-DFY  423

Query  197  KPNLDQIGFQELLSQEMHGRAWRVNANYKTTDFSVGKQPAWTEYTTTVNETYGDFAAGEP  256
             P    +G QE+ ++E++     +     + + + G  P + EY  ++NE +GDF     
Sbjct  424  FPEFAHLGEQEIKNEEVY-----LQQTPASNNGTFGYTPRYAEYKYSMNEVHGDFRGN--  476

Query  257  LEYMAFNRVYDVANDPKLKDATTYIDPQIFNKAFANSNLDAKNFWIQIGFDVIGRRVKSA  316
            + +   NR++  +  P L   TT+++    N+ FA +      +WIQ+  DV   R+   
Sbjct  477  MAFWHLNRIF--SESPNLN--TTFVECNPSNRVFATAETSDDKYWIQLYQDVKALRLMPK  532

Query  317  REIPNL  322
               P L
Sbjct  533  YGTPML  538


>gi|649569140|gb|KDS75238.1| capsid family protein, partial [Parabacteroides distasonis str. 
3999B T(B) 6]
Length=390

 Score = 83.6 bits (205),  Expect = 8e-15, Method: Compositional matrix adjust.
 Identities = 71/246 (29%), Positives = 113/246 (46%), Gaps = 20/246 (8%)

Query  79   LNRIAVSGGSYRDWQEAVFGVRVSRA-AESPIYVGGYASEIVFDEVVSTAAFESGETGQE  137
              R A SG  Y +   + FGVR S A  + P ++GG  + I   EV+ T++ +S      
Sbjct  163  FERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSEVLQTSSTDS----TS  218

Query  138  PLGSLAGRGRETSKRGGKNIKIRCEEPSLIMILGSIVPRVDYSQG-NKWWTRVDTMNDFH  196
            P  ++AG G       G       EE   IM + SI PR  Y QG  K + + D M DF+
Sbjct  219  PQANMAGHGISAGVNHG--FTRYFEEHGYIMGIMSIRPRTGYQQGVPKDFRKFDNM-DFY  275

Query  197  KPNLDQIGFQELLSQEMHGRAWRVNANYKTTDFSVGKQPAWTEYTTTVNETYGDFAAGEP  256
             P    +G QE+ ++E++     +N +    + + G  P + EY  + NE +GDF     
Sbjct  276  FPEFAHLGEQEIKNEELY-----LNESDAANEGTFGYTPRYAEYKYSQNEVHGDFRGN--  328

Query  257  LEYMAFNRVYDVANDPKLKDATTYIDPQIFNKAFANSNLDAKNFWIQIGFDVIGRRVKSA  316
            + +   NR++     P L   TT+++    N+ FA +      +W+QI  D+   R+   
Sbjct  329  MAFWHLNRIFK--EKPNLN--TTFVECNPSNRVFATAETSDDKYWVQIYQDIKALRLMPK  384

Query  317  REIPNL  322
               P L
Sbjct  385  YGTPML  390


>gi|649555287|gb|KDS61824.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 4]
 gi|649560568|gb|KDS66876.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 4]
 gi|649561020|gb|KDS67307.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 4]
 gi|649562724|gb|KDS68908.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 6]
Length=541

 Score = 83.2 bits (204),  Expect = 2e-14, Method: Compositional matrix adjust.
 Identities = 71/246 (29%), Positives = 113/246 (46%), Gaps = 20/246 (8%)

Query  79   LNRIAVSGGSYRDWQEAVFGVRVSRA-AESPIYVGGYASEIVFDEVVSTAAFESGETGQE  137
              R A SG  Y +   + FGVR S A  + P ++GG  + I   EV+ T++ +S      
Sbjct  314  FERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSEVLQTSSTDS----TS  369

Query  138  PLGSLAGRGRETSKRGGKNIKIRCEEPSLIMILGSIVPRVDYSQG-NKWWTRVDTMNDFH  196
            P  ++AG G       G       EE   IM + SI PR  Y QG  K + + D M DF+
Sbjct  370  PQANMAGHGISAGVNHG--FTRYFEEHGYIMGIMSIRPRTGYQQGVPKDFRKFDNM-DFY  426

Query  197  KPNLDQIGFQELLSQEMHGRAWRVNANYKTTDFSVGKQPAWTEYTTTVNETYGDFAAGEP  256
             P    +G QE+ ++E++     +N +    + + G  P + EY  + NE +GDF     
Sbjct  427  FPEFAHLGEQEIKNEELY-----LNESDAANEGTFGYTPRYAEYKYSQNEVHGDFRGN--  479

Query  257  LEYMAFNRVYDVANDPKLKDATTYIDPQIFNKAFANSNLDAKNFWIQIGFDVIGRRVKSA  316
            + +   NR++     P L   TT+++    N+ FA +      +W+QI  D+   R+   
Sbjct  480  MAFWHLNRIFK--EKPNLN--TTFVECNPSNRVFATAETSDDKYWVQIYQDIKALRLMPK  535

Query  317  REIPNL  322
               P L
Sbjct  536  YGTPML  541


>gi|547920049|ref|WP_022322420.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
 gi|524592961|emb|CDD13573.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
Length=553

 Score = 82.0 bits (201),  Expect = 4e-14, Method: Compositional matrix adjust.
 Identities = 73/246 (30%), Positives = 115/246 (47%), Gaps = 20/246 (8%)

Query  79   LNRIAVSGGSYRDWQEAVFGVRVSRA-AESPIYVGGYASEIVFDEVVSTAAFESGETGQE  137
              R A  G  Y +   + FGVR S A  + P ++GG    I   EV+ T++  + ET   
Sbjct  326  FERNARGGSRYIEQILSHFGVRSSDARLQRPQFLGGGRMPISVSEVLQTSS--TDETS--  381

Query  138  PLGSLAGRGRETSKRGGKNIKIRCEEPSLIMILGSIVPRVDYSQG-NKWWTRVDTMNDFH  196
            P  ++AG G       G   K   EE   I+ + SI PR  Y QG  + +T+ D M DF+
Sbjct  382  PQANMAGHGISAGINNG--FKHYFEEHGYIIGIMSITPRSGYQQGVPRDFTKFDNM-DFY  438

Query  197  KPNLDQIGFQELLSQEMHGRAWRVNANYKTTDFSVGKQPAWTEYTTTVNETYGDFAAGEP  256
             P    +  QE+ +QE+      V+ +    + + G  P + EY    +E +GDF     
Sbjct  439  FPEFAHLSEQEIKNQELF-----VSEDAAYNNGTFGYTPRYAEYKYHPSEAHGDFRGN--  491

Query  257  LEYMAFNRVYDVANDPKLKDATTYIDPQIFNKAFANSNLDAKNFWIQIGFDVIGRRVKSA  316
            L +   NR+++  + P L   TT+++ +  N+ FA S  +   FW+Q+  DV   R+   
Sbjct  492  LSFWHLNRIFE--DKPNLN--TTFVECKPSNRVFATSETEDDKFWVQMYQDVKALRLMPK  547

Query  317  REIPNL  322
               P L
Sbjct  548  YGTPML  553


>gi|494610271|ref|WP_007368517.1| capsid protein [Prevotella multiformis]
 gi|324988543|gb|EGC20506.1| putative capsid protein (F protein) [Prevotella multiformis DSM 
16608]
Length=531

 Score = 81.6 bits (200),  Expect = 6e-14, Method: Compositional matrix adjust.
 Identities = 74/272 (27%), Positives = 117/272 (43%), Gaps = 44/272 (16%)

Query  86   GGSYRDWQEAVFGVRV--SRAAESPIYVGGYASEIVFDEVVSTAAFESGETGQEPLGSLA  143
            G  Y    EA FG RV  SRA ++  ++GG+ + +V  EVV+ + F+ G      LG L 
Sbjct  269  GLDYSSQIEAHFGFRVPESRAGDA-RFIGGFDNPVVISEVVNQSEFDRGADESPCLGDLG  327

Query  144  GRGRETSKRGGKNIKIRCEEPSLIMILGSIVPRVDYSQGNKWWTRVDTMN------DFHK  197
            G+G         +I    +E  +IM + S+VP+ +Y+      T  D  N      DF +
Sbjct  328  GKG--VGSLNSSSIDFDVKEHGIIMCIYSVVPQTEYNG-----TYFDPFNRKLRREDFFQ  380

Query  198  PNLDQIGFQELLSQEMHG------------RAWRVNANYKTTDFS-----VGKQPAWTEY  240
            P    +G+Q +++ ++              +  R+ A Y  +        +G Q  + EY
Sbjct  381  PEFADLGYQPVVTSDLISTYLDNPVPDGPEKQKRLAAGYPLSSIEANNRLLGWQVRYNEY  440

Query  241  TTTVNETYGDFAAGEPLEYMAFNRVYDVANDPKLKD----------ATTYIDPQIFNKAF  290
             T+ +  +G+F +G  L Y    R YD   D K  D          A  Y++P I N  F
Sbjct  441  KTSRDLVFGEFESGLSLSYWCSPR-YDFGFDGKAGDKKLVNSPWSPAHFYVNPSILNTIF  499

Query  291  ANSNLDAKNFWIQIGFDVIGRRVKSAREIPNL  322
              S + A +F +   FDV   R  S   +  L
Sbjct  500  LVSAVKADHFLVNSFFDVKAVRPMSVSGLAGL  531


>gi|494308783|ref|WP_007173938.1| hypothetical protein [Prevotella bergensis]
 gi|270333035|gb|EFA43821.1| putative capsid protein (F protein) [Prevotella bergensis DSM 
17361]
Length=553

 Score = 75.5 bits (184),  Expect = 7e-12, Method: Compositional matrix adjust.
 Identities = 67/259 (26%), Positives = 120/259 (46%), Gaps = 30/259 (12%)

Query  46   DGTN--GINEITSVDVTSGLLTMDALILQKKVYDMLNRIAVSGGSYRDWQEAVFGVRVSR  103
            DG+N   +N     D + G  ++ +L     V  +L+    +G +++D   A +GV +  
Sbjct  276  DGSNFTRVNFGVDTDSSEGDFSVSSLRAAFAVDKLLSVTMRAGKTFQDQMRAHYGVEIPD  335

Query  104  AAESPI-YVGGYASEIVFDEVVSTAAFESGETGQEP--LGSLAGRGRETSKRGGKNIKIR  160
            + +  + Y+GG+ S++   +V  T+   + E   E   LG +AG+G   +  G   I   
Sbjct  336  SRDGRVNYLGGFDSDMQVSDVTQTSGTTATEYKPEAGYLGRVAGKG---TGSGRGRIVFD  392

Query  161  CEEPSLIMILGSIVPRVDYSQGNKWWTRVDTM------NDFHKPNLDQIGFQELLSQEMH  214
             +E  ++M + S+VP++ Y       TR+D M       D+  P  + +G Q L S  + 
Sbjct  393  AKEHGVLMCIYSLVPQIQYD-----CTRLDPMVDKLDRFDYFTPEFENLGMQPLNSSYI-  446

Query  215  GRAWRVNANYKTTDFS---VGKQPAWTEYTTTVNETYGDFAAGEPLEYMAFNRVYDVAND  271
                   +++ TTD     +G QP ++EY T ++  +G FA  + L   + +R       
Sbjct  447  -------SSFCTTDPKNPVLGYQPRYSEYKTALDVNHGQFAQSDALSSWSVSRFRRWTTF  499

Query  272  PKLKDATTYIDPQIFNKAF  290
            P+L+ A   IDP   N  F
Sbjct  500  PQLEIADFKIDPGCLNSIF  518


>gi|494306153|ref|WP_007173049.1| hypothetical protein [Prevotella bergensis]
 gi|270333881|gb|EFA44667.1| putative capsid protein (F protein) [Prevotella bergensis DSM 
17361]
Length=519

 Score = 74.7 bits (182),  Expect = 1e-11, Method: Compositional matrix adjust.
 Identities = 71/285 (25%), Positives = 126/285 (44%), Gaps = 32/285 (11%)

Query  22   SQCGLGIRTYLSDRFNNWLNTEWIDGTNG----INEITSVDVTSGLLTMDALILQKKVYD  77
            SQ    I  +  D   N+   ++ D +      +N    VD   G  ++ +L     V  
Sbjct  216  SQLFTFIPEFSDDEHLNFDRDQYADQSKSNFTQLNFPVDVDNNLGYFSVSSLRSAFAVDK  275

Query  78   MLNRIAVSGGSYRDWQEAVFGVRVSRAAESPI-YVGGYASEIVFDEVVSTAAFESGETGQ  136
            +L+    +G +++D   A +GV +  + +  + Y+GG+ S++   +V  T+   + E   
Sbjct  276  LLSVTMRAGKTFQDQMRAHYGVEIPDSRDGRVNYLGGFDSDLQVSDVTQTSGTTATEYKP  335

Query  137  EP--LGSLAGRGRETSKRGGKNIKIRCEEPSLIMILGSIVPRVDYSQGNKWWTRVDTM--  192
            E   LG +AG+G   +  G   I    +E  ++M + S+VP++ Y       TR+D M  
Sbjct  336  EAGYLGRIAGKG---TGSGRGRIVFDAKEHGVLMCIYSLVPQIQYD-----CTRLDPMVD  387

Query  193  ----NDFHKPNLDQIGFQELLSQEMHGRAWRVNANYKTTDFS---VGKQPAWTEYTTTVN  245
                 DF  P  + +G Q L S  +        +++ T D     +G QP ++EY T ++
Sbjct  388  KLDRFDFFTPEFENLGMQPLNSSYI--------SSFCTPDPKNPVLGYQPRYSEYKTALD  439

Query  246  ETYGDFAAGEPLEYMAFNRVYDVANDPKLKDATTYIDPQIFNKAF  290
              +G FA  + L   + +R       P+L+ A   IDP   N  F
Sbjct  440  INHGQFAQNDALSSWSVSRFRRWTTFPQLEIADFKIDPGCLNSVF  484


>gi|609718276|emb|CDN73650.1| conserved hypothetical protein [Elizabethkingia anophelis]
Length=537

 Score = 74.3 bits (181),  Expect = 2e-11, Method: Compositional matrix adjust.
 Identities = 62/244 (25%), Positives = 114/244 (47%), Gaps = 16/244 (7%)

Query  74   KVYDMLNRIAVSGGSYRDWQEAVFGVRVSRA-AESPIYVGGYASEIVFDEVVSTAAFESG  132
            K+ + L + A +G  Y +   + FGV+ S    + P ++GG  S I+  EV+  +A +S 
Sbjct  299  KLQEWLEKNARAGSRYAESILSFFGVKTSDGRLQRPEFLGGNKSPIMISEVLQQSATDS-  357

Query  133  ETGQEPLGSLAGRGRETSKRGGKNIKIRCEEPSLIMILGSIVPRVDYSQG-NKWWTRVDT  191
                 P G++AG G    K GG       EE   ++ L S++P+  YSQG  + +++ D 
Sbjct  358  ---TTPQGNMAGHGIGIGKDGG--FSRFFEEHGYVIGLMSVIPKTSYSQGIPRHFSKSDK  412

Query  192  MNDFHKPNLDQIGFQELLSQEMHGRAWRVNANYKTTDFSVGKQPAWTEYTTTVNETYGDF  251
              D+  P  + IG Q + ++E+  +    N +   ++   G  P ++EY  + +  +GDF
Sbjct  413  F-DYFWPQFEHIGEQPVYNKEIFAK----NIDAFDSEAVFGYLPRYSEYKFSPSTVHGDF  467

Query  252  AAGEPLEYMAFNRVYDVANDPKLKDATTYIDPQIFNKAFANSNLDAKNFWIQIGFDVIGR  311
               + L +    R++D    P L  +    D    ++ FA  + D   F+  +   +  +
Sbjct  468  K--DDLYFWHLGRIFDTDKPPVLNQSFIECDKNALSRIFAVED-DTDKFYCHLYQKITAK  524

Query  312  RVKS  315
            R  S
Sbjct  525  RKMS  528


>gi|496521299|ref|WP_009229582.1| capsid protein [Prevotella sp. oral taxon 317]
 gi|288330570|gb|EFC69154.1| putative capsid protein (F protein) [Prevotella sp. oral taxon 
317 str. F0108]
Length=541

 Score = 71.6 bits (174),  Expect = 1e-10, Method: Compositional matrix adjust.
 Identities = 70/257 (27%), Positives = 110/257 (43%), Gaps = 28/257 (11%)

Query  46   DGTNGINEITSVDVTSGLLTMDALILQKKVYDMLNRIAVSGGSYRDWQEAVFGVRVSRAA  105
            DG +    + S DV +      A  L K    +L+    +G +Y +  EA FGV VS   
Sbjct  268  DGNSAKLNMASPDVLNVSAIRSAFALDK----LLSISMRAGKTYAEQIEAHFGVTVSEGR  323

Query  106  ESPIY-VGGYASEIVFDEVVSTAAFESGETGQEPLGSLAGR-GRETSKRGGK---NIKIR  160
            +  +Y +GG+ S +   +V  T+   +    +     LAG  G+ T K  G     I+  
Sbjct  324  DGQVYYLGGFDSNVQVGDVTQTSGTTNPNVSEVGNAKLAGYLGKITGKGTGSGYGEIQFD  383

Query  161  CEEPSLIMILGSIVPRVDYSQGNKWWTRVD------TMNDFHKPNLDQIGFQELLSQEMH  214
             +EP ++M + S+VP + Y        R+D      T  D+  P  + +G Q ++     
Sbjct  384  AKEPGVLMCIYSVVPAMQYD-----CMRLDPFVAKQTRGDYFIPEFENLGMQPIVPA---  435

Query  215  GRAWRVNANYKTTDFSVGKQPAWTEYTTTVNETYGDFAAGEPLEYMAFNRVYDVANDPKL  274
                 V+ N +  D S G QP ++EY T  +  +G FA GEPL Y +  R          
Sbjct  436  ----FVSLN-RAKDNSYGWQPRYSEYKTAFDINHGQFANGEPLSYWSIARARGSDTLNTF  490

Query  275  KDATTYIDPQIFNKAFA  291
              A   I+P   +  FA
Sbjct  491  NVAALKINPHWLDSVFA  507



Lambda      K        H        a         alpha
   0.317    0.134    0.404    0.792     4.96 

Gapped
Lambda      K        H        a         alpha    sigma
   0.267   0.0410    0.140     1.90     42.6     43.6 

Effective search space used: 1793877651450
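
Note on the statistics above: the bit scores in the alignment headers appear to follow the standard Karlin-Altschul conversion, bits = (Lambda * S - ln K) / ln 2, where S is the raw score shown in parentheses and Lambda and K are taken from the "Gapped" row. The sketch below is a minimal check under that assumption. The function name bit_score and the constant names are illustrative and not part of BLAST, and because Lambda and K are printed to only three significant digits, agreement is expected only to the precision shown in the report. E-values are not recomputed here: with compositional matrix adjustment they depend on per-subject adjustments that this report does not list.

```python
import math


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits: (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


# Gapped Karlin-Altschul parameters copied from the statistics block above.
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410

# (raw score, reported bit score) pairs copied from three alignment headers.
REPORTED = [(205, 83.6), (208, 84.7), (174, 71.6)]

for raw, reported in REPORTED:
    computed = bit_score(raw, GAPPED_LAMBDA, GAPPED_K)
    print(f"raw={raw}  computed={computed:.1f}  reported={reported}")
```

Any other (raw score, bits) pair from the report can be appended to REPORTED to repeat the check.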