           BLASTP 2.2.30+


Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.


Reference for composition-based statistics: Alejandro A. Schaffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.



Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
           49,011,213 sequences; 17,563,301,199 total letters





Query= Contig-21_CDS_annotation_glimmer3.pl_2_7

Length=336
                                                                      Score     E
Sequences producing significant alignments:                          (Bits)  Value

gi|492501782|ref|WP_005867318.1|  hypothetical protein                79.3    4e-13
gi|649557305|gb|KDS63784.1|  capsid family protein                    75.1    2e-12
gi|547920049|ref|WP_022322420.1|  capsid protein VP1                  76.6    4e-12
gi|649569140|gb|KDS75238.1|  capsid family protein                    74.7    1e-11
gi|649555287|gb|KDS61824.1|  capsid family protein                    74.3    2e-11
gi|547312923|ref|WP_022044635.1|  putative uncharacterized protein    72.0    6e-11
gi|609718276|emb|CDN73650.1|  conserved hypothetical protein          71.6    2e-10
gi|639237429|ref|WP_024568106.1|  hypothetical protein                69.7    7e-10
gi|599087551|gb|AHN52701.1|  major capsid protein                     57.4    2e-06
gi|599088027|gb|AHN52939.1|  major capsid protein                     57.0    3e-06


>gi|492501782|ref|WP_005867318.1| hypothetical protein [Parabacteroides distasonis]
 gi|409230408|gb|EKN23272.1| hypothetical protein HMPREF1059_03257 [Parabacteroides distasonis 
CL09T03C24]
Length=538

 Score = 79.3 bits (194),  Expect = 4e-13, Method: Compositional matrix adjust.
 Identities = 73/266 (27%), Positives = 120/266 (45%), Gaps = 22/266 (8%)

Query  77   IDVSSGSLTMDTLNLAKKVYDMLNRIAVSGGTYQDWIQT---VYTNDYIERSETPVYEGG  133
            ++V    ++++ L  +  +     R A SG  Y + I +   V ++D   R + P + GG
Sbjct  289  VNVDELGVSINDLRTSNALQRWFERNARSGSRYIEQILSHFGVRSSD--ARLQRPQFLGG  346

Query  134  FSSEIIFQEVISNSATEN-EPLGTLAGRGQNTGMKGGTVKIKIDEPSYIIGIVSITPRID  192
              + I   EV+  SAT++  P   +AG G + G+  G  K   +E  YIIGI+SI PR  
Sbjct  347  GRTPISVSEVLQTSATDSTSPQANMAGHGISAGVNHG-FKRYFEEHGYIIGIMSIRPRTG  405

Query  193  YSQGNRFDVDLDTLD--DLHKPALDAIGFQDLTTNKMAWWDETITADGEKQLKSVGKQPA  250
            Y QG     D    D  D + P    +G Q++   ++       + +G     + G  P 
Sbjct  406  YQQG--VPKDFRKFDNMDFYFPEFAHLGEQEIKNEEVYLQQTPASNNG-----TFGYTPR  458

Query  251  WLDYMTNYNKVFGNFAIKDSEMFMTLNRNYEMDENKSIADLTTYIDPEKYNYVFADTSLN  310
            + +Y  + N+V G+F  + +  F  LNR +    N +    TT+++    N VFA    +
Sbjct  459  YAEYKYSMNEVHGDF--RGNMAFWHLNRIFSESPNLN----TTFVECNPSNRVFATAETS  512

Query  311  AMNFWVQLGIGAKVRRKMSAKVIPNL  336
               +W+QL    K  R M     P L
Sbjct  513  DDKYWIQLYQDVKALRLMPKYGTPML  538


>gi|649557305|gb|KDS63784.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 4]
 gi|649559156|gb|KDS65543.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 6]
Length=245

 Score = 75.1 bits (183),  Expect = 2e-12, Method: Compositional matrix adjust.
 Identities = 68/244 (28%), Positives = 111/244 (45%), Gaps = 22/244 (9%)

Query  99   LNRIAVSGGTYQDWIQT---VYTNDYIERSETPVYEGGFSSEIIFQEVISNSATEN-EPL  154
              R A SG  Y + I +   V ++D   R + P + GG  + I   EV+  S+T++  P 
Sbjct  18   FERNARSGSRYIEQILSHFGVRSSD--ARLQRPQFLGGGRTPISVSEVLQTSSTDSTSPQ  75

Query  155  GTLAGRGQNTGMKGGTVKIKIDEPSYIIGIVSITPRIDYSQGNRFDVDLDTLD--DLHKP  212
              +AG G + G+  G  +   +E  YI+GI+SI PR  Y QG     D    D  D + P
Sbjct  76   ANMAGHGISAGVNHGFTRY-FEEHGYIMGIMSIRPRTGYQQG--VPKDFRKFDNMDFYFP  132

Query  213  ALDAIGFQDLTTNKMAWWDETITADGEKQLKSVGKQPAWLDYMTNYNKVFGNFAIKDSEM  272
                +G Q++   ++   +     +G     + G  P + +Y  + N+V G+F  + +  
Sbjct  133  EFAHLGEQEIKNEELYLNESDAANEG-----TFGYTPRYAEYKYSQNEVHGDF--RGNMA  185

Query  273  FMTLNRNYEMDENKSIADLTTYIDPEKYNYVFADTSLNAMNFWVQLGIGAKVRRKMSAKV  332
            F  LNR ++   N +    TT+++    N VFA    +   +WVQ+    K  R M    
Sbjct  186  FWHLNRIFKEKPNLN----TTFVECNPSNRVFATAETSDDKYWVQIYQDIKALRLMPKYG  241

Query  333  IPNL  336
             P L
Sbjct  242  TPML  245


>gi|547920049|ref|WP_022322420.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
 gi|524592961|emb|CDD13573.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
Length=553

 Score = 76.6 bits (187),  Expect = 4e-12, Method: Compositional matrix adjust.
 Identities = 72/269 (27%), Positives = 117/269 (43%), Gaps = 18/269 (7%)

Query  72   NEITAIDVSSGSLTMDTLNLAKKVYDMLNRIAVSGGTYQDWIQT---VYTNDYIERSETP  128
            N    ++V    + ++ L  +  +     R A  G  Y + I +   V ++D   R + P
Sbjct  299  NGTLKVNVDEMGININDLRTSNALQRWFERNARGGSRYIEQILSHFGVRSSD--ARLQRP  356

Query  129  VYEGGFSSEIIFQEVISNSAT-ENEPLGTLAGRGQNTGMKGGTVKIKIDEPSYIIGIVSI  187
             + GG    I   EV+  S+T E  P   +AG G + G+  G  K   +E  YIIGI+SI
Sbjct  357  QFLGGGRMPISVSEVLQTSSTDETSPQANMAGHGISAGINNG-FKHYFEEHGYIIGIMSI  415

Query  188  TPRIDYSQGNRFDVDLDTLDDLHKPALDAIGFQDLTTNKMAWWDETITADGEKQLKSVGK  247
            TPR  Y QG   D       D + P    +  Q++   ++      ++ D      + G 
Sbjct  416  TPRSGYQQGVPRDFTKFDNMDFYFPEFAHLSEQEIKNQEL-----FVSEDAAYNNGTFGY  470

Query  248  QPAWLDYMTNYNKVFGNFAIKDSEMFMTLNRNYEMDENKSIADLTTYIDPEKYNYVFADT  307
             P + +Y  + ++  G+F  + +  F  LNR +E   N +    TT+++ +  N VFA +
Sbjct  471  TPRYAEYKYHPSEAHGDF--RGNLSFWHLNRIFEDKPNLN----TTFVECKPSNRVFATS  524

Query  308  SLNAMNFWVQLGIGAKVRRKMSAKVIPNL  336
                  FWVQ+    K  R M     P L
Sbjct  525  ETEDDKFWVQMYQDVKALRLMPKYGTPML  553


>gi|649569140|gb|KDS75238.1| capsid family protein, partial [Parabacteroides distasonis str. 
3999B T(B) 6]
Length=390

 Score = 74.7 bits (182),  Expect = 1e-11, Method: Compositional matrix adjust.
 Identities = 68/244 (28%), Positives = 111/244 (45%), Gaps = 22/244 (9%)

Query  99   LNRIAVSGGTYQDWIQT---VYTNDYIERSETPVYEGGFSSEIIFQEVISNSATEN-EPL  154
              R A SG  Y + I +   V ++D   R + P + GG  + I   EV+  S+T++  P 
Sbjct  163  FERNARSGSRYIEQILSHFGVRSSD--ARLQRPQFLGGGRTPISVSEVLQTSSTDSTSPQ  220

Query  155  GTLAGRGQNTGMKGGTVKIKIDEPSYIIGIVSITPRIDYSQGNRFDVDLDTLD--DLHKP  212
              +AG G + G+  G  +   +E  YI+GI+SI PR  Y QG     D    D  D + P
Sbjct  221  ANMAGHGISAGVNHGFTRY-FEEHGYIMGIMSIRPRTGYQQG--VPKDFRKFDNMDFYFP  277

Query  213  ALDAIGFQDLTTNKMAWWDETITADGEKQLKSVGKQPAWLDYMTNYNKVFGNFAIKDSEM  272
                +G Q++   ++   +     +G     + G  P + +Y  + N+V G+F  + +  
Sbjct  278  EFAHLGEQEIKNEELYLNESDAANEG-----TFGYTPRYAEYKYSQNEVHGDF--RGNMA  330

Query  273  FMTLNRNYEMDENKSIADLTTYIDPEKYNYVFADTSLNAMNFWVQLGIGAKVRRKMSAKV  332
            F  LNR ++   N +    TT+++    N VFA    +   +WVQ+    K  R M    
Sbjct  331  FWHLNRIFKEKPNLN----TTFVECNPSNRVFATAETSDDKYWVQIYQDIKALRLMPKYG  386

Query  333  IPNL  336
             P L
Sbjct  387  TPML  390


>gi|649555287|gb|KDS61824.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 4]
 gi|649560568|gb|KDS66876.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 4]
 gi|649561020|gb|KDS67307.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 4]
 gi|649562724|gb|KDS68908.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 6]
Length=541

 Score = 74.3 bits (181),  Expect = 2e-11, Method: Compositional matrix adjust.
 Identities = 68/244 (28%), Positives = 111/244 (45%), Gaps = 22/244 (9%)

Query  99   LNRIAVSGGTYQDWIQT---VYTNDYIERSETPVYEGGFSSEIIFQEVISNSATEN-EPL  154
              R A SG  Y + I +   V ++D   R + P + GG  + I   EV+  S+T++  P 
Sbjct  314  FERNARSGSRYIEQILSHFGVRSSD--ARLQRPQFLGGGRTPISVSEVLQTSSTDSTSPQ  371

Query  155  GTLAGRGQNTGMKGGTVKIKIDEPSYIIGIVSITPRIDYSQGNRFDVDLDTLD--DLHKP  212
              +AG G + G+  G  +   +E  YI+GI+SI PR  Y QG     D    D  D + P
Sbjct  372  ANMAGHGISAGVNHGFTRY-FEEHGYIMGIMSIRPRTGYQQG--VPKDFRKFDNMDFYFP  428

Query  213  ALDAIGFQDLTTNKMAWWDETITADGEKQLKSVGKQPAWLDYMTNYNKVFGNFAIKDSEM  272
                +G Q++   ++   +     +G     + G  P + +Y  + N+V G+F  + +  
Sbjct  429  EFAHLGEQEIKNEELYLNESDAANEG-----TFGYTPRYAEYKYSQNEVHGDF--RGNMA  481

Query  273  FMTLNRNYEMDENKSIADLTTYIDPEKYNYVFADTSLNAMNFWVQLGIGAKVRRKMSAKV  332
            F  LNR ++   N +    TT+++    N VFA    +   +WVQ+    K  R M    
Sbjct  482  FWHLNRIFKEKPNLN----TTFVECNPSNRVFATAETSDDKYWVQIYQDIKALRLMPKYG  537

Query  333  IPNL  336
             P L
Sbjct  538  TPML  541


>gi|547312923|ref|WP_022044635.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
 gi|524208404|emb|CCZ76639.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
Length=338

 Score = 72.0 bits (175),  Expect = 6e-11, Method: Compositional matrix adjust.
 Identities = 85/336 (25%), Positives = 133/336 (40%), Gaps = 54/336 (16%)

Query  45   GLALKTYQSDIFNNWINTEWLDGEGGINEITA-----IDVSSG-SLTMDTLNLAKKVYDM  98
            GL    Y  D+F N I      G     EI       +++S+G S+ +  L L  K+ + 
Sbjct  11   GLLSVPYSPDLFGNIIK----QGSSPAVEIEVMNALDLNISTGFSVAVPELRLRTKIQNW  66

Query  99   LNRIAVSGGTYQDWIQTVY-TNDYIERSETPVYEGGFSSEIIFQEVISNSATENEPLGTL  157
            ++R+ VSGG   D  +T++ T         P + G      ++Q  I+ S       G+ 
Sbjct  67   MDRLFVSGGRVGDVFRTLWGTKSSAIYVNKPDFLG------VWQASINPSNVRAMANGSA  120

Query  158  AGRGQNTGMKGGTVKIKID------------EPSYIIGIVSITPRIDYSQGNRFDVDLDT  205
            +G   N G     V    D            EP   + I  + P   YSQG   D+   +
Sbjct  121  SGEDANLGQLAACVDRYCDFSGHSGIDYYAKEPGTFMLITMLVPEPAYSQGLHPDLASIS  180

Query  206  LDDLHKPALDAIGFQDLTTNKMA-----------------WWDETITAD-GEKQLKSVGK  247
              D   P L+ IGFQ +  ++ +                 W+  T T    +  + SVG+
Sbjct  181  FGDDFNPELNGIGFQLVPRHRFSMMPRGFNFTGLDQEASPWFGHTGTGVLVDPNMVSVGE  240

Query  248  QPAWLDYMTNYNKVFGNFAIKDSEMFMTLNRN---YEMDENKSIAD----LTTYIDPEKY  300
            + AW    T+Y+++ G+FA   +  +  L R    Y  D+            TYI+P  +
Sbjct  241  EVAWSWLRTDYSRLHGDFAQNGNYQYWVLTRRFTTYFPDDGTGFYQDGEYTGTYINPLDW  300

Query  301  NYVFADTSLNAMNFWVQLGIGAKVRRKMSAKVIPNL  336
             YVF D +L A NF         V   +SA  +P L
Sbjct  301  QYVFVDQTLMAGNFAYYGTFDLNVTSSLSANYMPYL  336


>gi|609718276|emb|CDN73650.1| conserved hypothetical protein [Elizabethkingia anophelis]
Length=537

 Score = 71.6 bits (174),  Expect = 2e-10, Method: Compositional matrix adjust.
 Identities = 72/252 (29%), Positives = 112/252 (44%), Gaps = 20/252 (8%)

Query  85   TMDTLNLAKKVYDMLNRIAVSGGTYQDWIQTVY---TNDYIERSETPVYEGGFSSEIIFQ  141
            T++ L  A K+ + L + A +G  Y + I + +   T+D   R + P + GG  S I+  
Sbjct  290  TVNDLRRAFKLQEWLEKNARAGSRYAESILSFFGVKTSD--GRLQRPEFLGGNKSPIMIS  347

Query  142  EVISNSATENE-PLGTLAGRGQNTGMKGGTVKIKIDEPSYIIGIVSITPRIDYSQGNRFD  200
            EV+  SAT++  P G +AG G   G  GG  +   +E  Y+IG++S+ P+  YSQG    
Sbjct  348  EVLQQSATDSTTPQGNMAGHGIGIGKDGGFSRF-FEEHGYVIGLMSVIPKTSYSQGIPRH  406

Query  201  VDLDTLDDLHKPALDAIGFQDLTTNKMAWWDETITADGEKQLKSVGKQPAWLDYMTNYNK  260
                   D   P  + IG Q +  NK  +       D E      G  P + +Y  + + 
Sbjct  407  FSKSDKFDYFWPQFEHIGEQPV-YNKEIFAKNIDAFDSEAVF---GYLPRYSEYKFSPST  462

Query  261  VFGNFAIKDSEMFMTLNRNYEMDENKSIADLTTYIDPEKYNYVFA---DTSLNAMNFWVQ  317
            V G+F  KD   F  L R ++ D+   +       D    + +FA   DT      F+  
Sbjct  463  VHGDF--KDDLYFWHLGRIFDTDKPPVLNQSFIECDKNALSRIFAVEDDTD----KFYCH  516

Query  318  LGIGAKVRRKMS  329
            L      +RKMS
Sbjct  517  LYQKITAKRKMS  528


>gi|639237429|ref|WP_024568106.1| hypothetical protein [Elizabethkingia anophelis]
Length=546

 Score = 69.7 bits (169),  Expect = 7e-10, Method: Compositional matrix adjust.
 Identities = 71/261 (27%), Positives = 122/261 (47%), Gaps = 23/261 (9%)

Query  77   IDVSSGSLTMDTLNLAKKVYDMLNRIAVSGGTYQDWIQTVY---TNDYIERSETPVYEGG  133
            +  +SGS T++ L  A K+ + L + A +G  Y + I + +   T+D   R + P + GG
Sbjct  292  LKTASGS-TINDLRRAFKLQEWLEKNARAGSRYAESILSFFGVKTSD--GRLQRPEFLGG  348

Query  134  FSSEIIFQEVISNSATENE-PLGTLAGRGQNTGMKGGTVKIKIDEPSYIIGIVSITPRID  192
              + I+  EV+  S+T++  P G +AG G + G +GG  K   +E  Y+IG++S+ P+  
Sbjct  349  NKTPILISEVLQQSSTDSTTPQGNMAGHGISVGKEGGFSKF-FEEHGYVIGLMSVIPKTS  407

Query  193  YSQG-NRFDVDLDTLDDLHKPALDAIGFQDLTTNKMAWWDETITADGEKQLKS---VGKQ  248
            YSQG  R     D  D    P  + IG Q +       +++ I A       S    G  
Sbjct  408  YSQGIPRHFSKFDKFDYFW-PQFEHIGEQPV-------YNKEIFAKNVGDYDSGGVFGYV  459

Query  249  PAWLDYMTNYNKVFGNFAIKDSEMFMTLNRNYEMDENKSIADLTTYIDPEKYNYVFADTS  308
            P + +Y  + + + G+F  KD+  F  L R ++      +      ++    + +FA   
Sbjct  460  PRYSEYKYSPSTIHGDF--KDTLYFWHLGRIFDSSAPPKLNRDFIEVNKSGLSRIFA-VE  516

Query  309  LNAMNFWVQLGIGAKVRRKMS  329
             N+  F+  L      +RKMS
Sbjct  517  DNSDKFYCHLYQKITAKRKMS  537


>gi|599087551|gb|AHN52701.1| major capsid protein, partial [uncultured Gokushovirinae]
Length=220

 Score = 57.4 bits (137),  Expect = 2e-06, Method: Compositional matrix adjust.
 Identities = 45/139 (32%), Positives = 64/139 (46%), Gaps = 2/139 (1%)

Query  83   SLTMDTLNLAKKVYDMLNRIAVSGGTYQDWIQTVY-TNDYIERSETPVYEGGFSSEIIFQ  141
            + T++ L  A ++  +L R A  G  Y + IQ  +       R + P Y GG ++ II  
Sbjct  75   AATINQLRQAFQIQKLLERDARGGTRYTEIIQAHFGVTSPDARLQRPEYLGGGTTPIIIS  134

Query  142  EVISNSATENEPLGTLAGRGQNTGMKGGTVKIKIDEPSYIIGIVSITPRIDYSQGNRFDV  201
            +V   S ++  P GTLA  G  T  K G  K    E   IIG+ S+   + Y QG     
Sbjct  135  QVPQTSESDGTPQGTLAAYGTATMRKAGFTK-SFTEHCVIIGLASVRADLTYQQGLERMW  193

Query  202  DLDTLDDLHKPALDAIGFQ  220
               T  D++ PAL  IG Q
Sbjct  194  SRQTRYDVYWPALAMIGEQ  212


>gi|599088027|gb|AHN52939.1| major capsid protein, partial [uncultured Gokushovirinae]
Length=219

 Score = 57.0 bits (136),  Expect = 3e-06, Method: Compositional matrix adjust.
 Identities = 45/158 (28%), Positives = 74/158 (47%), Gaps = 3/158 (2%)

Query  67   GEGGI--NEITAIDVSSGSLTMDTLNLAKKVYDMLNRIAVSGGTYQDWIQTVYTNDYIER  124
            G+ G+  N++ A    + + T++ L  A ++  +L R A SG  Y + ++  +  ++++ 
Sbjct  57   GDAGVQANQLYADLSQATAATINQLRQAFQIQKLLERDARSGTRYAEIVKAHFGVNFMDV  116

Query  125  SETPVYEGGFSSEIIFQEVISNSATENEPLGTLAGRGQNTGMKGGTVKIKIDEPSYIIGI  184
            +  P + GG S+ I    V   S +   P GTLA  G  T   GG  K    E   ++GI
Sbjct  117  TYRPEFLGGTSTPINVTSVPQTSESGTTPQGTLAAFGTATVNGGGFTK-SFTEHCIVMGI  175

Query  185  VSITPRIDYSQGNRFDVDLDTLDDLHKPALDAIGFQDL  222
             S+   + Y QG        T  D + PAL  IG Q +
Sbjct  176  ASVRADLTYQQGLNRMFSRSTRYDFYFPALAHIGEQAV  213



Lambda      K        H        a         alpha
   0.314    0.132    0.385    0.792     4.96 

Gapped
Lambda      K        H        a         alpha    sigma
   0.267   0.0410    0.140     1.90     42.6     43.6 

Effective search space used: 1938212634900
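
Note on the score columns: each alignment reports a bit score followed by a raw score in parentheses, e.g. "79.3 bits (194)". The sketch below shows how the two appear to be related through the gapped Lambda and K values printed above, using the standard Karlin-Altschul normalization S' = (Lambda * S - ln K) / ln 2. The function name and constants are illustrative choices for this note, not part of the BLAST output. The reported Expect values are not reproduced here, because composition-based adjustment and per-subject length corrections also enter that calculation.

```python
import math

# Gapped statistical parameters copied from the report footer.
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410


def bit_score(raw_score, lam=GAPPED_LAMBDA, k=GAPPED_K):
    """Convert a raw alignment score to a bit score.

    Uses the standard normalization S' = (lambda * S - ln K) / ln 2.
    """
    return (lam * raw_score - math.log(k)) / math.log(2)


# (raw score, reported bit score) pairs taken from the alignments above.
reported = [(194, 79.3), (183, 75.1), (187, 76.6), (137, 57.4), (136, 57.0)]

for raw, bits in reported:
    print(f"raw={raw:4d}  computed={bit_score(raw):6.2f}  reported={bits}")
```

Because Lambda and K are printed to only three significant figures, the computed values agree with the reported bit scores to about one decimal place rather than exactly.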