bitscore colors: <40, 40-50, 50-80, 80-200, >200




           BLASTP 2.2.30+


Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.


Reference for composition-based statistics: Alejandro A. Schaffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.



Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
           49,011,213 sequences; 17,563,301,199 total letters



Query= Contig-36_CDS_annotation_glimmer3.pl_2_1

Length=264
                                                                      Score     E
Sequences producing significant alignments:                          (Bits)  Value

gi|444298142|dbj|GAC77768.1|  major capsid protein                    52.0    1e-04
gi|609718276|emb|CDN73650.1|  conserved hypothetical protein          52.4    1e-04
gi|639237429|ref|WP_024568106.1|  hypothetical protein                52.0    2e-04
gi|492501782|ref|WP_005867318.1|  hypothetical protein                51.6    2e-04
gi|547312923|ref|WP_022044635.1|  putative uncharacterized protein    50.8    4e-04
gi|599087433|gb|AHN52642.1|  major capsid protein                     49.3    0.001
gi|575096056|emb|CDL66947.1|  unnamed protein product                 48.5    0.002
gi|649557305|gb|KDS63784.1|  capsid family protein                    47.8    0.003
gi|649569140|gb|KDS75238.1|  capsid family protein                    46.6    0.010
gi|444297974|dbj|GAC77852.1|  major capsid protein                    45.4    0.014


>gi|444298142|dbj|GAC77768.1| major capsid protein [uncultured marine virus]
Length=299

 Score = 52.0 bits (123),  Expect = 1e-04, Method: Compositional matrix adjust.
 Identities = 37/146 (25%), Positives = 66/146 (45%), Gaps = 4/146 (3%)

Query  2    AGVQTIPQLAIASRLQEYKDLLGAGGSRYSDWLETF-FASKIEHVDRPKLIFSASQTVNV  60
            AG   I  L  A  LQ Y++     G+R++++L     +S    + RP++I +    +N 
Sbjct  116  AGAININDLREAFALQRYQEARNLYGARFTEYLRYLGISSSXGRLQRPEMISTGKSNINF  175

Query  61   QVIMNQSGLNNFSGTQPLGQQGGAIAFNDRLGRRQSYYFREPGYMIDMLSIRP-VYYWAT  119
              ++N +G +      PLG+ GG          R  Y+  E G++I ++S+RP   Y  T
Sbjct  176  SEVLNTTGPSGVD-DHPLGEMGGH-GIAGVKSNRARYFCEEHGHIISLMSVRPKTIYMTT  233

Query  120  ITPDYLSYRGADYFNPIYNDIGYQDI  145
                +      DY+      IG +++
Sbjct  234  QHKQFDRESKEDYWQKELQAIGMEEV  259


>gi|609718276|emb|CDN73650.1| conserved hypothetical protein [Elizabethkingia anophelis]
Length=537

 Score = 52.4 bits (124),  Expect = 1e-04, Method: Compositional matrix adjust.
 Identities = 37/145 (26%), Positives = 67/145 (46%), Gaps = 7/145 (5%)

Query  4    VQTIPQLAIASRLQEYKDLLGAGGSRYSDWLETFFASKIE--HVDRPKLIFSASQTVNVQ  61
            V T+  L  A +LQE+ +     GSRY++ + +FF  K     + RP+ +      + + 
Sbjct  288  VSTVNDLRRAFKLQEWLEKNARAGSRYAESILSFFGVKTSDGRLQRPEFLGGNKSPIMIS  347

Query  62   VIMNQSGLNNFSGTQPLGQQGGAIAFNDRLGRRQSYYFREPGYMIDMLSIRPVYYWATIT  121
             ++ QS  ++ +    +   G  I  +    R    +F E GY+I ++S+ P   ++   
Sbjct  348  EVLQQSATDSTTPQGNMAGHGIGIGKDGGFSR----FFEEHGYVIGLMSVIPKTSYSQGI  403

Query  122  PDYLSYRGA-DYFNPIYNDIGYQDI  145
            P + S     DYF P +  IG Q +
Sbjct  404  PRHFSKSDKFDYFWPQFEHIGEQPV  428


>gi|639237429|ref|WP_024568106.1| hypothetical protein [Elizabethkingia anophelis]
Length=546

 Score = 52.0 bits (123),  Expect = 2e-04, Method: Compositional matrix adjust.
 Identities = 42/148 (28%), Positives = 70/148 (47%), Gaps = 9/148 (6%)

Query  2    AGVQTIPQLAIASRLQEYKDLLGAGGSRYSDWLETFFASKIE--HVDRPKLIFSASQTVN  59
            A   TI  L  A +LQE+ +     GSRY++ + +FF  K     + RP+ +      + 
Sbjct  295  ASGSTINDLRRAFKLQEWLEKNARAGSRYAESILSFFGVKTSDGRLQRPEFLGGNKTPIL  354

Query  60   VQVIMNQSGLNNFSGTQPLGQQGG-AIAFNDRLGRRQSYYFREPGYMIDMLSIRPVYYWA  118
            +  ++ QS  ++   T P G   G  I+     G   S +F E GY+I ++S+ P   ++
Sbjct  355  ISEVLQQSSTDS---TTPQGNMAGHGISVGKEGGF--SKFFEEHGYVIGLMSVIPKTSYS  409

Query  119  TITPDYLS-YRGADYFNPIYNDIGYQDI  145
               P + S +   DYF P +  IG Q +
Sbjct  410  QGIPRHFSKFDKFDYFWPQFEHIGEQPV  437


>gi|492501782|ref|WP_005867318.1| hypothetical protein [Parabacteroides distasonis]
 gi|409230408|gb|EKN23272.1| hypothetical protein HMPREF1059_03257 [Parabacteroides distasonis 
CL09T03C24]
Length=538

 Score = 51.6 bits (122),  Expect = 2e-04, Method: Compositional matrix adjust.
 Identities = 43/186 (23%), Positives = 79/186 (42%), Gaps = 9/186 (5%)

Query  6    TIPQLAIASRLQEYKDLLGAGGSRYSDWLETFFA--SKIEHVDRPKLIFSASQTVNVQVI  63
            +I  L  ++ LQ + +     GSRY + + + F   S    + RP+ +      ++V  +
Sbjct  297  SINDLRTSNALQRWFERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSEV  356

Query  64   MNQSGLNNFSGTQPLGQQGGAIAFNDRLGRRQSYYFREPGYMIDMLSIRP-VYYWATITP  122
            +  S  ++ S    +   G +   N    R    YF E GY+I ++SIRP   Y   +  
Sbjct  357  LQTSATDSTSPQANMAGHGISAGVNHGFKR----YFEEHGYIIGIMSIRPRTGYQQGVPK  412

Query  123  DYLSYRGADYFNPIYNDIGYQDISISRI--ANCPPQIDNVAVQEPCFNEFRASYDEVLGQ  180
            D+  +   D++ P +  +G Q+I    +     P   +      P + E++ S +EV G 
Sbjct  413  DFRKFDNMDFYFPEFAHLGEQEIKNEEVYLQQTPASNNGTFGYTPRYAEYKYSMNEVHGD  472

Query  181  LSSVFA  186
                 A
Sbjct  473  FRGNMA  478


>gi|547312923|ref|WP_022044635.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
 gi|524208404|emb|CCZ76639.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
Length=338

 Score = 50.8 bits (120),  Expect = 4e-04, Method: Compositional matrix adjust.
 Identities = 46/158 (29%), Positives = 71/158 (45%), Gaps = 15/158 (9%)

Query  7    IPQLAIASRLQEYKDLLGAGGSRYSDWLETFFASKIE--HVDRPKLIFSASQTVN---VQ  61
            +P+L + +++Q + D L   G R  D   T + +K    +V++P  +     ++N   V+
Sbjct  54   VPELRLRTKIQNWMDRLFVSGGRVGDVFRTLWGTKSSAIYVNKPDFLGVWQASINPSNVR  113

Query  62   VIMNQSGLNNFSGTQP-LGQQGGAI-AFNDRLGRRQ-SYYFREPG--YMIDMLSIRPVYY  116
             + N S     SG    LGQ    +  + D  G     YY +EPG   +I ML   P Y 
Sbjct  114  AMANGSA----SGEDANLGQLAACVDRYCDFSGHSGIDYYAKEPGTFMLITMLVPEPAYS  169

Query  117  WATITPDYLSYRGADYFNPIYNDIGYQDISISRIANCP  154
               + PD  S    D FNP  N IG+Q +   R +  P
Sbjct  170  -QGLHPDLASISFGDDFNPELNGIGFQLVPRHRFSMMP  206


>gi|599087433|gb|AHN52642.1| major capsid protein, partial [uncultured Gokushovirinae]
Length=281

 Score = 49.3 bits (116),  Expect = 0.001, Method: Compositional matrix adjust.
 Identities = 43/150 (29%), Positives = 69/150 (46%), Gaps = 11/150 (7%)

Query  6    TIPQLAIASRLQEYKDLLGAGGSRYSDWLETFFA--SKIEHVDRPKLIFSASQTVNVQVI  63
            TI  L  A ++Q++ + L  GGSRY++ L +FF   S    + RP+ + S ++ VNV  I
Sbjct  137  TINGLRTAFQMQKFYERLARGGSRYTEVLRSFFGVVSPDARLQRPEFLGSFTKMVNVNPI  196

Query  64   MNQSGLNNFSGTQPLGQQGGAIAFNDRLGRRQSYY--FREPGYMIDMLSIRP-VYYWATI  120
               S  +N   T P   QG   A+     +   +   F E GY+   +  R  + Y   I
Sbjct  197  AQTSATDN---TSP---QGNLSAYGVTAAKFHGFTKSFVEHGYIFGFVCARADLTYQQGI  250

Query  121  TPDYLSYRGADYFNPIYNDIGYQDISISRI  150
               +L     D++ P +  +G Q I +  I
Sbjct  251  NKMWLRSTVYDFYWPTFAHLGEQAIELREI  280


>gi|575096056|emb|CDL66947.1| unnamed protein product [uncultured bacterium]
Length=570

 Score = 48.5 bits (114),  Expect = 0.002, Method: Compositional matrix adjust.
 Identities = 47/187 (25%), Positives = 85/187 (45%), Gaps = 10/187 (5%)

Query  6    TIPQLAIASRLQEYKDLLGAGGSRYSDWLETFFA--SKIEHVDRPKLIFSASQTVNVQVI  63
            TI QL +A ++Q++ +    GGSRY++ + +FF   S    + R + +      +N+  +
Sbjct  318  TINQLRMAFQIQKFYEKQARGGSRYTEVIRSFFGVTSPDARLQRSEYLGGNRIPININQV  377

Query  64   MNQSGLNNFSGTQPLGQQGGAIAFNDRLGRRQSYY--FREPGYMIDMLSIRPVY-YWATI  120
            + QSG  + S T     QG  +  +        +   F E G++I ++  R  + Y   I
Sbjct  378  IQQSGTGSASTT----PQGTVVGMSQTTDTHSDFTKSFTEHGFIIGVMCARYDHTYQQGI  433

Query  121  TPDYLSYRGADYFNPIYNDIGYQDISISRI-ANCPPQIDNVAVQEPCFNEFRASYDEVLG  179
               +      DY+ P++++IG Q I    I A      D V   +  + E+R     V G
Sbjct  434  DRMWSRKDKFDYYWPVFSNIGEQAIKNKEIYAQGNATDDEVFGYQEAWAEYRYKPSRVTG  493

Query  180  QLSSVFA  186
            ++ S +A
Sbjct  494  EMRSSYA  500


>gi|649557305|gb|KDS63784.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 4]
 gi|649559156|gb|KDS65543.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 6]
Length=245

 Score = 47.8 bits (112),  Expect = 0.003, Method: Compositional matrix adjust.
 Identities = 41/185 (22%), Positives = 77/185 (42%), Gaps = 9/185 (5%)

Query  7    IPQLAIASRLQEYKDLLGAGGSRYSDWLETFFA--SKIEHVDRPKLIFSASQTVNVQVIM  64
            I  +  ++ LQ + +     GSRY + + + F   S    + RP+ +      ++V  ++
Sbjct  5    INDIRTSNALQRWFERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSEVL  64

Query  65   NQSGLNNFSGTQPLGQQGGAIAFNDRLGRRQSYYFREPGYMIDMLSIRP-VYYWATITPD  123
              S  ++ S    +   G +   N    R    YF E GY++ ++SIRP   Y   +  D
Sbjct  65   QTSSTDSTSPQANMAGHGISAGVNHGFTR----YFEEHGYIMGIMSIRPRTGYQQGVPKD  120

Query  124  YLSYRGADYFNPIYNDIGYQDISISRIANCPPQIDNVAV--QEPCFNEFRASYDEVLGQL  181
            +  +   D++ P +  +G Q+I    +        N       P + E++ S +EV G  
Sbjct  121  FRKFDNMDFYFPEFAHLGEQEIKNEELYLNESDAANEGTFGYTPRYAEYKYSQNEVHGDF  180

Query  182  SSVFA  186
                A
Sbjct  181  RGNMA  185


>gi|649569140|gb|KDS75238.1| capsid family protein, partial [Parabacteroides distasonis str. 
3999B T(B) 6]
Length=390

 Score = 46.6 bits (109),  Expect = 0.010, Method: Compositional matrix adjust.
 Identities = 41/185 (22%), Positives = 77/185 (42%), Gaps = 9/185 (5%)

Query  7    IPQLAIASRLQEYKDLLGAGGSRYSDWLETFFA--SKIEHVDRPKLIFSASQTVNVQVIM  64
            I  +  ++ LQ + +     GSRY + + + F   S    + RP+ +      ++V  ++
Sbjct  150  INDIRTSNALQRWFERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSEVL  209

Query  65   NQSGLNNFSGTQPLGQQGGAIAFNDRLGRRQSYYFREPGYMIDMLSIRP-VYYWATITPD  123
              S  ++ S    +   G +   N    R    YF E GY++ ++SIRP   Y   +  D
Sbjct  210  QTSSTDSTSPQANMAGHGISAGVNHGFTR----YFEEHGYIMGIMSIRPRTGYQQGVPKD  265

Query  124  YLSYRGADYFNPIYNDIGYQDISISRIANCPPQIDNVAV--QEPCFNEFRASYDEVLGQL  181
            +  +   D++ P +  +G Q+I    +        N       P + E++ S +EV G  
Sbjct  266  FRKFDNMDFYFPEFAHLGEQEIKNEELYLNESDAANEGTFGYTPRYAEYKYSQNEVHGDF  325

Query  182  SSVFA  186
                A
Sbjct  326  RGNMA  330


>gi|444297974|dbj|GAC77852.1| major capsid protein [uncultured marine virus]
Length=202

 Score = 45.4 bits (106),  Expect = 0.014, Method: Compositional matrix adjust.
 Identities = 40/153 (26%), Positives = 69/153 (45%), Gaps = 9/153 (6%)

Query  1    MAGVQTIPQLAI-----ASRLQEYKDLLGAGGSRYSDWLETF-FASKIEHVDRPKLIFSA  54
            +A + T  Q+++     A  LQ Y++     GSRYS++L      S    + RP+ +   
Sbjct  18   VADLSTAEQISVNDFRRAFALQRYQEARSKYGSRYSEYLRFLGIKSSDARLQRPEFLGGG  77

Query  55   SQTVNVQVIMN-QSGLNNFSGTQPLGQQGGAIAFNDRLGRRQSYYFREPGYMIDMLSIRP  113
              T+N   ++N  S ++  +    LG  GG    + R  + +  +F E G++I ++S+RP
Sbjct  78   KTTINFSEVLNTTSSVDGATPVDDLGHMGGHGIASMRSNKYRR-FFEEHGHVITLMSVRP  136

Query  114  -VYYWATITPDYLSYRGADYFNPIYNDIGYQDI  145
               Y   +   +      DYF      IG Q I
Sbjct  137  KTMYSNGLHRKFSRTTKEDYFQRELQYIGQQPI  169



Lambda      K        H        a         alpha
   0.319    0.135    0.400    0.792     4.96 

Gapped
Lambda      K        H        a         alpha    sigma
   0.267   0.0410    0.140     1.90     42.6     43.6 

Effective search space used: 1244344362366