Bit score color key: <40, 40-50, 50-80, 80-200, >200




           BLASTP 2.2.30+


Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.


Reference for composition-based statistics: Alejandro A. Schaffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.



Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
           49,011,213 sequences; 17,563,301,199 total letters





Query= Contig-42_CDS_annotation_glimmer3.pl_2_2

Length=496
                                                                      Score     E
Sequences producing significant alignments:                          (Bits)  Value

gi|444298142|dbj|GAC77768.1|  major capsid protein                    56.2    2e-05
gi|547312923|ref|WP_022044635.1|  putative uncharacterized protein    55.8    3e-05
gi|639237429|ref|WP_024568106.1|  hypothetical protein                53.5    3e-04
gi|649557305|gb|KDS63784.1|  capsid family protein                    51.2    6e-04
gi|444298000|dbj|GAC77839.1|  major capsid protein                    52.0    7e-04
gi|609718276|emb|CDN73650.1|  conserved hypothetical protein          51.6    0.001
gi|444297994|dbj|GAC77842.1|  major capsid protein                    49.3    0.001
gi|649569140|gb|KDS75238.1|  capsid family protein                    50.8    0.001
gi|492501782|ref|WP_005867318.1|  hypothetical protein                51.2    0.001
gi|649555287|gb|KDS61824.1|  capsid family protein                    50.8    0.002


>gi|444298142|dbj|GAC77768.1| major capsid protein [uncultured marine virus]
Length=299

 Score = 56.2 bits (134),  Expect = 2e-05, Method: Compositional matrix adjust.
 Identities = 43/153 (28%), Positives = 73/153 (48%), Gaps = 9/153 (6%)

Query  224  VSMSGVTTIPQLAIASRLQEYKDLLGAGGNRYSDWLDTF-FASKIEHVDRPKLLFSASQT  282
            +S +G   I  L  A  LQ Y++     G R++++L     +S    + RP+++ +    
Sbjct  113  LSQAGAININDLREAFALQRYQEARNLYGARFTEYLRYLGISSSXGRLQRPEMISTGKSN  172

Query  283  INVQVIMNQAGPNNFSGLDISGPLGQQGG-AIAFNEQLGRRQSYYFSEPGYLIDMLSIRP  341
            IN   ++N  GP   SG+D   PLG+ GG  IA  +    R  Y+  E G++I ++S+RP
Sbjct  173  INFSEVLNTTGP---SGVD-DHPLGEMGGHGIAGVK--SNRARYFCEEHGHIISLMSVRP  226

Query  342  -VYYWSFVKPDYLNYLGPDYFNPIYNDIGYQDV  373
               Y +     +      DY+      IG ++V
Sbjct  227  KTIYMTTQHKQFDRESKEDYWQKELQAIGMEEV  259


>gi|547312923|ref|WP_022044635.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
 gi|524208404|emb|CCZ76639.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
Length=338

 Score = 55.8 bits (133),  Expect = 3e-05, Method: Compositional matrix adjust.
 Identities = 57/193 (30%), Positives = 83/193 (43%), Gaps = 26/193 (13%)

Query  205  VPSNPDRFSRLIPTGSTSAVSMSGV------------TTIPQLAIASRLQEYKDLLGAGG  252
            VP +PD F  +I  GS+ AV +  +              +P+L + +++Q + D L   G
Sbjct  15   VPYSPDLFGNIIKQGSSPAVEIEVMNALDLNISTGFSVAVPELRLRTKIQNWMDRLFVSG  74

Query  253  NRYSDWLDTFFASKIE--HVDRPKLLFSASQTINVQVIMNQAGPNNFSGLDISGPLGQQG  310
             R  D   T + +K    +V++P  L     +IN   +   A   + SG D +  LGQ  
Sbjct  75   GRVGDVFRTLWGTKSSAIYVNKPDFLGVWQASINPSNVRAMAN-GSASGEDAN--LGQLA  131

Query  311  GAI----AFNEQLGRRQSYYFSEPG--YLIDMLSIRPVYYWSFVKPDYLNYLGPDYFNPI  364
              +     F+   G    YY  EPG   LI ML   P Y    + PD  +    D FNP 
Sbjct  132  ACVDRYCDFSGHSG--IDYYAKEPGTFMLITMLVPEPAYSQG-LHPDLASISFGDDFNPE  188

Query  365  YNDIGYQDVSSAR  377
             N IG+Q V   R
Sbjct  189  LNGIGFQLVPRHR  201


>gi|639237429|ref|WP_024568106.1| hypothetical protein [Elizabethkingia anophelis]
Length=546

 Score = 53.5 bits (127),  Expect = 3e-04, Method: Compositional matrix adjust.
 Identities = 50/188 (27%), Positives = 87/188 (46%), Gaps = 16/188 (9%)

Query  230  TTIPQLAIASRLQEYKDLLGAGGNRYSDWLDTFFASKIE--HVDRPKLLFSASQTINVQV  287
            +TI  L  A +LQE+ +     G+RY++ + +FF  K     + RP+ L      I +  
Sbjct  298  STINDLRRAFKLQEWLEKNARAGSRYAESILSFFGVKTSDGRLQRPEFLGGNKTPILISE  357

Query  288  IMNQAGPNNFSGLDISGPLGQQGG-AIAFNEQLGRRQSYYFSEPGYLIDMLSIRPVYYWS  346
            ++ Q      S  D + P G   G  I+  ++ G   S +F E GY+I ++S+ P   +S
Sbjct  358  VLQQ------SSTDSTTPQGNMAGHGISVGKEGGF--SKFFEEHGYVIGLMSVIPKTSYS  409

Query  347  FVKPDYLNYLGP-DYFNPIYNDIGYQDVSSARI----VFNGNAGATSASEPCFNEFRASY  401
               P + +     DYF P +  IG Q V +  I    V + ++G      P ++E++ S 
Sbjct  410  QGIPRHFSKFDKFDYFWPQFEHIGEQPVYNKEIFAKNVGDYDSGGVFGYVPRYSEYKYSP  469

Query  402  DEVLGQLQ  409
              + G  +
Sbjct  470  STIHGDFK  477


>gi|649557305|gb|KDS63784.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 4]
 gi|649559156|gb|KDS65543.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 6]
Length=245

 Score = 51.2 bits (121),  Expect = 6e-04, Method: Compositional matrix adjust.
 Identities = 44/188 (23%), Positives = 82/188 (44%), Gaps = 19/188 (10%)

Query  232  IPQLAIASRLQEYKDLLGAGGNRYSDWLDTFFA--SKIEHVDRPKLLFSASQTINVQVIM  289
            I  +  ++ LQ + +     G+RY + + + F   S    + RP+ L      I+V  ++
Sbjct  5    INDIRTSNALQRWFERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSEVL  64

Query  290  NQAGPNNFSGLDISGP---LGQQGGAIAFNEQLGRRQSYYFSEPGYLIDMLSIRP-VYYW  345
                    S  D + P   +   G +   N    R    YF E GY++ ++SIRP   Y 
Sbjct  65   QT------SSTDSTSPQANMAGHGISAGVNHGFTR----YFEEHGYIMGIMSIRPRTGYQ  114

Query  346  SFVKPDYLNYLGPDYFNPIYNDIGYQDVSSARIVFNGNAGATSAS---EPCFNEFRASYD  402
              V  D+  +   D++ P +  +G Q++ +  +  N +  A   +    P + E++ S +
Sbjct  115  QGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEELYLNESDAANEGTFGYTPRYAEYKYSQN  174

Query  403  EVLGQLQG  410
            EV G  +G
Sbjct  175  EVHGDFRG  182


>gi|444298000|dbj|GAC77839.1| major capsid protein [uncultured marine virus]
Length=480

 Score = 52.0 bits (123),  Expect = 7e-04, Method: Compositional matrix adjust.
 Identities = 58/273 (21%), Positives = 107/273 (39%), Gaps = 40/273 (15%)

Query  231  TIPQLAIASRLQEYKDLLGAGGNRYSDWLDTFFAS-KIEHVDRPKLLFSASQTINVQVIM  289
            TI  +  A  +Q Y++     G+RY+++L     + K   + RP+ +   +  IN   ++
Sbjct  237  TINDIRRAFAIQRYQEARSRYGSRYTEYLRYLGVNPKDARLQRPEYMGGGTTQINFSEVL  296

Query  290  NQA----GPNNFSGLDISGPLGQQGGAIAFNEQLGRRQSYYFSEPGYLIDMLSIRP-VYY  344
              +    G +  S   +    G   G  A      RR   Y  E GY+I MLS+RP   Y
Sbjct  297  QTSPEIPGEDQVSQFGVGDMYGH--GIAAMRSNKYRR---YIEEHGYIISMLSVRPKTMY  351

Query  345  WSFVKPDYLNYLGPDYFNPIYNDIGYQDVSSARIVFNGNAGA-TSASEPCFNEFRASYDE  403
             + +   +L     DY+      IG Q++ +  I  +  AG  T      ++E+R +   
Sbjct  352  TNGIHRSWLRLTKEDYYQKELEHIGQQEIMNNEIYADEGAGTETFGYNDRYSEYRETPSH  411

Query  404  VLGQLQGYPMPEVDGTSYIPLYAYWVQQRTSKLSDGSGSLPESHYYPIL---FTDMNQVN  460
            V  + +G             +  YW   R            E    P+L   F D +   
Sbjct  412  VSAEFRG-------------ILNYWHMAR------------EFEAPPVLNQSFVDCDATK  446

Query  461  SPFASKVEDNFFVNMSYAVQKKSLVNKTFATRL  493
                 + +D  ++ + + +  + L+++  A R+
Sbjct  447  RIHNEQTQDALWIMIQHKMVARRLLSRNAAPRI  479


>gi|609718276|emb|CDN73650.1| conserved hypothetical protein [Elizabethkingia anophelis]
Length=537

 Score = 51.6 bits (122),  Expect = 0.001, Method: Compositional matrix adjust.
 Identities = 52/190 (27%), Positives = 85/190 (45%), Gaps = 18/190 (9%)

Query  229  VTTIPQLAIASRLQEYKDLLGAGGNRYSDWLDTFFASKIE--HVDRPKLLFSASQTINVQ  286
            V+T+  L  A +LQE+ +     G+RY++ + +FF  K     + RP+ L      I + 
Sbjct  288  VSTVNDLRRAFKLQEWLEKNARAGSRYAESILSFFGVKTSDGRLQRPEFLGGNKSPIMIS  347

Query  287  VIMNQAGPNNFSGLDISGPLGQQGG-AIAFNEQLGRRQSYYFSEPGYLIDMLSIRPVYYW  345
             ++ Q      S  D + P G   G  I   +  G   S +F E GY+I ++S+ P   +
Sbjct  348  EVLQQ------SATDSTTPQGNMAGHGIGIGKDGGF--SRFFEEHGYVIGLMSVIPKTSY  399

Query  346  SFVKPDYLNYLGP-DYFNPIYNDIGYQDVSSARIVFNGNAGATSASE-----PCFNEFRA  399
            S   P + +     DYF P +  IG Q V +  I F  N  A  +       P ++E++ 
Sbjct  400  SQGIPRHFSKSDKFDYFWPQFEHIGEQPVYNKEI-FAKNIDAFDSEAVFGYLPRYSEYKF  458

Query  400  SYDEVLGQLQ  409
            S   V G  +
Sbjct  459  SPSTVHGDFK  468


>gi|444297994|dbj|GAC77842.1| major capsid protein, partial [uncultured marine virus]
Length=183

 Score = 49.3 bits (116),  Expect = 0.001, Method: Compositional matrix adjust.
 Identities = 44/164 (27%), Positives = 72/164 (44%), Gaps = 12/164 (7%)

Query  220  STSAVSMSGVT--TIPQLAIASRLQEYKDLLGAGGNRYSDWLDTFFASKIEHVD--RPKL  275
            ST    +SG T  TI  + IA  +Q   +    GG RY + +   F  + + +   RP+L
Sbjct  25   STGTADLSGATSATINDMRIAVTMQHLLERDARGGTRYREQVLAHFGVQTDDIRLMRPEL  84

Query  276  LFSASQTINVQVIMNQAGPNNFSGLDISGPLGQQGGAIAFNEQLGRRQSYYFSEPGYLID  335
            L + S T+N+  +   A           GP+  +  A A    +GR  +  F+E G ++ 
Sbjct  85   LATGSTTVNLSPVATTAAFG-------VGPVLGELAAFATASAVGRGWTKRFNEHGIVMG  137

Query  336  MLSIR-PVYYWSFVKPDYLNYLGPDYFNPIYNDIGYQDVSSARI  378
            + SIR  + Y   +   +      DY+ P  + +G Q V S  I
Sbjct  138  LCSIRAELSYQQGMARRFARTSRLDYYWPELSHLGEQVVQSREI  181


>gi|649569140|gb|KDS75238.1| capsid family protein, partial [Parabacteroides distasonis str. 
3999B T(B) 6]
Length=390

 Score = 50.8 bits (120),  Expect = 0.001, Method: Compositional matrix adjust.
 Identities = 44/188 (23%), Positives = 82/188 (44%), Gaps = 19/188 (10%)

Query  232  IPQLAIASRLQEYKDLLGAGGNRYSDWLDTFFA--SKIEHVDRPKLLFSASQTINVQVIM  289
            I  +  ++ LQ + +     G+RY + + + F   S    + RP+ L      I+V  ++
Sbjct  150  INDIRTSNALQRWFERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSEVL  209

Query  290  NQAGPNNFSGLDISGP---LGQQGGAIAFNEQLGRRQSYYFSEPGYLIDMLSIRP-VYYW  345
                    S  D + P   +   G +   N    R    YF E GY++ ++SIRP   Y 
Sbjct  210  QT------SSTDSTSPQANMAGHGISAGVNHGFTR----YFEEHGYIMGIMSIRPRTGYQ  259

Query  346  SFVKPDYLNYLGPDYFNPIYNDIGYQDVSSARIVFNGNAGATSAS---EPCFNEFRASYD  402
              V  D+  +   D++ P +  +G Q++ +  +  N +  A   +    P + E++ S +
Sbjct  260  QGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEELYLNESDAANEGTFGYTPRYAEYKYSQN  319

Query  403  EVLGQLQG  410
            EV G  +G
Sbjct  320  EVHGDFRG  327


>gi|492501782|ref|WP_005867318.1| hypothetical protein [Parabacteroides distasonis]
 gi|409230408|gb|EKN23272.1| hypothetical protein HMPREF1059_03257 [Parabacteroides distasonis 
CL09T03C24]
Length=538

 Score = 51.2 bits (121),  Expect = 0.001, Method: Compositional matrix adjust.
 Identities = 44/190 (23%), Positives = 82/190 (43%), Gaps = 19/190 (10%)

Query  230  TTIPQLAIASRLQEYKDLLGAGGNRYSDWLDTFFA--SKIEHVDRPKLLFSASQTINVQV  287
             +I  L  ++ LQ + +     G+RY + + + F   S    + RP+ L      I+V  
Sbjct  296  VSINDLRTSNALQRWFERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSE  355

Query  288  IMNQAGPNNFSGLDISGP---LGQQGGAIAFNEQLGRRQSYYFSEPGYLIDMLSIRP-VY  343
            ++        S  D + P   +   G +   N    R    YF E GY+I ++SIRP   
Sbjct  356  VLQT------SATDSTSPQANMAGHGISAGVNHGFKR----YFEEHGYIIGIMSIRPRTG  405

Query  344  YWSFVKPDYLNYLGPDYFNPIYNDIGYQDVSSARIVFNGNAGATSAS---EPCFNEFRAS  400
            Y   V  D+  +   D++ P +  +G Q++ +  +       + + +    P + E++ S
Sbjct  406  YQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEEVYLQQTPASNNGTFGYTPRYAEYKYS  465

Query  401  YDEVLGQLQG  410
             +EV G  +G
Sbjct  466  MNEVHGDFRG  475


>gi|649555287|gb|KDS61824.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 4]
 gi|649560568|gb|KDS66876.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 4]
 gi|649561020|gb|KDS67307.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 4]
 gi|649562724|gb|KDS68908.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 6]
Length=541

 Score = 50.8 bits (120),  Expect = 0.002, Method: Compositional matrix adjust.
 Identities = 44/188 (23%), Positives = 82/188 (44%), Gaps = 19/188 (10%)

Query  232  IPQLAIASRLQEYKDLLGAGGNRYSDWLDTFFA--SKIEHVDRPKLLFSASQTINVQVIM  289
            I  +  ++ LQ + +     G+RY + + + F   S    + RP+ L      I+V  ++
Sbjct  301  INDIRTSNALQRWFERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSEVL  360

Query  290  NQAGPNNFSGLDISGP---LGQQGGAIAFNEQLGRRQSYYFSEPGYLIDMLSIRP-VYYW  345
                    S  D + P   +   G +   N    R    YF E GY++ ++SIRP   Y 
Sbjct  361  QT------SSTDSTSPQANMAGHGISAGVNHGFTR----YFEEHGYIMGIMSIRPRTGYQ  410

Query  346  SFVKPDYLNYLGPDYFNPIYNDIGYQDVSSARIVFNGNAGATSAS---EPCFNEFRASYD  402
              V  D+  +   D++ P +  +G Q++ +  +  N +  A   +    P + E++ S +
Sbjct  411  QGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEELYLNESDAANEGTFGYTPRYAEYKYSQN  470

Query  403  EVLGQLQG  410
            EV G  +G
Sbjct  471  EVHGDFRG  478



Lambda      K        H        a         alpha
   0.318    0.133    0.403    0.792     4.96 

Gapped
Lambda      K        H        a         alpha    sigma
   0.267   0.0410    0.140     1.90     42.6     43.6 

Effective search space used: 3479077307112
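
Note on the scores: each "Score = X bits (S)" line above pairs a bit score with a raw score S. The two can be related through the gapped Lambda and K printed in the statistics block, using the standard Karlin-Altschul conversion S' = (Lambda * S - ln K) / ln 2. The following is a minimal sketch of that check, not part of the BLAST output. The function name `bit_score` and the constant names are hypothetical. The parameters are the rounded values printed in this report, so small deviations from the reported bits are expected. E-values are not recomputed, since these alignments use compositional matrix adjustment, which the sketch does not model.

```python
import math

# Gapped Karlin-Altschul parameters copied from the statistics block of this report.
# They are printed to three significant figures, so results are approximate.
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410


def bit_score(raw_score, lam=GAPPED_LAMBDA, k=GAPPED_K):
    """Convert a raw alignment score to bits: S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


# (raw score, reported bits) pairs read from the "Score = ..." lines of this report.
reported = [(134, 56.2), (133, 55.8), (127, 53.5), (123, 52.0), (116, 49.3)]

for raw, bits in reported:
    print(f"raw={raw:4d}  computed={bit_score(raw):6.2f}  reported={bits:5.1f}")
```

If the computed and reported columns agree to roughly a tenth of a bit, the bit scores in this report are consistent with the gapped parameters rather than the ungapped ones.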