bitscore colors: <40, 40-50, 50-80, 80-200, >200




           BLASTP 2.2.30+


Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.


Reference for composition-based statistics: Alejandro A. Schaffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.



Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
           49,011,213 sequences; 17,563,301,199 total letters





Query= Contig-4_CDS_annotation_glimmer3.pl_2_2

Length=598
                                                                      Score     E
Sequences producing significant alignments:                          (Bits)  Value

gi|496050829|ref|WP_008775336.1|  hypothetical protein                  401   2e-128
gi|490418709|ref|WP_004291032.1|  hypothetical protein                  400   5e-128
gi|575094354|emb|CDL65742.1|  unnamed protein product                   394   3e-125
gi|547226430|ref|WP_021963493.1|  putative uncharacterized protein      390   2e-124
gi|494822885|ref|WP_007558293.1|  hypothetical protein                  351   1e-108
gi|575094321|emb|CDL65708.1|  unnamed protein product                   241   7e-67
gi|494308783|ref|WP_007173938.1|  hypothetical protein                  200   8e-53
gi|496521299|ref|WP_009229582.1|  capsid protein                        199   1e-52
gi|494306153|ref|WP_007173049.1|  hypothetical protein                  173   1e-43
gi|517172762|ref|WP_018361580.1|  hypothetical protein                  157   4e-38


>gi|496050829|ref|WP_008775336.1| hypothetical protein [Bacteroides sp. 2_2_4]
 gi|229448893|gb|EEO54684.1| putative capsid protein (F protein) [Bacteroides sp. 2_2_4]
Length=580

 Score =   401 bits (1030),  Expect = 2e-128, Method: Compositional matrix adjust.
 Identities = 239/620 (39%), Positives = 352/620 (57%), Gaps = 65/620 (10%)

Query  2    SLFSLKDIRNHPRRSAFDLSSKVAFSAKSGELLPIKWYFTMPGDKFTLKRQHFTRTQPVN  61
            ++ SLK +RN   R+ FDLSSK  F+AK GELLP+K +  +PGDK+++  + FTRTQP+N
Sbjct  3    NIMSLKSLRNKTSRNGFDLSSKRNFTAKPGELLPVKCWEVLPGDKWSIDLKSFTRTQPLN  62

Query  62   TSAYTRIREYYDWFWVPLHLLWRHAPEVISQMQSNVQHAGS--QTSSLTLGNYLPTISSS  119
            T+A+ R+REYYD+++VP +LLW  A  V++QM  N QHA S   +++  L   +P ++  
Sbjct  63   TAAFARMREYYDFYFVPYNLLWNKANTVLTQMYDNPQHATSYIPSANQALAGVMPNVTCK  122

Query  120  QLSA--------VCSRLFGKKNYFGYDRSDLSYKLMQYLRVGNSGQVSVNFGTSLPASDT  171
             ++         V +    +KNYFGY RS  + KL++YL  GN       F T   + + 
Sbjct  123  GIADYLNLVAPDVTTTNSYEKNYFGYSRSLGTAKLLEYLGYGN-------FYTYATSKNN  175

Query  172  SYTQA-YRFNLDLSLFPFLAYKKFCQDYFRYSQWQDSSPYLWNIDYFTGvsshlfsslpv  230
            ++T++    NL L+++  LAY+K   D+ R SQW+  SP  +N+DY +G      +   +
Sbjct  176  TWTKSPLSSNLQLNIYGVLAYQKIYADHIRDSQWEKVSPSCFNVDYLSGTVDSAMTIDSM  235

Query  231  ssD----PYWNNNTLFDLEYCNWNKDMFMGVFPDTQFGDVATIGITSDSPESSLQLKAWa  286
             +     P++N   +FDL YCNW KD+F GV P  Q+GD A + +   +  S+       
Sbjct  236  ITGQGFAPFYN---MFDLRYCNWQKDLFHGVLPRQQYGDTAAVNVNLSNVLSA-------  285

Query  287  sgspsskapvvvgaaasspNFTIRAESGN---MNPANILGVDTSSLSLAGSFDVLALRRG  343
                                + ++   G+    +P +  GV+  +++ +G+F VLALR+ 
Sbjct  286  -------------------QYMVQTPDGDPVGGSPFSSTGVNLQTVNGSGTFTVLALRQA  326

Query  344  EALQRWKEISLNVPQNYRAQIKAHFGVDVGENMSGMSTYVGGDSSSLDISEVVNTNLQSG  403
            E LQ+WKEI+ +  ++Y+ QI+ H+ V VGE  S MS Y+GG ++SLDI+EVVN N+   
Sbjct  327  EFLQKWKEITQSGNKDYKDQIEKHWNVSVGEAYSEMSLYLGGTTASLDINEVVNNNITGS  386

Query  404  DVASEAVIAGKGVGSSQGSEKFEARD-WGVLMCIYHNVPLLDYVSSAPDPQFFVTQNTDL  462
            + A    IAGKGV    G   F+A + +G++MCIYH++PLLDY +   +P F    +TD 
Sbjct  387  NAAD---IAGKGVVVGNGRISFDAGERYGLIMCIYHSLPLLDYTTDLVNPAFTKINSTDF  443

Query  463  PIPELDSIGMQSVPLAMYTNSDGELVSGFVSPDYTMGYLPRYFSWKTSYDYVLGAFTTTE  522
             IPE D +GM+SVPL    N    L S +      +GY PRY S+KT  D  +GAF TT 
Sbjct  444  AIPEFDRVGMESVPLVSLMN---PLQSSYNVGSSILGYAPRYISYKTDVDSSVGAFKTTL  500

Query  523  KEWVAPISSALWKNMLSTITVRNPQ----FTYNFFKVNPSVLDSIFQVNADSKWDTDPFL  578
            K WV    +    N L+     N        Y  FKVNP+ +D +F V A +  DTD FL
Sbjct  501  KSWVMSYDNQSVINQLNYQDDPNNSPGTLVNYTNFKVNPNCVDPLFAVAASNSIDTDQFL  560

Query  579  INCAFDVKVVRNLDYSGMPY  598
             +  FDVKVVRNLD  G+PY
Sbjct  561  CSSFFDVKVVRNLDTDGLPY  580


>gi|490418709|ref|WP_004291032.1| hypothetical protein [Bacteroides eggerthii]
 gi|217986636|gb|EEC52970.1| putative capsid protein (F protein) [Bacteroides eggerthii DSM 
20697]
Length=578

 Score =   400 bits (1027),  Expect = 5e-128, Method: Compositional matrix adjust.
 Identities = 245/624 (39%), Positives = 337/624 (54%), Gaps = 75/624 (12%)

Query  2    SLFSLKDIRNHPRRSAFDLSSKVAFSAKSGELLPIKWYFTMPGDKFTLKRQHFTRTQPVN  61
            ++ SLK IRN P R+ FDLS K  F+AK+GELLP+     +PGD F +  + FTRTQPVN
Sbjct  3    NIMSLKSIRNKPSRNGFDLSFKKNFTAKAGELLPVMVKEVLPGDTFKINLKAFTRTQPVN  62

Query  62   TSAYTRIREYYDWFWVPLHLLWRHAPEVISQMQSNVQHAGS--QTSSLTLGNYLPTISSS  119
            T+A+ RIREYYD+F+VP  LLW  A  V++QM  N QHA S   T +  L   +P ++S 
Sbjct  63   TAAFARIREYYDFFFVPYDLLWNKANTVLTQMYDNPQHAVSIDPTRNFVLSGEMPYMTSE  122

Query  120  QLSAVCSRLFG-------KKNYFGYDRSDLSYKLMQYLRVGNSGQVSVNFGTSLPASDTS  172
             +++  + L         K NYFGY+RS  S KL++YL  GN              +D  
Sbjct  123  AIASYINALSTASALADYKSNYFGYNRSKSSVKLLEYLGYGNYESF---------LTDDW  173

Query  173  YTQAYRFNLDLSLFPFLAYKKFCQDYFRYSQWQDSSPYLWNIDYFTGvsshlfsslpvss  232
             T     NL+ ++F  LAY+K   D++R SQW+  SP  +N+DY  G S +L ++     
Sbjct  174  NTAPLMANLNHNIFGLLAYQKIYSDFYRDSQWERVSPSTFNVDYLDGSSMNLDNAYSTE-  232

Query  233  DPYWNNNTLFDLEYCNWNKDMFMGVFPDTQFGDVATIGITSDSPESSLQLKAWasgspss  292
              ++ N   FDL YCNW KD+F GV P  Q+G+ A   IT D     L L          
Sbjct  233  --FYQNYNFFDLRYCNWQKDLFHGVLPHQQYGETAVASITPDV-TGKLTLS---------  280

Query  293  kapvvvgaaasspNFTIRAESGNMNPANILGVDTSSL---SLAGSFDVLALRRGEALQRW  349
                         NF+    S    P    G  T +L      G   +L LR+ E LQ+W
Sbjct  281  -------------NFSTVGTS----PTTASGTATKNLPAFDTVGDLSILVLRQAEFLQKW  323

Query  350  KEISLNVPQNYRAQIKAHFGVDVGENMSGMSTYVGGDSSSLDISEVVNTNLQSGDVASEA  409
            KEI+ +  ++Y+ Q++ H+GV VG+  S + TY+GG SSS+DI+EV+NTN+ +G  A++ 
Sbjct  324  KEITQSGNKDYKDQLEKHWGVSVGDGFSELCTYLGGVSSSIDINEVINTNI-TGSAAAD-  381

Query  410  VIAGKGVGSSQGSEKFEARD-WGVLMCIYHNVPLLDYVSSAPDPQFFVTQNTDLPIPELD  468
             IAGKGVG + G   F +   +G++MCIYH +PLLDY +   DP F    +TD  IPE D
Sbjct  382  -IAGKGVGVANGEINFNSNGRYGLIMCIYHCLPLLDYTTDMLDPAFLKVNSTDYAIPEFD  440

Query  469  SIGMQSVPLAMYTNSDGELVSGFVSPDYTMGYLPRYFSWKTSYDYVLGAFTTTEKEWVAP  528
             +GMQS+PL    N    L S   +    +GY+PRY  +KTS D  +G F  T   WV  
Sbjct  441  RVGMQSMPLVQLMN---PLRSFANASGLVLGYVPRYIDYKTSVDQSVGGFKRTLNSWVIS  497

Query  529  ISSALWKNMLSTITVRN--------------PQFTYNFFKVNPSVLDSIFQVNADSKWDT  574
              +    ++L  +T+ N                  + FFKVNP  LD IF V A    +T
Sbjct  498  YGNI---SVLKQVTLPNDAPPIEPSEPVPSVAPMNFTFFKVNPDCLDPIFAVQAGDDTNT  554

Query  575  DPFLINCAFDVKVVRNLDYSGMPY  598
            D FL +  FD+K VRNLD  G+PY
Sbjct  555  DQFLCSSFFDIKAVRNLDTDGLPY  578


>gi|575094354|emb|CDL65742.1| unnamed protein product [uncultured bacterium]
Length=615

 Score =   394 bits (1012),  Expect = 3e-125, Method: Compositional matrix adjust.
 Identities = 243/638 (38%), Positives = 351/638 (55%), Gaps = 68/638 (11%)

Query  5    SLKDIRNHPRRSAFDLSSKVAFSAKSGELLPIKWYFTMPGDKFTLKRQHFTRTQPVNTSA  64
            S+ DI+N P R+ FDLS K  F+AK+GELLP+     +PGD F +  + FTRTQP+NTSA
Sbjct  2    SMADIKNRPSRNGFDLSFKKNFTAKAGELLPVMTKVVLPGDSFNINLRSFTRTQPLNTSA  61

Query  65   YTRIREYYDWFWVPLHLLWRHAPEVISQMQSNVQHAGSQT--SSLTLGNYLPTISSSQLS  122
            + R+REYYD+++VP   +W      I+QM +NVQHA   T   +  L   +P  +S Q++
Sbjct  62   FARMREYYDFYFVPFEQMWNKFDSCITQMNANVQHASGPTLDDNTPLSGRMPYFTSEQIA  121

Query  123  AVCS--RLFGKKNYFGYDRSDLSYKLMQYLRVGNSGQVSVNFGTSLPASDTSYTQAYRFN  180
               +      +KN FG++RS L+ KL+QYL  G+       + +    ++T   +   +N
Sbjct  122  DYLNDQATAARKNPFGFNRSTLTCKLLQYLGYGD-------YNSFDSETNTWSAKPLLYN  174

Query  181  LDLSLFPFLAYKKFCQDYFRYSQWQDSSPYLWNIDYFTGvsshlfsslpvssDPYWNNNT  240
            L+LS FP LAY+K   D++RY+QW+ ++P  +N+DY  G S        + SD    +N 
Sbjct  175  LELSPFPLLAYQKIYSDFYRYTQWEKTNPSTFNLDYIKGTSDLQMDLTGLPSD----DNN  230

Query  241  LFDLEYCNWNKDMFMGVFPDTQFGDVATIGITSDSP-ESSLQLKAWasgspsskapvvvg  299
             FD+ YCN+ KDMF GV P      VA  G  S  P    L + +     P  K      
Sbjct  231  FFDIRYCNYQKDMFHGVLP------VAQYGSASVVPINGQLNVISNGDSGPIFKTSTPDP  284

Query  300  aaasspNFTIRAESGNMNPANILGVDTSSLSLAGSFD-----------------------  336
                +   T+    G  N +   GV  S+L++  S D                       
Sbjct  285  GTPGTSYVTVGGNIGVDNRS--FGVSGSTLNVGKSADPSGYGFPSNASTRSLLWENPNLI  342

Query  337  ----------VLALRRGEALQRWKEISLNVPQNYRAQIKAHFGVDVGENMSGMSTYVGGD  386
                      +LALR+ E LQ+WKE+S++  ++Y++QI+ H+G+ V + +S  + Y+GG 
Sbjct  343  IENNQGFYVPILALRQAEFLQKWKEVSVSGEEDYKSQIEKHWGIKVSDFLSHQARYLGGC  402

Query  387  SSSLDISEVVNTNLQSGDVASEAVIAGKGVGSSQGSEKFEAR-DWGVLMCIYHNVPLLDY  445
            ++SLDI+EV+N N+ +GD A++  IAGKG  +  GS +FE++ ++G++MCIYH +P++DY
Sbjct  403  ATSLDINEVINNNI-TGDNAAD--IAGKGTFTGNGSIRFESKGEYGIIMCIYHVLPIVDY  459

Query  446  VSSAPDPQFFVTQNTDLPIPELDSIGMQSVPLAMYTNSDGELVSGFVSPDYTMGYLPRYF  505
            V S  D    +   T  PIPELD IGM+SVPL    N   E  S   S D  +GY PRY 
Sbjct  460  VGSGVDHSCTLVDATSFPIPELDQIGMESVPLVRAMNPVKE--SDTPSADTFLGYAPRYI  517

Query  506  SWKTSYDYVLGAFTTTEKEWVAPI-----SSALWKNMLSTITVRNPQFTYNFFKVNPSVL  560
             WKTS D  +G F  + + W  P+     +SA   N  S   V        FFKVNPS++
Sbjct  518  DWKTSVDRSVGDFADSLRTWCLPVGDKELTSANSLNFPSNPNVEPDSIAAGFFKVNPSIV  577

Query  561  DSIFQVNADSKWDTDPFLINCAFDVKVVRNLDYSGMPY  598
            D +F V ADS   TD FL +  FDVKVVRNLD +G+PY
Sbjct  578  DPLFAVVADSTVKTDEFLCSSFFDVKVVRNLDVNGLPY  615


>gi|547226430|ref|WP_021963493.1| putative uncharacterized protein [Prevotella sp. CAG:1185]
 gi|524103382|emb|CCY83994.1| putative uncharacterized protein [Prevotella sp. CAG:1185]
Length=573

 Score =   390 bits (1002),  Expect = 2e-124, Method: Compositional matrix adjust.
 Identities = 243/617 (39%), Positives = 345/617 (56%), Gaps = 66/617 (11%)

Query  2    SLFSLKDIRNHPRRSAFDLSSKVAFSAKSGELLPIKWYFTMPGDKFTLKRQHFTRTQPVN  61
            S+ SL  ++N  +R+ FDLS K AF+AK GELLPI      PGDKF ++ Q FTRTQPVN
Sbjct  3    SVMSLTALKNSVKRNGFDLSFKNAFTAKVGELLPIMCKEVYPGDKFNIRGQAFTRTQPVN  62

Query  62   TSAYTRIREYYDWFWVPLHLLWRHAPEVISQMQSNVQHAGSQTSSLTLG-----------  110
            ++AY+R+REYYD+++VP  LLW  AP   + M  +  HA    SS+ L            
Sbjct  63   SAAYSRLREYYDFYFVPYRLLWNMAPTFFTNM-PDPHHAADLVSSVNLSQRHPWFTFFDI  121

Query  111  -NYLPTISSSQLSAVCSRLFGKKNYFGYDRSDLSYKLMQYLRVGNSGQVSVNFGTSLPAS  169
              YL  ++S  LS    +   +KN+FG+ R +LS KL+ YL  G       ++ +    S
Sbjct  122  MEYLGNLNS--LSGAYEKY--QKNFFGFSRVELSVKLLNYLNYG----FGKDYESVKVPS  173

Query  170  DTSYTQAYRFNLDLSLFPFLAYKKFCQDYFRYSQWQDSSPYLWNIDYFTGvsshlfsslp  229
            D+        ++ LS FP LAY+K C+DYFR  QWQ ++PY +N+DY  G SS     + 
Sbjct  174  DSD-------DIVLSPFPLLAYQKICEDYFRDDQWQSAAPYRYNLDYLYGKSSGFHIPMS  226

Query  230  vssDPYWNNNTLFDLEYCNWNKDMFMGVFPDTQFGDVAT----IGITSDSPESSLQLKAW  285
              ++  + N T+FDL YCN+ KD F G+ P  Q+GDV+      G       SSL     
Sbjct  227  SFTNDAFKNPTMFDLNYCNFQKDYFTGMLPRAQYGDVSVASPIFGDLDIGDSSSLTFA--  284

Query  286  asgspsskapvvvgaaasspNFTIRAESGNMNPANILGVDTSSLSLAGSFDVLALRRGEA  345
                                  +   +  N   + +L V+ +S + AG   VLALR+ E 
Sbjct  285  ----------------------SAPQQGANTIQSGVLVVNNNSNTTAG-LSVLALRQAEC  321

Query  346  LQRWKEISLNVPQNYRAQIKAHFGVDVGENMSGMSTYVGGDSSSLDISEVVNTNLQSGDV  405
            LQ+W+EI+ +   +Y+ Q++ HF V     +SG   Y+GG +S+LDISEVVNTNL +GD 
Sbjct  322  LQKWREIAQSGKMDYQTQMQKHFNVSPSATLSGHCKYLGGWTSNLDISEVVNTNL-TGD-  379

Query  406  ASEAVIAGKGVGSSQGSE-KFEARDWGVLMCIYHNVPLLDYVSSAPDPQFFVTQNTDLPI  464
             ++A I GKG G+  G++  FE+ + G++MCIYH +PLLD+  +    Q F T  TD  I
Sbjct  380  -NQADIQGKGTGTLNGNKVDFESSEHGIIMCIYHCLPLLDWSINRIARQNFKTTFTDYAI  438

Query  465  PELDSIGMQSVPLAMYTNSDGELVSGFVSPDYTMGYLPRYFSWKTSYDYVLGAFTTTEKE  524
            PE DS+GMQ +  +       +L S   S +  MGY+PRY   KTS D + G+F  T   
Sbjct  439  PEFDSVGMQQLYPSEMIFGLEDLPSDPSSIN--MGYVPRYADLKTSIDEIHGSFIDTLVS  496

Query  525  WVAPISSA---LWKNMLSTITVRNPQFTYNFFKVNPSVLDSIFQVNADSKWDTDPFLINC  581
            WV+P++ +    ++         +   TYNFFKVNP ++D+IF V ADS  +TD  LIN 
Sbjct  497  WVSPLTDSYISAYRQACKDAGFSDITMTYNFFKVNPHIVDNIFGVKADSTINTDQLLINS  556

Query  582  AFDVKVVRNLDYSGMPY  598
             FD+K VRN DY+G+PY
Sbjct  557  YFDIKAVRNFDYNGLPY  573


>gi|494822885|ref|WP_007558293.1| hypothetical protein [Bacteroides plebeius]
 gi|198272099|gb|EDY96368.1| putative capsid protein (F protein) [Bacteroides plebeius DSM 
17135]
Length=613

 Score =   351 bits (900),  Expect = 1e-108, Method: Compositional matrix adjust.
 Identities = 220/621 (35%), Positives = 338/621 (54%), Gaps = 41/621 (7%)

Query  2    SLFSLKDIRNHPRRSAFDLSSKVAFSAKSGELLPIKWYFTMPGDKFTLKRQHFTRTQPVN  61
            ++ S+K +RN P R+ +DL+ K+ F+AK+G L+P+ W   +P D      + F RTQP+N
Sbjct  10   NIMSMKSVRNKPTRAGYDLTQKINFTAKAGSLIPVWWTPVLPFDDLNATVKSFVRTQPLN  69

Query  62   TSAYTRIREYYDWFWVPLHLLWRHAPEVISQMQSNVQHAGSQT--SSLTLGNYLPTISSS  119
            T+A+ R+R Y+D+++VP   +W   P  I+QM++N+ HA       ++ L + LP  ++ 
Sbjct  70   TAAFARMRGYFDFYFVPFRQMWNKFPTAITQMRTNLLHASGPVLADNVPLSDELPYFTAE  129

Query  120  QLSAVCSRLFGKKNYFGYDRSDLSYKLMQYLRVGNSGQVSVNFGTSLPASDTSYTQAYRF  179
            Q++     L   KN FGY R+ L   +++YL  G+     V       A  T  T+    
Sbjct  130  QVADYIVSLADSKNQFGYYRAWLVCIILEYLGYGDFYPYIVEAAGGEGA--TWATRPMLN  187

Query  180  NLDLSLFPFLAYKKFCQDYFRYSQWQDSSPYLWNIDYFTGvsshlfsslpvssDPYWNNN  239
            NL  S FP  AY+K   D+ RY+QW+ S+P  +NIDY +G  S     L  + + + ++ 
Sbjct  188  NLKFSPFPLFAYQKIYADFNRYTQWERSNPSTFNIDYISG--SADSLQLDFTVEGFKDSF  245

Query  240  TLFDLEYCNWNKDMFMGVFPDTQFGDVATIGITSDSPESSLQLKAWasgspsskapvvvg  299
             LFD+ Y NW +D+  G  P  Q+G+ + + ++         ++     +P +      G
Sbjct  246  NLFDMRYSNWQRDLLHGTIPQAQYGEASAVPVSG-------SMQVVEGPTPPAFTTGQDG  298

Query  300  aaasspNFTIRAESGNMNPANILGVD--------TSSLSLAG--SFDV--LALRRGEALQ  347
             A  + N TI+  SG +     +G           S L + G  SF V  LALRR EA Q
Sbjct  299  VAFLNGNVTIQGSSGYLQAQTSVGESRILRFNNTNSGLIVEGDSSFGVSILALRRAEAAQ  358

Query  348  RWKEISLNVPQNYRAQIKAHFGVDVGENMSGMSTYVGGDSSSLDISEVVNTNLQSGDVAS  407
            +WKE++L   ++Y +QI+AH+G  V +  S M  ++G  +  L I+EVVN N+ +G+ A+
Sbjct  359  KWKEVALASEEDYPSQIEAHWGQSVNKAYSDMCQWLGSINIDLSINEVVNNNI-TGENAA  417

Query  408  EAVIAGKGVGSSQGSEKFE-ARDWGVLMCIYHNVPLLDYVSSAPDPQFFVTQNTDLPIPE  466
            +  IAGKG  S  GS  F     +G++MC++H +P LDY++SAP     +T   D PIPE
Sbjct  418  D--IAGKGTMSGNGSINFNVGGQYGIVMCVFHVLPQLDYITSAPHFGTTLTNVLDFPIPE  475

Query  467  LDSIGMQSVPLAMYTN----SDGELVSGFVSPDYTMGYLPRYFSWKTSYDYVLGAFTTTE  522
             D IGM+ VP+    N     DG+     VSP+   GY P+Y++WKT+ D  +G F  + 
Sbjct  476  FDKIGMEQVPVIRGLNPVKPKDGDFK---VSPNLYFGYAPQYYNWKTTLDKSMGEFRRSL  532

Query  523  KEWVAPISSALWKNMLSTITVRNPQFTYN-----FFKVNPSVLDSIFQVNADSKWDTDPF  577
            K W+ P          S     NP    +     FFKV+PSVLD++F V A+S  +TD F
Sbjct  533  KTWIIPFDDEALLAADSVDFPDNPNVEADSVKAGFFKVSPSVLDNLFAVKANSDLNTDQF  592

Query  578  LINCAFDVKVVRNLDYSGMPY  598
            L +  FDV VVR+LD +G+PY
Sbjct  593  LCSTLFDVNVVRSLDPNGLPY  613


>gi|575094321|emb|CDL65708.1| unnamed protein product [uncultured bacterium]
Length=642

 Score =   241 bits (614),  Expect = 7e-67, Method: Compositional matrix adjust.
 Identities = 198/645 (31%), Positives = 294/645 (46%), Gaps = 58/645 (9%)

Query  2    SLFSLKDIRNHPRRSAFDLSSKVAFSAKSGELLPIKWYFTMPGDKFTLKRQHFTRTQPVN  61
            ++  L  ++N P R++FDLS +  F+AK GELLP       PGD   +   +FTRT P+ 
Sbjct  6    NIMGLHGLKNKPSRNSFDLSHRNMFTAKVGELLPCFVQELNPGDSVKVSSSYFTRTAPLQ  65

Query  62   TSAYTRIREYYDWFWVPLHLLWRHAPEVISQMQSNVQHAG-SQTSSLTLGNY-----LPT  115
            ++A+TR+RE   +F+VP   LW++    +  M  N      S+ +S  +GN      +P 
Sbjct  66   SNAFTRLRENVQYFFVPYSALWKYFDSQVLNMTKNANGGDISRIASSLVGNQKVTTQMPC  125

Query  116  ISSSQLSAVCSRLFGKKNYFGYD------------RSDLSYKLMQYLRVGNSGQVSVNFG  163
            ++   L A   + F  ++  G D            R   S KL+Q L  GN  +   NF 
Sbjct  126  VNYKTLHAYLLK-FINRSTVGSDGSVGPEFNRGCYRHAESAKLLQLLGYGNFPEQFANFK  184

Query  164  TSLPASDTSYTQ----AYRFNLDLSLFPFLAYKKFCQDYFRYSQWQDSSPYLWNIDYFTG  219
             +    + S        Y  +  LS+F  LAY K C D++ Y QWQ  +  L N+DY T 
Sbjct  185  VNNDKHNQSGQNFKDVTYNNSPYLSIFRLLAYHKICNDHYLYRQWQPYNASLCNVDYLTP  244

Query  220  vsshlfsslpvssDP-----YWNNNTLFDLEYCNWNKDMFMGVFPDTQFGDVATIGITSD  274
             SS L S                   L D+ + N   D F GV P +QFG  + + +   
Sbjct  245  NSSSLLSIDDALLSIPDDSIKAEKLNLLDMRFSNLPLDYFTGVLPTSQFGSESVVNLNLG  304

Query  275  SPESSLQLKAWasgspsskapvvvgaaasspNFTIRAESGNMNPANILGVDTS-------  327
            +   S  L    S                       + +GN+   N  G   S       
Sbjct  305  NASGSAVLNGTTSKDSGRWRTTTGEWEMEQR--VASSANGNLKLDNSNGTFISHDHTFSG  362

Query  328  ----SLSLAGSFDVLALRRGEALQRWKEISLNVPQNYRAQIKAHFGVDVGENMSGMSTYV  383
                + SL+G+  ++ALR   A Q++KEI L    ++++Q++AHFG+   E     S ++
Sbjct  363  NVAINTSLSGNLSIIALRNALAAQKYKEIQLANDVDFQSQVEAHFGIKPDEKNEN-SLFI  421

Query  384  GGDSSSLDISEVVNTNLQSGDVASEAVIAGKGVGSSQGSEKFEARDWGVLMCIYHNVPLL  443
            GG SS ++I+E +N NL SGD  +    A +G GS+  S KF A+ +GV++ IY   P+L
Sbjct  422  GGSSSMININEQINQNL-SGDNKATYGAAPQGNGSA--SIKFTAKTYGVVIGIYRCTPVL  478

Query  444  DYVSSAPDPQFFVTQNTDLPIPELDSIGMQ-------SVPLAMYTNSDGELVSGFVSPDY  496
            D+     D   F T  +D  IPE+DSIGMQ       + P           V    SPD 
Sbjct  479  DFAHLGIDRTLFKTDASDFVIPEMDSIGMQQTFRCEVAAPAPYNDEFKAFRVGDGSSPDM  538

Query  497  --TMGYLPRYFSWKTSYDYVLGAFTTTEKEWVAPIS-SALWKNMLSTITVRNPQFTYNFF  553
              T GY PRY  +KTSYD   GAF  + K WV  I+  A+  N+ +T    N     N F
Sbjct  539  SETYGYAPRYSEFKTSYDRYNGAFCHSLKSWVTGINFDAIQNNVWNTWAGINAP---NMF  595

Query  554  KVNPSVLDSIFQVNADSKWDTDPFLINCAFDVKVVRNLDYSGMPY  598
               P ++ ++F V++ +  D D   +         RNL   G+PY
Sbjct  596  ACRPDIVKNLFLVSSTNNSDDDQLYVGMVNMCYATRNLSRYGLPY  640


>gi|494308783|ref|WP_007173938.1| hypothetical protein [Prevotella bergensis]
 gi|270333035|gb|EFA43821.1| putative capsid protein (F protein) [Prevotella bergensis DSM 
17361]
Length=553

 Score =   200 bits (508),  Expect = 8e-53, Method: Compositional matrix adjust.
 Identities = 174/619 (28%), Positives = 269/619 (43%), Gaps = 92/619 (15%)

Query  1    MSLFSLKDIRNHPRRSAFDLSSKVAFSAKSGELLPIKWYFTMPGDKFTLKRQHFTRTQPV  60
            +S+  +K  R +  R+AFDLS +  F+A +G LLP+     +P D   +  Q F RT P+
Sbjct  3    VSIPKIKATRPNRNRNAFDLSQRHLFTAHAGMLLPVLNLDLIPHDHVEINAQDFMRTLPM  62

Query  61   NTSAYTRIREYYDWFWVPLHLLWRHAPEVISQMQSNVQHAGSQTSSLTLGNYLPTISSSQ  120
            NT+A+  +R  Y++F+VP H LW    + I+ M  N  H+ S   S+  G     +    
Sbjct  63   NTAAFASMRGVYEFFFVPYHQLWAQFDQFITGM--NDFHS-SANKSIQGGTSPLQVPYFN  119

Query  121  LSAVCSRLFGKKNYFGYDRSDLSYKL----MQYLRVGNSGQVSVNFGTSLPASDTSYTQA  176
            + +V + L   K        DL YK      + L +   G+   +FGT+ P + +     
Sbjct  120  VDSVFNSLNTGKESGSGSTDDLQYKFKYGAFRLLDLLGYGRKFDSFGTAYPDNVSGLKNN  179

Query  177  YRFNLDLSLFPFLAYKKFCQDYFRYSQWQDSSPYLWNIDYFTGvsshlfsslpvssDPYW  236
              +N   S+F  LAY K  QDY+R S +++     +N D F G                 
Sbjct  180  LDYNC--SVFRILAYNKIYQDYYRNSNYENFDTDSFNFDKFKGGLVDAKVVA--------  229

Query  237  NNNTLFDLEYCNWNKDMFMGVFPD------TQFGDVATIGITSDSPESSLQLKAWasgsp  290
                LF L Y N   D F  +         T F DV  I I   +P              
Sbjct  230  ---DLFKLRYRNAQTDYFTNLRQSQLFSFTTAFEDVDNINI---APRD------------  271

Query  291  sskapvvvgaaasspNFTIRAESGNMNPANILGVDTSSLSLAGSFDVLALRRGEALQRWK  350
                              ++++  N    N  GVDT S    G F V +LR   A+ +  
Sbjct  272  -----------------YVKSDGSNFTRVN-FGVDTDSSE--GDFSVSSLRAAFAVDKLL  311

Query  351  EISLNVPQNYRAQIKAHFGVDVGENMSGMSTYVGGDSSSLDISEVVNTNLQSGDVASE--  408
             +++   + ++ Q++AH+GV++ ++  G   Y+GG  S + +S+V  T   SG  A+E  
Sbjct  312  SVTMRAGKTFQDQMRAHYGVEIPDSRDGRVNYLGGFDSDMQVSDVTQT---SGTTATEYK  368

Query  409  ------AVIAGKGVGSSQGSEKFEARDWGVLMCIYHNVPLLDYVSSAPDPQFFVTQNTDL  462
                    +AGKG GS +G   F+A++ GVLMCIY  VP + Y  +  DP        D 
Sbjct  369  PEAGYLGRVAGKGTGSGRGRIVFDAKEHGVLMCIYSLVPQIQYDCTRLDPMVDKLDRFDY  428

Query  463  PIPELDSIGMQSVPLAMYTNSDGELVSGFVSPD---YTMGYLPRYFSWKTSYDYVLGAFT  519
              PE +++GMQ  PL      +   +S F + D     +GY PRY  +KT+ D   G F 
Sbjct  429  FTPEFENLGMQ--PL------NSSYISSFCTTDPKNPVLGYQPRYSEYKTALDVNHGQFA  480

Query  520  TTEKEWVAPISS-ALWKNMLSTITVRNPQFTYNFFKVNPSVLDSIFQVNADSKWDTDPFL  578
             ++      +S    W           PQ     FK++P  L+SIF V+ +     D   
Sbjct  481  QSDALSSWSVSRFRRWTTF--------PQLEIADFKIDPGCLNSIFPVDYNGTEANDCVY  532

Query  579  INCAFDVKVVRNLDYSGMP  597
              C F++  V ++   GMP
Sbjct  533  GGCNFNIVKVSDMSVDGMP  551


>gi|496521299|ref|WP_009229582.1| capsid protein [Prevotella sp. oral taxon 317]
 gi|288330570|gb|EFC69154.1| putative capsid protein (F protein) [Prevotella sp. oral taxon 
317 str. F0108]
Length=541

 Score =   199 bits (506),  Expect = 1e-52, Method: Compositional matrix adjust.
 Identities = 184/615 (30%), Positives = 269/615 (44%), Gaps = 94/615 (15%)

Query  1    MSLFSLKDIR----NHPRRSAFDLSSKVAFSAKSGELLPIKWYFTMPGDKFTLKRQHFTR  56
            MSL  +  I+    N PR SAFDLS K  ++A +G LLP+     M  D   ++ Q F R
Sbjct  1    MSLKKVPQIKPSRANRPR-SAFDLSQKHLYTAPAGALLPVLSVDLMFHDHIRIQAQDFMR  59

Query  57   TQPVNTSAYTRIREYYDWFWVPLHLLWRHAPEVISQMQSNVQHAGSQTSSLTLGNYLPTI  116
            T P+N++A+  +R  Y++F+VP   LW    + I+ M     +  S  SS      L ++
Sbjct  60   TMPMNSAAFISMRGVYEFFFVPYSQLWHPYDQFITSMN---DYRSSVVSSAAGDKALDSV  116

Query  117  SSSQLSAVCS--RLFGKKNYFGYDRSDLSYKLMQYLRVGNSGQVSVNFGTSLPASDTSYT  174
             + +L+ +    R    K+ FGY  S+ S +LM  L           +G  + +S T   
Sbjct  117  PNVKLADMYKFVRERTDKDIFGYPHSNNSCRLMDLL----------GYGKPITSSKTPVP  166

Query  175  QAYRFNLDLSLFPFLAYKKFCQDYFRYSQWQDSSPYLWNIDYFTGvsshlfsslpvssDP  234
              Y  N++  LF  LAY K   DY+R + ++    Y +NID+  G               
Sbjct  167  LLYTGNVN--LFRLLAYNKIYSDYYRNTTYEGVDVYSFNIDHKKGTFVPTADEF------  218

Query  235  YWNNNTLFDLEYCNWNKDMFMGVFPDTQFGDVATIGITSDSPESSLQLKAWasgspsska  294
                    +L Y N   D +  + P   F    TIG  SDS  S LQL            
Sbjct  219  ----KKYLNLHYRNAPLDFYTNLRPTPLF----TIG--SDSFSSVLQLS-----------  257

Query  295  pvvvgaaasspNFTIRAESGNMNPANILGVDTSSLSLAGSFDVLALRRGEALQRWKEISL  354
                     S  F+    S  +N A+               +V A+R   AL +   IS+
Sbjct  258  -----DPTGSAGFSADGNSAKLNMAS-----------PDVLNVSAIRSAFALDKLLSISM  301

Query  355  NVPQNYRAQIKAHFGVDVGENMSGMSTYVGGDSSSLDISEVVNTNLQSGDVASE------  408
               + Y  QI+AHFGV V E   G   Y+GG  S++ + +V  T+  +    SE      
Sbjct  302  RAGKTYAEQIEAHFGVTVSEGRDGQVYYLGGFDSNVQVGDVTQTSGTTNPNVSEVGNAKL  361

Query  409  ----AVIAGKGVGSSQGSEKFEARDWGVLMCIYHNVPLLDYVSSAPDPQFFVTQNT--DL  462
                  I GKG GS  G  +F+A++ GVLMCIY  VP + Y     DP  FV + T  D 
Sbjct  362  AGYLGKITGKGTGSGYGEIQFDAKEPGVLMCIYSVVPAMQYDCMRLDP--FVAKQTRGDY  419

Query  463  PIPELDSIGMQSVPLAMYTNSDGELVSGFVSPDYTMGYLPRYFSWKTSYDYVLGAFTTTE  522
             IPE +++GMQ +  A         VS   + D + G+ PRY  +KT++D   G F   E
Sbjct  420  FIPEFENLGMQPIVPA--------FVSLNRAKDNSYGWQPRYSEYKTAFDINHGQFANGE  471

Query  523  KEWVAPISSALWKNMLSTITVRNPQFTYNFFKVNPSVLDSIFQVNADSKWDTDPFLINCA  582
                  I+ A   + L+T  V          K+NP  LDS+F VN +    TD       
Sbjct  472  PLSYWSIARARGSDTLNTFNVAA-------LKINPHWLDSVFAVNYNGTEVTDCMFGYAH  524

Query  583  FDVKVVRNLDYSGMP  597
            F+++ V ++   GMP
Sbjct  525  FNIEKVSDMTEDGMP  539


>gi|494306153|ref|WP_007173049.1| hypothetical protein [Prevotella bergensis]
 gi|270333881|gb|EFA44667.1| putative capsid protein (F protein) [Prevotella bergensis DSM 
17361]
Length=519

 Score =   173 bits (438),  Expect = 1e-43, Method: Compositional matrix adjust.
 Identities = 156/580 (27%), Positives = 253/580 (44%), Gaps = 79/580 (14%)

Query  33   LLPIKWYFTMPGDKFTLKRQHFTRTQPVNTSAYTRIREYYDWFWVPLHLLWRHAPEVISQ  92
            LLP+     +P D   +  Q F RT P+NT+A+  +R  Y++F+VP H LW    + I+ 
Sbjct  2    LLPVLNLDLIPHDHVEINAQDFMRTLPMNTAAFASMRGVYEFFFVPYHQLWAQFDQFITG  61

Query  93   MQSNVQHAGSQTSSLTLGNYLPTISSSQLSAVCSRLFGKKNYFGYDRSDLSYKL----MQ  148
            M  N  H+ S   S+  G     +    L +V   +  + +   + + DL Y+      +
Sbjct  62   M--NDFHS-SANKSIQGGTSPLQVPYFNLESVFKNIIERDSTPSF-QDDLQYRFKYGAFR  117

Query  149  YLRVGNSGQVSVNFGTSLPASDTSYTQAYRFNLDLSLFPFLAYKKFCQDYFRYSQWQDSS  208
             L +   G+   +FGT+ P + +       +N   S+F  LAY K  QDY+R S +++  
Sbjct  118  LLDLLGYGRKFDSFGTAYPDNVSGLKNNLDYNC--SVFRVLAYNKIYQDYYRNSNYENFD  175

Query  209  PYLWNIDYFTGvsshlfsslpvssDPYWNNNTLFDLEYCNWNKDMFMGVFPDTQFGDVAT  268
               +N D F G                     LF L Y N   D F  +     F  +  
Sbjct  176  TDSFNFDKFKGGLVDAKVVA-----------DLFKLRYRNAQTDYFTNLRQSQLFTFIPE  224

Query  269  IGITSDSPESSLQLKAWasgspsskapvvvgaaasspNFTIRAESGNMNPANILGVDTSS  328
                SD    +     +A  S S+   +         NF +  ++               
Sbjct  225  F---SDDEHLNFDRDQYADQSKSNFTQL---------NFPVDVDNN--------------  258

Query  329  LSLAGSFDVLALRRGEALQRWKEISLNVPQNYRAQIKAHFGVDVGENMSGMSTYVGGDSS  388
                G F V +LR   A+ +   +++   + ++ Q++AH+GV++ ++  G   Y+GG  S
Sbjct  259  ---LGYFSVSSLRSAFAVDKLLSVTMRAGKTFQDQMRAHYGVEIPDSRDGRVNYLGGFDS  315

Query  389  SLDISEVVNTNLQSGDVASE--------AVIAGKGVGSSQGSEKFEARDWGVLMCIYHNV  440
             L +S+V  T   SG  A+E          IAGKG GS +G   F+A++ GVLMCIY  V
Sbjct  316  DLQVSDVTQT---SGTTATEYKPEAGYLGRIAGKGTGSGRGRIVFDAKEHGVLMCIYSLV  372

Query  441  PLLDYVSSAPDPQFFVTQNTDLPIPELDSIGMQSVPLAMYTNSDGELVSGFVSPD---YT  497
            P + Y  +  DP        D   PE +++GMQ  PL      +   +S F +PD     
Sbjct  373  PQIQYDCTRLDPMVDKLDRFDFFTPEFENLGMQ--PL------NSSYISSFCTPDPKNPV  424

Query  498  MGYLPRYFSWKTSYDYVLGAFTTTEKEWVAPISSALWKNMLSTITVRNPQFTYNFFKVNP  557
            +GY PRY  +KT+ D   G F   + + ++  S + ++   +      PQ     FK++P
Sbjct  425  LGYQPRYSEYKTALDINHGQF--AQNDALSSWSVSRFRRWTTF-----PQLEIADFKIDP  477

Query  558  SVLDSIFQVNADSKWDTDPFLINCAFDVKVVRNLDYSGMP  597
              L+S+F V  +    TD     C F++  V ++   GMP
Sbjct  478  GCLNSVFPVEFNGTESTDCVFGGCNFNIVKVSDMSVDGMP  517


>gi|517172762|ref|WP_018361580.1| hypothetical protein [Prevotella nanceiensis]
Length=568

 Score =   157 bits (398),  Expect = 4e-38, Method: Compositional matrix adjust.
 Identities = 150/598 (25%), Positives = 258/598 (43%), Gaps = 64/598 (11%)

Query  15   RSAFDLSSKVAFSAKSGELLPIKWYFTMPGDKFTLKRQHFTRTQPVNTSAYTRIREYYDW  74
            R+AFD+S +  F+A +G LLP+     +P D   +    F RT P+N++A+  +R  Y++
Sbjct  18   RNAFDISQRHLFTAPAGALLPVLSLDLLPHDHVEINASDFMRTLPMNSAAFMSMRGVYEF  77

Query  75   FWVPLHLLWRHAPEVISQM---QSNVQHAGSQTSSLTLGNYLPTISSSQLSAVC--SRLF  129
            ++VP   LW    + I+ M   +S+  +A         G   P+  S  +  +    +  
Sbjct  78   YFVPYKQLWSGFDQFITGMSDYKSSFMYAFK-------GKTPPSCVSFDVQKLVDWCKTN  130

Query  130  GKKNYFGYDRSDLSYKLMQYLRVGNSGQVSVNFGTSLPASDTSYTQAYRFNLDLSLFPFL  189
              K+  G+D++   Y+++  L  G     +      +P ++ + T   +     + F  L
Sbjct  131  TAKDIHGFDKNKGVYRILDLLGYGKYANSA-----GVPYTNPTSTTMGK----CTPFRGL  181

Query  190  AYKKFCQDYFRYSQWQDSSPYLWNIDYFTGvsshlfsslpvssDPYWNNNTLFDLEYCNW  249
            AY+K   D++R + +++     +N+D F G      +      D  W     F L Y N 
Sbjct  182  AYQKIYNDFYRNTTYEEYQLESFNVDMFYGSGKVKETIPNEPWDYDW-----FTLRYRNA  236

Query  250  NKDMFMGVFPDTQFGDVATIGITSDSPESSLQLKAWasgspsskapvvvgaaasspNFTI  309
             KD+   V P   F       I   +P+        +         V  G      +  I
Sbjct  237  QKDLLTNVRPTPLFS------IDDFNPQF---FTGGSDIVMEKGPNVTGGTHEYRDSVVI  287

Query  310  RAESGNMNPANILGVDTSSLSLAGSFDVLALRRGEALQRWKEISLNVPQNYRAQIKAHFG  369
              ++   N     GVD+    ++    V  +R   AL++   +++   + Y+ Q++AHFG
Sbjct  288  VGKNLKEN-----GVDSKRTMIS----VADIRNAFALEKLASVTMRAGKTYKEQMEAHFG  338

Query  370  VDVGENMSGMSTYVGGDSSSLDISEVVN----TNLQSGDVASEAVIA---GKGVGSSQGS  422
            + V E   G  TY+GG  S++ + +V      T   + D +    +    GK  GS  G 
Sbjct  339  ISVEEGRDGRCTYIGGFDSNIQVGDVTQSSGTTVTGTKDTSFGGYLGRTTGKATGSGSGH  398

Query  423  EKFEARDWGVLMCIYHNVPLLDYVSSAPDPQFFVTQNTDLPIPELDSIGMQSVPLAMYTN  482
             +F+A++ G+LMCIY  VP + Y S   DP     +  D  +PE +++GMQ  PL     
Sbjct  399  IRFDAKEHGILMCIYSLVPDVQYDSKRVDPFVQKIERGDFFVPEFENLGMQ--PLFAKNI  456

Query  483  S---DGELVSGFVSPDYTMGYLPRYFSWKTSYDYVLGAFTTTEKEWVAPISSALWKNMLS  539
            S   +    +  +      G+ PRY  +KT+ D   G F   E      ++ A  ++M  
Sbjct  457  SYKYNNNTANSRIKNLGAFGWQPRYSEYKTALDINHGQFVHQEPLSYWTVARARGESM--  514

Query  540  TITVRNPQFTYNFFKVNPSVLDSIFQVNADSKWDTDPFLINCAFDVKVVRNLDYSGMP  597
                    F  + FK+NP  LD +F VN +    TD     C F++  V ++   GMP
Sbjct  515  ------SNFNISTFKINPKWLDDVFAVNYNGTELTDQVFGGCYFNIVKVSDMSIDGMP  566



Lambda      K        H        a         alpha
   0.319    0.133    0.413    0.792     4.96 

Gapped
Lambda      K        H        a         alpha    sigma
   0.267   0.0410    0.140     1.90     42.6     43.6 

Effective search space used: 4446915032268