           BLASTP 2.2.30+


Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.


Reference for composition-based statistics: Alejandro A. Schaffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.



Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
           49,011,213 sequences; 17,563,301,199 total letters





Query= Contig-3_CDS_annotation_glimmer3.pl_2_3

Length=598
                                                                      Score     E
Sequences producing significant alignments:                          (Bits)  Value

gi|490418709|ref|WP_004291032.1|  hypothetical protein                  401   2e-128
gi|496050829|ref|WP_008775336.1|  hypothetical protein                  399   1e-127
gi|547226430|ref|WP_021963493.1|  putative uncharacterized protein      390   3e-124
gi|575094354|emb|CDL65742.1|  unnamed protein product                   387   1e-122
gi|494822885|ref|WP_007558293.1|  hypothetical protein                  347   3e-107
gi|575094321|emb|CDL65708.1|  unnamed protein product                   235   7e-65
gi|496521299|ref|WP_009229582.1|  capsid protein                        198   2e-52
gi|494308783|ref|WP_007173938.1|  hypothetical protein                  194   7e-51
gi|494306153|ref|WP_007173049.1|  hypothetical protein                  164   2e-40
gi|517172762|ref|WP_018361580.1|  hypothetical protein                  160   4e-39


>gi|490418709|ref|WP_004291032.1| hypothetical protein [Bacteroides eggerthii]
 gi|217986636|gb|EEC52970.1| putative capsid protein (F protein) [Bacteroides eggerthii DSM 
20697]
Length=578

 Score =   401 bits (1030),  Expect = 2e-128, Method: Compositional matrix adjust.
 Identities = 241/621 (39%), Positives = 338/621 (54%), Gaps = 69/621 (11%)

Query  2    SLFSLKDIRNHPRRSAFDLSSKVAFSAKSGELLPIKWYFTMPGDKFTLKRQHFTRTQPVN  61
            ++ SLK IRN P R+ FDLS K  F+AK+GELLP+     +PGD F +  + FTRTQPVN
Sbjct  3    NIMSLKSIRNKPSRNGFDLSFKKNFTAKAGELLPVMVKEVLPGDTFKINLKAFTRTQPVN  62

Query  62   TSAYTRIREYYDWFWVPLHLLWRHAPEVISQMQSNVQHAGS--QTSSLTLGNYLPTISSS  119
            T+A+ RIREYYD+F+VP  LLW  A  V++QM  N QHA S   T +  L   +P ++S 
Sbjct  63   TAAFARIREYYDFFFVPYDLLWNKANTVLTQMYDNPQHAVSIDPTRNFVLSGEMPYMTSE  122

Query  120  QLSAVCSRLFG-------KKNYFGYDRSDLSYKLMQYLRVGNSGQVSVNFGTSLPASDTS  172
             +++  + L         K NYFGY+RS  S KL++YL  GN              +D  
Sbjct  123  AIASYINALSTASALADYKSNYFGYNRSKSSVKLLEYLGYGNYESF---------LTDDW  173

Query  173  YTQAYRFNLDLSLFPFLAYKKFCQDYFRYSQWQDSSPYLWNIDYFTGvsshlfsslpvss  232
             T     NL+ ++F  LAY+K   D++R SQW+  SP  +N+DY  G S +L ++     
Sbjct  174  NTAPLMANLNHNIFGLLAYQKIYSDFYRDSQWERVSPSTFNVDYLDGSSMNLDNAYSTE-  232

Query  233  DPYWNNNTLFDLEYCNWNKDMFMGVFPDTQFGDVATIGITSDSPESSLQLKAWasgspss  292
              ++ N   FDL YCNW KD+F GV P  Q+G+ A   IT D     L L          
Sbjct  233  --FYQNYNFFDLRYCNWQKDLFHGVLPHQQYGETAVASITPDV-TGKLTLS---------  280

Query  293  kapvvvgaaasspNFTIRAESGNMNPANILGVDTSSL---SLAGSFDVLALRRGEALQRW  349
                         NF+    S    P    G  T +L      G   +L LR+ E LQ+W
Sbjct  281  -------------NFSTVGTS----PTTASGTATKNLPAFDTVGDLSILVLRQAEFLQKW  323

Query  350  KEISLNVPQNYRAQIKAHFGVDVGENMSGMSTYVGGDSSSLDISEVVNTNLQSGDVASEA  409
            KEI+ +  ++Y+ Q++ H+GV VG+  S + TY+GG SSS+DI+EV+NTN+ +G  A++ 
Sbjct  324  KEITQSGNKDYKDQLEKHWGVSVGDGFSELCTYLGGVSSSIDINEVINTNI-TGSAAAD-  381

Query  410  VIAGKGVGSSQGSEKFEARD-WGVLMCIYHNVPLLDYVSSAPDPQFFVTQNTDLPIPELD  468
             IAGKGVG + G   F +   +G++MCIYH +PLLDY +   DP F    +TD  IPE D
Sbjct  382  -IAGKGVGVANGEINFNSNGRYGLIMCIYHCLPLLDYTTDMLDPAFLKVNSTDYAIPEFD  440

Query  469  SIGMQSVPVSMYSNSDKELVTGFSSADFTMGYLPRYYSWKTSYDYVLGAFTTTEKEWVAP  528
             +GMQS+P+    N  +      +++   +GY+PRY  +KTS D  +G F  T   WV  
Sbjct  441  RVGMQSMPLVQLMNPLRSFA---NASGLVLGYVPRYIDYKTSVDQSVGGFKRTLNSWVIS  497

Query  529  ITSV-IWKRMLI----------GLTSSSGSFNYNFFKVNPSILDSIFQANANSKWDTDPF  577
              ++ + K++ +              S    N+ FFKVNP  LD IF   A    +TD F
Sbjct  498  YGNISVLKQVTLPNDAPPIEPSEPVPSVAPMNFTFFKVNPDCLDPIFAVQAGDDTNTDQF  557

Query  578  LINCAFDVKVVRNLDYSGMPY  598
            L +  FD+K VRNLD  G+PY
Sbjct  558  LCSSFFDIKAVRNLDTDGLPY  578


>gi|496050829|ref|WP_008775336.1| hypothetical protein [Bacteroides sp. 2_2_4]
 gi|229448893|gb|EEO54684.1| putative capsid protein (F protein) [Bacteroides sp. 2_2_4]
Length=580

 Score =   399 bits (1024),  Expect = 1e-127, Method: Compositional matrix adjust.
 Identities = 239/620 (39%), Positives = 357/620 (58%), Gaps = 65/620 (10%)

Query  2    SLFSLKDIRNHPRRSAFDLSSKVAFSAKSGELLPIKWYFTMPGDKFTLKRQHFTRTQPVN  61
            ++ SLK +RN   R+ FDLSSK  F+AK GELLP+K +  +PGDK+++  + FTRTQP+N
Sbjct  3    NIMSLKSLRNKTSRNGFDLSSKRNFTAKPGELLPVKCWEVLPGDKWSIDLKSFTRTQPLN  62

Query  62   TSAYTRIREYYDWFWVPLHLLWRHAPEVISQMQSNVQHAGS--QTSSLTLGNYLPTISSS  119
            T+A+ R+REYYD+++VP +LLW  A  V++QM  N QHA S   +++  L   +P ++  
Sbjct  63   TAAFARMREYYDFYFVPYNLLWNKANTVLTQMYDNPQHATSYIPSANQALAGVMPNVTCK  122

Query  120  QLSA--------VCSRLFGKKNYFGYDRSDLSYKLMQYLRVGNSGQVSVNFGTSLPASDT  171
             ++         V +    +KNYFGY RS  + KL++YL  GN       F T   + + 
Sbjct  123  GIADYLNLVAPDVTTTNSYEKNYFGYSRSLGTAKLLEYLGYGN-------FYTYATSKNN  175

Query  172  SYTQA-YRFNLDLSLFPFLAYKKFCQDYFRYSQWQDSSPYLWNIDYFTGvsshlfsslpv  230
            ++T++    NL L+++  LAY+K   D+ R SQW+  SP  +N+DY +G      +   +
Sbjct  176  TWTKSPLSSNLQLNIYGVLAYQKIYADHIRDSQWEKVSPSCFNVDYLSGTVDSAMTIDSM  235

Query  231  ssD----PYWNNNTLFDLEYCNWNKDMFMGVFPDTQFGDVATIGITSDSPESSLQLKAWa  286
             +     P++N   +FDL YCNW KD+F GV P  Q+GD A + +   +  S+       
Sbjct  236  ITGQGFAPFYN---MFDLRYCNWQKDLFHGVLPRQQYGDTAAVNVNLSNVLSA-------  285

Query  287  sgspsskapvvvgaaasspNFTIRAESGN---MNPANILGVDTSSLSLAGSFDVLALRRG  343
                                + ++   G+    +P +  GV+  +++ +G+F VLALR+ 
Sbjct  286  -------------------QYMVQTPDGDPVGGSPFSSTGVNLQTVNGSGTFTVLALRQA  326

Query  344  EALQRWKEISLNVPQNYRAQIKAHFGVDVGENMSGMSTYVGGDSSSLDISEVVNTNLQSG  403
            E LQ+WKEI+ +  ++Y+ QI+ H+ V VGE  S MS Y+GG ++SLDI+EVVN N+   
Sbjct  327  EFLQKWKEITQSGNKDYKDQIEKHWNVSVGEAYSEMSLYLGGTTASLDINEVVNNNITGS  386

Query  404  DVASEAVIAGKGVGSSQGSEKFEARD-WGVLMCIYHNVPLLDYVSSAPDPQFFVTQNTDL  462
            + A    IAGKGV    G   F+A + +G++MCIYH++PLLDY +   +P F    +TD 
Sbjct  387  NAAD---IAGKGVVVGNGRISFDAGERYGLIMCIYHSLPLLDYTTDLVNPAFTKINSTDF  443

Query  463  PIPELDSIGMQSVPVSMYSNSDKELVTGFSSADFTMGYLPRYYSWKTSYDYVLGAFTTTE  522
             IPE D +GM+SVP+    N    L + ++     +GY PRY S+KT  D  +GAF TT 
Sbjct  444  AIPEFDRVGMESVPLVSLMN---PLQSSYNVGSSILGYAPRYISYKTDVDSSVGAFKTTL  500

Query  523  KEWVAPI--TSVIWK-RMLIGLTSSSGSF-NYNFFKVNPSILDSIFQANANSKWDTDPFL  578
            K WV      SVI +        +S G+  NY  FKVNP+ +D +F   A++  DTD FL
Sbjct  501  KSWVMSYDNQSVINQLNYQDDPNNSPGTLVNYTNFKVNPNCVDPLFAVAASNSIDTDQFL  560

Query  579  INCAFDVKVVRNLDYSGMPY  598
             +  FDVKVVRNLD  G+PY
Sbjct  561  CSSFFDVKVVRNLDTDGLPY  580


>gi|547226430|ref|WP_021963493.1| putative uncharacterized protein [Prevotella sp. CAG:1185]
 gi|524103382|emb|CCY83994.1| putative uncharacterized protein [Prevotella sp. CAG:1185]
Length=573

 Score =   390 bits (1001),  Expect = 3e-124, Method: Compositional matrix adjust.
 Identities = 241/613 (39%), Positives = 342/613 (56%), Gaps = 58/613 (9%)

Query  2    SLFSLKDIRNHPRRSAFDLSSKVAFSAKSGELLPIKWYFTMPGDKFTLKRQHFTRTQPVN  61
            S+ SL  ++N  +R+ FDLS K AF+AK GELLPI      PGDKF ++ Q FTRTQPVN
Sbjct  3    SVMSLTALKNSVKRNGFDLSFKNAFTAKVGELLPIMCKEVYPGDKFNIRGQAFTRTQPVN  62

Query  62   TSAYTRIREYYDWFWVPLHLLWRHAPEVISQMQSNVQHAGSQTSSLTLGNYLPTISSSQL  121
            ++AY+R+REYYD+++VP  LLW  AP   + M  +  HA    SS+ L    P  +   +
Sbjct  63   SAAYSRLREYYDFYFVPYRLLWNMAPTFFTNM-PDPHHAADLVSSVNLSQRHPWFTFFDI  121

Query  122  SAVCSRLFG--------KKNYFGYDRSDLSYKLMQYLRVGNSGQVSVNFGTSLPASDTSY  173
                  L          +KN+FG+ R +LS KL+ YL  G       ++ +    SD+  
Sbjct  122  MEYLGNLNSLSGAYEKYQKNFFGFSRVELSVKLLNYLNYG----FGKDYESVKVPSDSD-  176

Query  174  TQAYRFNLDLSLFPFLAYKKFCQDYFRYSQWQDSSPYLWNIDYFTGvsshlfsslpvssD  233
                  ++ LS FP LAY+K C+DYFR  QWQ ++PY +N+DY  G SS     +   ++
Sbjct  177  ------DIVLSPFPLLAYQKICEDYFRDDQWQSAAPYRYNLDYLYGKSSGFHIPMSSFTN  230

Query  234  PYWNNNTLFDLEYCNWNKDMFMGVFPDTQFGDVAT----IGITSDSPESSLQLKAWasgs  289
              + N T+FDL YCN+ KD F G+ P  Q+GDV+      G       SSL         
Sbjct  231  DAFKNPTMFDLNYCNFQKDYFTGMLPRAQYGDVSVASPIFGDLDIGDSSSLTFA------  284

Query  290  psskapvvvgaaasspNFTIRAESGNMNPANILGVDTSSLSLAGSFDVLALRRGEALQRW  349
                              +   +  N   + +L V+ +S + AG   VLALR+ E LQ+W
Sbjct  285  ------------------SAPQQGANTIQSGVLVVNNNSNTTAG-LSVLALRQAECLQKW  325

Query  350  KEISLNVPQNYRAQIKAHFGVDVGENMSGMSTYVGGDSSSLDISEVVNTNLQSGDVASEA  409
            +EI+ +   +Y+ Q++ HF V     +SG   Y+GG +S+LDISEVVNTNL +GD  ++A
Sbjct  326  REIAQSGKMDYQTQMQKHFNVSPSATLSGHCKYLGGWTSNLDISEVVNTNL-TGD--NQA  382

Query  410  VIAGKGVGSSQGSE-KFEARDWGVLMCIYHNVPLLDYVSSAPDPQFFVTQNTDLPIPELD  468
             I GKG G+  G++  FE+ + G++MCIYH +PLLD+  +    Q F T  TD  IPE D
Sbjct  383  DIQGKGTGTLNGNKVDFESSEHGIIMCIYHCLPLLDWSINRIARQNFKTTFTDYAIPEFD  442

Query  469  SIGMQSVPVSMYSNSDKELVTGFSSADFTMGYLPRYYSWKTSYDYVLGAFTTTEKEWVAP  528
            S+GMQ +  S      ++L +  SS +  MGY+PRY   KTS D + G+F  T   WV+P
Sbjct  443  SVGMQQLYPSEMIFGLEDLPSDPSSIN--MGYVPRYADLKTSIDEIHGSFIDTLVSWVSP  500

Query  529  ITS---VIWKRMLIGLTSSSGSFNYNFFKVNPSILDSIFQANANSKWDTDPFLINCAFDV  585
            +T      +++       S  +  YNFFKVNP I+D+IF   A+S  +TD  LIN  FD+
Sbjct  501  LTDSYISAYRQACKDAGFSDITMTYNFFKVNPHIVDNIFGVKADSTINTDQLLINSYFDI  560

Query  586  KVVRNLDYSGMPY  598
            K VRN DY+G+PY
Sbjct  561  KAVRNFDYNGLPY  573


>gi|575094354|emb|CDL65742.1| unnamed protein product [uncultured bacterium]
Length=615

 Score =   387 bits (995),  Expect = 1e-122, Method: Compositional matrix adjust.
 Identities = 241/638 (38%), Positives = 349/638 (55%), Gaps = 68/638 (11%)

Query  5    SLKDIRNHPRRSAFDLSSKVAFSAKSGELLPIKWYFTMPGDKFTLKRQHFTRTQPVNTSA  64
            S+ DI+N P R+ FDLS K  F+AK+GELLP+     +PGD F +  + FTRTQP+NTSA
Sbjct  2    SMADIKNRPSRNGFDLSFKKNFTAKAGELLPVMTKVVLPGDSFNINLRSFTRTQPLNTSA  61

Query  65   YTRIREYYDWFWVPLHLLWRHAPEVISQMQSNVQHAGSQT--SSLTLGNYLPTISSSQLS  122
            + R+REYYD+++VP   +W      I+QM +NVQHA   T   +  L   +P  +S Q++
Sbjct  62   FARMREYYDFYFVPFEQMWNKFDSCITQMNANVQHASGPTLDDNTPLSGRMPYFTSEQIA  121

Query  123  AVCS--RLFGKKNYFGYDRSDLSYKLMQYLRVGNSGQVSVNFGTSLPASDTSYTQAYRFN  180
               +      +KN FG++RS L+ KL+QYL  G+       + +    ++T   +   +N
Sbjct  122  DYLNDQATAARKNPFGFNRSTLTCKLLQYLGYGD-------YNSFDSETNTWSAKPLLYN  174

Query  181  LDLSLFPFLAYKKFCQDYFRYSQWQDSSPYLWNIDYFTGvsshlfsslpvssDPYWNNNT  240
            L+LS FP LAY+K   D++RY+QW+ ++P  +N+DY  G S        + SD    +N 
Sbjct  175  LELSPFPLLAYQKIYSDFYRYTQWEKTNPSTFNLDYIKGTSDLQMDLTGLPSD----DNN  230

Query  241  LFDLEYCNWNKDMFMGVFPDTQFGDVATIGITSDSP-ESSLQLKAWasgspsskapvvvg  299
             FD+ YCN+ KDMF GV P      VA  G  S  P    L + +     P  K      
Sbjct  231  FFDIRYCNYQKDMFHGVLP------VAQYGSASVVPINGQLNVISNGDSGPIFKTSTPDP  284

Query  300  aaasspNFTIRAESGNMNPANILGVDTSSLSLAGSFD-----------------------  336
                +   T+    G  N +   GV  S+L++  S D                       
Sbjct  285  GTPGTSYVTVGGNIGVDNRS--FGVSGSTLNVGKSADPSGYGFPSNASTRSLLWENPNLI  342

Query  337  ----------VLALRRGEALQRWKEISLNVPQNYRAQIKAHFGVDVGENMSGMSTYVGGD  386
                      +LALR+ E LQ+WKE+S++  ++Y++QI+ H+G+ V + +S  + Y+GG 
Sbjct  343  IENNQGFYVPILALRQAEFLQKWKEVSVSGEEDYKSQIEKHWGIKVSDFLSHQARYLGGC  402

Query  387  SSSLDISEVVNTNLQSGDVASEAVIAGKGVGSSQGSEKFEAR-DWGVLMCIYHNVPLLDY  445
            ++SLDI+EV+N N+ +GD A++  IAGKG  +  GS +FE++ ++G++MCIYH +P++DY
Sbjct  403  ATSLDINEVINNNI-TGDNAAD--IAGKGTFTGNGSIRFESKGEYGIIMCIYHVLPIVDY  459

Query  446  VSSAPDPQFFVTQNTDLPIPELDSIGMQSVPVSMYSNSDKELVTGFSSADFTMGYLPRYY  505
            V S  D    +   T  PIPELD IGM+SVP+    N  KE  T   SAD  +GY PRY 
Sbjct  460  VGSGVDHSCTLVDATSFPIPELDQIGMESVPLVRAMNPVKESDT--PSADTFLGYAPRYI  517

Query  506  SWKTSYDYVLGAFTTTEKEWVAPI-----TSVIWKRMLIGLTSSSGSFNYNFFKVNPSIL  560
             WKTS D  +G F  + + W  P+     TS               S    FFKVNPSI+
Sbjct  518  DWKTSVDRSVGDFADSLRTWCLPVGDKELTSANSLNFPSNPNVEPDSIAAGFFKVNPSIV  577

Query  561  DSIFQANANSKWDTDPFLINCAFDVKVVRNLDYSGMPY  598
            D +F   A+S   TD FL +  FDVKVVRNLD +G+PY
Sbjct  578  DPLFAVVADSTVKTDEFLCSSFFDVKVVRNLDVNGLPY  615


>gi|494822885|ref|WP_007558293.1| hypothetical protein [Bacteroides plebeius]
 gi|198272099|gb|EDY96368.1| putative capsid protein (F protein) [Bacteroides plebeius DSM 
17135]
Length=613

 Score =   347 bits (890),  Expect = 3e-107, Method: Compositional matrix adjust.
 Identities = 219/622 (35%), Positives = 335/622 (54%), Gaps = 43/622 (7%)

Query  2    SLFSLKDIRNHPRRSAFDLSSKVAFSAKSGELLPIKWYFTMPGDKFTLKRQHFTRTQPVN  61
            ++ S+K +RN P R+ +DL+ K+ F+AK+G L+P+ W   +P D      + F RTQP+N
Sbjct  10   NIMSMKSVRNKPTRAGYDLTQKINFTAKAGSLIPVWWTPVLPFDDLNATVKSFVRTQPLN  69

Query  62   TSAYTRIREYYDWFWVPLHLLWRHAPEVISQMQSNVQHAGSQT--SSLTLGNYLPTISSS  119
            T+A+ R+R Y+D+++VP   +W   P  I+QM++N+ HA       ++ L + LP  ++ 
Sbjct  70   TAAFARMRGYFDFYFVPFRQMWNKFPTAITQMRTNLLHASGPVLADNVPLSDELPYFTAE  129

Query  120  QLSAVCSRLFGKKNYFGYDRSDLSYKLMQYLRVGNSGQVSVNFGTSLPASDTSYTQAYRF  179
            Q++     L   KN FGY R+ L   +++YL  G+     V       A  T  T+    
Sbjct  130  QVADYIVSLADSKNQFGYYRAWLVCIILEYLGYGDFYPYIVEAAGGEGA--TWATRPMLN  187

Query  180  NLDLSLFPFLAYKKFCQDYFRYSQWQDSSPYLWNIDYFTGvsshlfsslpvssDPYWNNN  239
            NL  S FP  AY+K   D+ RY+QW+ S+P  +NIDY +G  S     L  + + + ++ 
Sbjct  188  NLKFSPFPLFAYQKIYADFNRYTQWERSNPSTFNIDYISG--SADSLQLDFTVEGFKDSF  245

Query  240  TLFDLEYCNWNKDMFMGVFPDTQFGDVATIGITSDSPESSLQLKAWasgspsskapvvvg  299
             LFD+ Y NW +D+  G  P  Q+G+ + + ++         ++     +P +      G
Sbjct  246  NLFDMRYSNWQRDLLHGTIPQAQYGEASAVPVSG-------SMQVVEGPTPPAFTTGQDG  298

Query  300  aaasspNFTIRAESGNMNPANILGVD--------TSSLSLAG--SFDV--LALRRGEALQ  347
             A  + N TI+  SG +     +G           S L + G  SF V  LALRR EA Q
Sbjct  299  VAFLNGNVTIQGSSGYLQAQTSVGESRILRFNNTNSGLIVEGDSSFGVSILALRRAEAAQ  358

Query  348  RWKEISLNVPQNYRAQIKAHFGVDVGENMSGMSTYVGGDSSSLDISEVVNTNLQSGDVAS  407
            +WKE++L   ++Y +QI+AH+G  V +  S M  ++G  +  L I+EVVN N+ +G+ A+
Sbjct  359  KWKEVALASEEDYPSQIEAHWGQSVNKAYSDMCQWLGSINIDLSINEVVNNNI-TGENAA  417

Query  408  EAVIAGKGVGSSQGSEKFE-ARDWGVLMCIYHNVPLLDYVSSAPDPQFFVTQNTDLPIPE  466
            +  IAGKG  S  GS  F     +G++MC++H +P LDY++SAP     +T   D PIPE
Sbjct  418  D--IAGKGTMSGNGSINFNVGGQYGIVMCVFHVLPQLDYITSAPHFGTTLTNVLDFPIPE  475

Query  467  LDSIGMQSVPVSMYSNSDKELVTGFS-SADFTMGYLPRYYSWKTSYDYVLGAFTTTEKEW  525
             D IGM+ VPV    N  K     F  S +   GY P+YY+WKT+ D  +G F  + K W
Sbjct  476  FDKIGMEQVPVIRGLNPVKPKDGDFKVSPNLYFGYAPQYYNWKTTLDKSMGEFRRSLKTW  535

Query  526  VAPITSVIWKRMLIGLTS---------SSGSFNYNFFKVNPSILDSIFQANANSKWDTDP  576
            + P         L+   S          + S    FFKV+PS+LD++F   ANS  +TD 
Sbjct  536  IIPFDD----EALLAADSVDFPDNPNVEADSVKAGFFKVSPSVLDNLFAVKANSDLNTDQ  591

Query  577  FLINCAFDVKVVRNLDYSGMPY  598
            FL +  FDV VVR+LD +G+PY
Sbjct  592  FLCSTLFDVNVVRSLDPNGLPY  613


>gi|575094321|emb|CDL65708.1| unnamed protein product [uncultured bacterium]
Length=642

 Score =   235 bits (600),  Expect = 7e-65, Method: Compositional matrix adjust.
 Identities = 197/649 (30%), Positives = 294/649 (45%), Gaps = 66/649 (10%)

Query  2    SLFSLKDIRNHPRRSAFDLSSKVAFSAKSGELLPIKWYFTMPGDKFTLKRQHFTRTQPVN  61
            ++  L  ++N P R++FDLS +  F+AK GELLP       PGD   +   +FTRT P+ 
Sbjct  6    NIMGLHGLKNKPSRNSFDLSHRNMFTAKVGELLPCFVQELNPGDSVKVSSSYFTRTAPLQ  65

Query  62   TSAYTRIREYYDWFWVPLHLLWRHAPEVISQMQSNVQHAG-SQTSSLTLGNY-----LPT  115
            ++A+TR+RE   +F+VP   LW++    +  M  N      S+ +S  +GN      +P 
Sbjct  66   SNAFTRLRENVQYFFVPYSALWKYFDSQVLNMTKNANGGDISRIASSLVGNQKVTTQMPC  125

Query  116  ISSSQLSAVCSRLFGKKNYFGYD------------RSDLSYKLMQYLRVGNSGQVSVNFG  163
            ++   L A   + F  ++  G D            R   S KL+Q L  GN  +   NF 
Sbjct  126  VNYKTLHAYLLK-FINRSTVGSDGSVGPEFNRGCYRHAESAKLLQLLGYGNFPEQFANFK  184

Query  164  TSLPASDTSYTQ----AYRFNLDLSLFPFLAYKKFCQDYFRYSQWQDSSPYLWNIDYFTG  219
             +    + S        Y  +  LS+F  LAY K C D++ Y QWQ  +  L N+DY T 
Sbjct  185  VNNDKHNQSGQNFKDVTYNNSPYLSIFRLLAYHKICNDHYLYRQWQPYNASLCNVDYLTP  244

Query  220  vsshlfsslpvssDP-----YWNNNTLFDLEYCNWNKDMFMGVFPDTQFGDVATIGITSD  274
             SS L S                   L D+ + N   D F GV P +QFG  + + +   
Sbjct  245  NSSSLLSIDDALLSIPDDSIKAEKLNLLDMRFSNLPLDYFTGVLPTSQFGSESVVNLNLG  304

Query  275  SPESSLQLKAWasgspsskapvvvgaaasspNFTIRAESGNMNPANILGVDTS-------  327
            +   S  L    S                       + +GN+   N  G   S       
Sbjct  305  NASGSAVLNGTTSKDSGRWRTTTGEWEMEQR--VASSANGNLKLDNSNGTFISHDHTFSG  362

Query  328  ----SLSLAGSFDVLALRRGEALQRWKEISLNVPQNYRAQIKAHFGVDVGENMSGMSTYV  383
                + SL+G+  ++ALR   A Q++KEI L    ++++Q++AHFG+   E     S ++
Sbjct  363  NVAINTSLSGNLSIIALRNALAAQKYKEIQLANDVDFQSQVEAHFGIKPDEKNEN-SLFI  421

Query  384  GGDSSSLDISEVVNTNLQSGDVASEAVIAGKGVGSSQGSEKFEARDWGVLMCIYHNVPLL  443
            GG SS ++I+E +N NL SGD  +    A +G GS+  S KF A+ +GV++ IY   P+L
Sbjct  422  GGSSSMININEQINQNL-SGDNKATYGAAPQGNGSA--SIKFTAKTYGVVIGIYRCTPVL  478

Query  444  DYVSSAPDPQFFVTQNTDLPIPELDSIGMQS------VPVSMYSNSDKELVTG-FSSADF  496
            D+     D   F T  +D  IPE+DSIGMQ          + Y++  K    G  SS D 
Sbjct  479  DFAHLGIDRTLFKTDASDFVIPEMDSIGMQQTFRCEVAAPAPYNDEFKAFRVGDGSSPDM  538

Query  497  --TMGYLPRYYSWKTSYDYVLGAFTTTEKEWVA-----PITSVIWKRMLIGLTSSSGSFN  549
              T GY PRY  +KTSYD   GAF  + K WV       I + +W     G+ +      
Sbjct  539  SETYGYAPRYSEFKTSYDRYNGAFCHSLKSWVTGINFDAIQNNVWN-TWAGINAP-----  592

Query  550  YNFFKVNPSILDSIFQANANSKWDTDPFLINCAFDVKVVRNLDYSGMPY  598
             N F   P I+ ++F  ++ +  D D   +         RNL   G+PY
Sbjct  593  -NMFACRPDIVKNLFLVSSTNNSDDDQLYVGMVNMCYATRNLSRYGLPY  640


>gi|496521299|ref|WP_009229582.1| capsid protein [Prevotella sp. oral taxon 317]
 gi|288330570|gb|EFC69154.1| putative capsid protein (F protein) [Prevotella sp. oral taxon 
317 str. F0108]
Length=541

 Score =   198 bits (504),  Expect = 2e-52, Method: Compositional matrix adjust.
 Identities = 184/616 (30%), Positives = 269/616 (44%), Gaps = 96/616 (16%)

Query  1    MSLFSLKDIR----NHPRRSAFDLSSKVAFSAKSGELLPIKWYFTMPGDKFTLKRQHFTR  56
            MSL  +  I+    N PR SAFDLS K  ++A +G LLP+     M  D   ++ Q F R
Sbjct  1    MSLKKVPQIKPSRANRPR-SAFDLSQKHLYTAPAGALLPVLSVDLMFHDHIRIQAQDFMR  59

Query  57   TQPVNTSAYTRIREYYDWFWVPLHLLWRHAPEVISQMQSNVQHAGSQTSSLTLGNYLPTI  116
            T P+N++A+  +R  Y++F+VP   LW    + I+ M     +  S  SS      L ++
Sbjct  60   TMPMNSAAFISMRGVYEFFFVPYSQLWHPYDQFITSMN---DYRSSVVSSAAGDKALDSV  116

Query  117  SSSQLSAVCS--RLFGKKNYFGYDRSDLSYKLMQYLRVGNSGQVSVNFGTSLPASDTSYT  174
             + +L+ +    R    K+ FGY  S+ S +LM  L           +G  + +S T   
Sbjct  117  PNVKLADMYKFVRERTDKDIFGYPHSNNSCRLMDLL----------GYGKPITSSKTPVP  166

Query  175  QAYRFNLDLSLFPFLAYKKFCQDYFRYSQWQDSSPYLWNIDYFTGvsshlfsslpvssDP  234
              Y  N++  LF  LAY K   DY+R + ++    Y +NID+  G               
Sbjct  167  LLYTGNVN--LFRLLAYNKIYSDYYRNTTYEGVDVYSFNIDHKKGTFVPTADEF------  218

Query  235  YWNNNTLFDLEYCNWNKDMFMGVFPDTQFGDVATIGITSDSPESSLQLKAWasgspsska  294
                    +L Y N   D +  + P   F    TIG  SDS  S LQL            
Sbjct  219  ----KKYLNLHYRNAPLDFYTNLRPTPLF----TIG--SDSFSSVLQLS-----------  257

Query  295  pvvvgaaasspNFTIRAESGNMNPANILGVDTSSLSLAGSFDVLALRRGEALQRWKEISL  354
                     S  F+    S  +N A+               +V A+R   AL +   IS+
Sbjct  258  -----DPTGSAGFSADGNSAKLNMAS-----------PDVLNVSAIRSAFALDKLLSISM  301

Query  355  NVPQNYRAQIKAHFGVDVGENMSGMSTYVGGDSSSLDISEVVNTNLQSGDVASE------  408
               + Y  QI+AHFGV V E   G   Y+GG  S++ + +V  T+  +    SE      
Sbjct  302  RAGKTYAEQIEAHFGVTVSEGRDGQVYYLGGFDSNVQVGDVTQTSGTTNPNVSEVGNAKL  361

Query  409  ----AVIAGKGVGSSQGSEKFEARDWGVLMCIYHNVPLLDYVSSAPDPQFFVTQNT--DL  462
                  I GKG GS  G  +F+A++ GVLMCIY  VP + Y     DP  FV + T  D 
Sbjct  362  AGYLGKITGKGTGSGYGEIQFDAKEPGVLMCIYSVVPAMQYDCMRLDP--FVAKQTRGDY  419

Query  463  PIPELDSIGMQS-VPVSMYSNSDKELVTGFSSADFTMGYLPRYYSWKTSYDYVLGAFTTT  521
             IPE +++GMQ  VP  +  N  K         D + G+ PRY  +KT++D   G F   
Sbjct  420  FIPEFENLGMQPIVPAFVSLNRAK---------DNSYGWQPRYSEYKTAFDINHGQFANG  470

Query  522  EKEWVAPITSVIWKRMLIGLTSSSGSFNYNFFKVNPSILDSIFQANANSKWDTDPFLINC  581
            E     P++   W       + +  +FN    K+NP  LDS+F  N N    TD      
Sbjct  471  E-----PLS--YWSIARARGSDTLNTFNVAALKINPHWLDSVFAVNYNGTEVTDCMFGYA  523

Query  582  AFDVKVVRNLDYSGMP  597
             F+++ V ++   GMP
Sbjct  524  HFNIEKVSDMTEDGMP  539


>gi|494308783|ref|WP_007173938.1| hypothetical protein [Prevotella bergensis]
 gi|270333035|gb|EFA43821.1| putative capsid protein (F protein) [Prevotella bergensis DSM 
17361]
Length=553

 Score =   194 bits (494),  Expect = 7e-51, Method: Compositional matrix adjust.
 Identities = 171/620 (28%), Positives = 269/620 (43%), Gaps = 94/620 (15%)

Query  1    MSLFSLKDIRNHPRRSAFDLSSKVAFSAKSGELLPIKWYFTMPGDKFTLKRQHFTRTQPV  60
            +S+  +K  R +  R+AFDLS +  F+A +G LLP+     +P D   +  Q F RT P+
Sbjct  3    VSIPKIKATRPNRNRNAFDLSQRHLFTAHAGMLLPVLNLDLIPHDHVEINAQDFMRTLPM  62

Query  61   NTSAYTRIREYYDWFWVPLHLLWRHAPEVISQMQSNVQHAGSQTSSLTLGNYLPTISSSQ  120
            NT+A+  +R  Y++F+VP H LW    + I+ M  N  H+ S   S+  G     +    
Sbjct  63   NTAAFASMRGVYEFFFVPYHQLWAQFDQFITGM--NDFHS-SANKSIQGGTSPLQVPYFN  119

Query  121  LSAVCSRLFGKKNYFGYDRSDLSYKL----MQYLRVGNSGQVSVNFGTSLPASDTSYTQA  176
            + +V + L   K        DL YK      + L +   G+   +FGT+ P +       
Sbjct  120  VDSVFNSLNTGKESGSGSTDDLQYKFKYGAFRLLDLLGYGRKFDSFGTAYPDN----VSG  175

Query  177  YRFNLD--LSLFPFLAYKKFCQDYFRYSQWQDSSPYLWNIDYFTGvsshlfsslpvssDP  234
             + NLD   S+F  LAY K  QDY+R S +++     +N D F G               
Sbjct  176  LKNNLDYNCSVFRILAYNKIYQDYYRNSNYENFDTDSFNFDKFKGGLVDAKVVA------  229

Query  235  YWNNNTLFDLEYCNWNKDMFMGVFPD------TQFGDVATIGITSDSPESSLQLKAWasg  288
                  LF L Y N   D F  +         T F DV  I I   +P            
Sbjct  230  -----DLFKLRYRNAQTDYFTNLRQSQLFSFTTAFEDVDNINI---APRD----------  271

Query  289  spsskapvvvgaaasspNFTIRAESGNMNPANILGVDTSSLSLAGSFDVLALRRGEALQR  348
                                ++++  N    N  GVDT S    G F V +LR   A+ +
Sbjct  272  -------------------YVKSDGSNFTRVN-FGVDTDSSE--GDFSVSSLRAAFAVDK  309

Query  349  WKEISLNVPQNYRAQIKAHFGVDVGENMSGMSTYVGGDSSSLDISEVVNTNLQSGDVASE  408
               +++   + ++ Q++AH+GV++ ++  G   Y+GG  S + +S+V  T   SG  A+E
Sbjct  310  LLSVTMRAGKTFQDQMRAHYGVEIPDSRDGRVNYLGGFDSDMQVSDVTQT---SGTTATE  366

Query  409  --------AVIAGKGVGSSQGSEKFEARDWGVLMCIYHNVPLLDYVSSAPDPQFFVTQNT  460
                      +AGKG GS +G   F+A++ GVLMCIY  VP + Y  +  DP        
Sbjct  367  YKPEAGYLGRVAGKGTGSGRGRIVFDAKEHGVLMCIYSLVPQIQYDCTRLDPMVDKLDRF  426

Query  461  DLPIPELDSIGMQSVPVSMYSNSDKELVTGFSSAD---FTMGYLPRYYSWKTSYDYVLGA  517
            D   PE +++GMQ +        +   ++ F + D     +GY PRY  +KT+ D   G 
Sbjct  427  DYFTPEFENLGMQPL--------NSSYISSFCTTDPKNPVLGYQPRYSEYKTALDVNHGQ  478

Query  518  FTTTEKEWVAPITSVIWKRMLIGLTSSSGSFNYNFFKVNPSILDSIFQANANSKWDTDPF  577
            F  ++      ++S  W        ++        FK++P  L+SIF  + N     D  
Sbjct  479  FAQSDA-----LSS--WSVSRFRRWTTFPQLEIADFKIDPGCLNSIFPVDYNGTEANDCV  531

Query  578  LINCAFDVKVVRNLDYSGMP  597
               C F++  V ++   GMP
Sbjct  532  YGGCNFNIVKVSDMSVDGMP  551


>gi|494306153|ref|WP_007173049.1| hypothetical protein [Prevotella bergensis]
 gi|270333881|gb|EFA44667.1| putative capsid protein (F protein) [Prevotella bergensis DSM 
17361]
Length=519

 Score =   164 bits (415),  Expect = 2e-40, Method: Compositional matrix adjust.
 Identities = 151/580 (26%), Positives = 246/580 (42%), Gaps = 79/580 (14%)

Query  33   LLPIKWYFTMPGDKFTLKRQHFTRTQPVNTSAYTRIREYYDWFWVPLHLLWRHAPEVISQ  92
            LLP+     +P D   +  Q F RT P+NT+A+  +R  Y++F+VP H LW    + I+ 
Sbjct  2    LLPVLNLDLIPHDHVEINAQDFMRTLPMNTAAFASMRGVYEFFFVPYHQLWAQFDQFITG  61

Query  93   MQSNVQHAGSQTSSLTLGNYLPTISSSQLSAVCSRLFGKKNYFGYDRSDLSYKL----MQ  148
            M  N  H+ S   S+  G     +    L +V   +  + +   + + DL Y+      +
Sbjct  62   M--NDFHS-SANKSIQGGTSPLQVPYFNLESVFKNIIERDSTPSF-QDDLQYRFKYGAFR  117

Query  149  YLRVGNSGQVSVNFGTSLPASDTSYTQAYRFNLDLSLFPFLAYKKFCQDYFRYSQWQDSS  208
             L +   G+   +FGT+ P + +       +N   S+F  LAY K  QDY+R S +++  
Sbjct  118  LLDLLGYGRKFDSFGTAYPDNVSGLKNNLDYNC--SVFRVLAYNKIYQDYYRNSNYENFD  175

Query  209  PYLWNIDYFTGvsshlfsslpvssDPYWNNNTLFDLEYCNWNKDMFMGVFPDTQFGDVAT  268
               +N D F G                     LF L Y N   D F  +     F  +  
Sbjct  176  TDSFNFDKFKGGLVDAKVVA-----------DLFKLRYRNAQTDYFTNLRQSQLFTFIPE  224

Query  269  IGITSDSPESSLQLKAWasgspsskapvvvgaaasspNFTIRAESGNMNPANILGVDTSS  328
                SD    +     +A  S S+   +         NF +  ++               
Sbjct  225  F---SDDEHLNFDRDQYADQSKSNFTQL---------NFPVDVDNN--------------  258

Query  329  LSLAGSFDVLALRRGEALQRWKEISLNVPQNYRAQIKAHFGVDVGENMSGMSTYVGGDSS  388
                G F V +LR   A+ +   +++   + ++ Q++AH+GV++ ++  G   Y+GG  S
Sbjct  259  ---LGYFSVSSLRSAFAVDKLLSVTMRAGKTFQDQMRAHYGVEIPDSRDGRVNYLGGFDS  315

Query  389  SLDISEVVNTNLQSGDVASE--------AVIAGKGVGSSQGSEKFEARDWGVLMCIYHNV  440
             L +S+V  T   SG  A+E          IAGKG GS +G   F+A++ GVLMCIY  V
Sbjct  316  DLQVSDVTQT---SGTTATEYKPEAGYLGRIAGKGTGSGRGRIVFDAKEHGVLMCIYSLV  372

Query  441  PLLDYVSSAPDPQFFVTQNTDLPIPELDSIGMQSVPVSMYSNSDKELVTGFSSAD---FT  497
            P + Y  +  DP        D   PE +++GMQ +        +   ++ F + D     
Sbjct  373  PQIQYDCTRLDPMVDKLDRFDFFTPEFENLGMQPL--------NSSYISSFCTPDPKNPV  424

Query  498  MGYLPRYYSWKTSYDYVLGAFTTTEKEWVAPITSVIWKRMLIGLTSSSGSFNYNFFKVNP  557
            +GY PRY  +KT+ D   G F   +      ++S  W        ++        FK++P
Sbjct  425  LGYQPRYSEYKTALDINHGQFAQNDA-----LSS--WSVSRFRRWTTFPQLEIADFKIDP  477

Query  558  SILDSIFQANANSKWDTDPFLINCAFDVKVVRNLDYSGMP  597
              L+S+F    N    TD     C F++  V ++   GMP
Sbjct  478  GCLNSVFPVEFNGTESTDCVFGGCNFNIVKVSDMSVDGMP  517


>gi|517172762|ref|WP_018361580.1| hypothetical protein [Prevotella nanceiensis]
Length=568

 Score =   160 bits (405),  Expect = 4e-39, Method: Compositional matrix adjust.
 Identities = 150/600 (25%), Positives = 259/600 (43%), Gaps = 68/600 (11%)

Query  15   RSAFDLSSKVAFSAKSGELLPIKWYFTMPGDKFTLKRQHFTRTQPVNTSAYTRIREYYDW  74
            R+AFD+S +  F+A +G LLP+     +P D   +    F RT P+N++A+  +R  Y++
Sbjct  18   RNAFDISQRHLFTAPAGALLPVLSLDLLPHDHVEINASDFMRTLPMNSAAFMSMRGVYEF  77

Query  75   FWVPLHLLWRHAPEVISQM---QSNVQHAGSQTSSLTLGNYLPTISSSQLSAVC--SRLF  129
            ++VP   LW    + I+ M   +S+  +A         G   P+  S  +  +    +  
Sbjct  78   YFVPYKQLWSGFDQFITGMSDYKSSFMYAFK-------GKTPPSCVSFDVQKLVDWCKTN  130

Query  130  GKKNYFGYDRSDLSYKLMQYLRVGNSGQVSVNFGTSLPASDTSYTQAYRFNLDLSLFPFL  189
              K+  G+D++   Y+++  L  G     +      +P ++ + T   +     + F  L
Sbjct  131  TAKDIHGFDKNKGVYRILDLLGYGKYANSA-----GVPYTNPTSTTMGK----CTPFRGL  181

Query  190  AYKKFCQDYFRYSQWQDSSPYLWNIDYFTGvsshlfsslpvssDPYWNNNTLFDLEYCNW  249
            AY+K   D++R + +++     +N+D F G      +      D  W     F L Y N 
Sbjct  182  AYQKIYNDFYRNTTYEEYQLESFNVDMFYGSGKVKETIPNEPWDYDW-----FTLRYRNA  236

Query  250  NKDMFMGVFPDTQFGDVATIGITSDSPESSLQLKAWasgspsskapvvvgaaasspNFTI  309
             KD+   V P   F       I   +P+        +         V  G      +  I
Sbjct  237  QKDLLTNVRPTPLFS------IDDFNPQF---FTGGSDIVMEKGPNVTGGTHEYRDSVVI  287

Query  310  RAESGNMNPANILGVDTSSLSLAGSFDVLALRRGEALQRWKEISLNVPQNYRAQIKAHFG  369
              ++   N     GVD+    ++    V  +R   AL++   +++   + Y+ Q++AHFG
Sbjct  288  VGKNLKEN-----GVDSKRTMIS----VADIRNAFALEKLASVTMRAGKTYKEQMEAHFG  338

Query  370  VDVGENMSGMSTYVGGDSSSLDISEVVN----TNLQSGDVASEAVIA---GKGVGSSQGS  422
            + V E   G  TY+GG  S++ + +V      T   + D +    +    GK  GS  G 
Sbjct  339  ISVEEGRDGRCTYIGGFDSNIQVGDVTQSSGTTVTGTKDTSFGGYLGRTTGKATGSGSGH  398

Query  423  EKFEARDWGVLMCIYHNVPLLDYVSSAPDPQFFVTQNTDLPIPELDSIGMQ-----SVPV  477
             +F+A++ G+LMCIY  VP + Y S   DP     +  D  +PE +++GMQ     ++  
Sbjct  399  IRFDAKEHGILMCIYSLVPDVQYDSKRVDPFVQKIERGDFFVPEFENLGMQPLFAKNISY  458

Query  478  SMYSNSDKELVTGFSSADFTMGYLPRYYSWKTSYDYVLGAFTTTEKEWVAPITSVIWKRM  537
               +N+    +    +     G+ PRY  +KT+ D   G F   E     P++     R 
Sbjct  459  KYNNNTANSRIKNLGA----FGWQPRYSEYKTALDINHGQFVHQE-----PLSYWTVAR-  508

Query  538  LIGLTSSSGSFNYNFFKVNPSILDSIFQANANSKWDTDPFLINCAFDVKVVRNLDYSGMP  597
                  S  +FN + FK+NP  LD +F  N N    TD     C F++  V ++   GMP
Sbjct  509  --ARGESMSNFNISTFKINPKWLDDVFAVNYNGTELTDQVFGGCYFNIVKVSDMSIDGMP  566



Lambda      K        H        a         alpha
   0.319    0.133    0.410    0.792     4.96 

Gapped
Lambda      K        H        a         alpha    sigma
   0.267   0.0410    0.140     1.90     42.6     43.6 

Effective search space used: 4446915032268