BLASTP 2.2.30+

Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.

Reference for composition-based statistics: Alejandro A. Schaffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.

Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
           49,011,213 sequences; 17,563,301,199 total letters

Query= Contig-32_CDS_annotation_glimmer3.pl_2_4
Length=647

                                                                      Score     E
Sequences producing significant alignments:                           (Bits)  Value

gi|490418709|ref|WP_004291032.1|  hypothetical protein                  356   2e-110
gi|575094354|emb|CDL65742.1|  unnamed protein product                   350   9e-108
gi|496050829|ref|WP_008775336.1|  hypothetical protein                  347   7e-107
gi|494822885|ref|WP_007558293.1|  hypothetical protein                  327   4e-99
gi|575094321|emb|CDL65708.1|  unnamed protein product                   247   1e-68
gi|547226430|ref|WP_021963493.1|  putative uncharacterized protein      241   3e-67
gi|494308783|ref|WP_007173938.1|  hypothetical protein                  154   7e-37
gi|575094339|emb|CDL65730.1|  unnamed protein product                   152   4e-36
gi|575094297|emb|CDL65693.1|  unnamed protein product                   142   2e-32
gi|496521299|ref|WP_009229582.1|  capsid protein                        120   2e-25

>gi|490418709|ref|WP_004291032.1| hypothetical protein [Bacteroides eggerthii]
gi|217986636|gb|EEC52970.1| putative capsid protein (F protein) [Bacteroides eggerthii DSM 20697]
Length=578

 Score = 356 bits (913), Expect = 2e-110, Method: Compositional matrix adjust.
Identities = 242/656 (37%), Positives = 334/656 (51%), Gaps = 90/656 (14%) Query 3 SNLFSFGDVRNHPHRSGFDLSRRICFTSKAGELLPVYYKLVYPGDKFQIRHQLFTRTQPV 62 +N+ S +RN P R+GFDLS + FT+KAGELLPV K V PGD F+I + FTRTQPV Sbjct 2 ANIMSLKSIRNKPSRNGFDLSFKKNFTAKAGELLPVMVKEVLPGDTFKINLKAFTRTQPV 61 Query 63 NTAAYTRIREYLDWYFVPLRLINKNLPQALMNMQDNPVQASGI--VSNKIVTSDIPWTLL 120 NTAA+ RIREY D++FVP L+ L M DNP A I N +++ ++P+ Sbjct 62 NTAAFARIREYYDFFFVPYDLLWNKANTVLTQMYDNPQHAVSIDPTRNFVLSGEMPYM-- 119 Query 121 GHDSKPLGIANLLFRMYKGDSSAQIDPVLNFFGFNSGTLGAKLAMMLRYGNFISKDYSAS 180 IA+ + + +SA D N+FG+N KL L YGN+ S Sbjct 120 ----TSEAIASYINAL--STASALADYKSNYFGYNRSKSSVKLLEYLGYGNYES------ 167 Query 181 TPGDLSFGLSKSDFDLRAYNSSYAVNILPFAVYQKIYADHFRFSQWEKNEPYTYNFDWYS 240 D++ ++ NI YQKIY+D +R SQWE+ P T+N D+ Sbjct 168 --------FLTDDWNTAPLMANLNHNIFGLLAYQKIYSDFYRDSQWERVSPSTFNVDYLD 219 Query 241 GGNVFNVFALGAADSTLKTYLEGNNLFTLRYANWPKDLFMGVMPNSQLGDVSVVDITTDS 300 G ++ L A ST + + N F LRY NW KDLF GV+P+ Q G+ +V IT D Sbjct 220 GSSM----NLDNAYST--EFYQNYNFFDLRYCNWQKDLFHGVLPHQQYGETAVASITPD- 272 Query 301 ATRSYPVSLFSSASGPNGINKLGAAIPDAPFNEATDTPSTYNFSIDFGTEKWDSSIKAKT 360 T +S FS+ G + G A + P A DT Sbjct 273 VTGKLTLSNFSTV-GTSPTTASGTATKNLP---AFDT----------------------- 305 Query 361 WvgvnnvgsgvlgvtvPGSQLAASFSVLQLRMAEAVQKYREVSQVADQTVRDQIYAHFGV 420 S+L LR AE +QK++E++Q ++ +DQ+ H+GV Sbjct 306 ---------------------VGDLSILVLRQAEFLQKWKEITQSGNKDYKDQLEKHWGV 344 Query 421 SLSPALSDTCFRVGGSASNIDISEVVNNNLAGENQADIMgkgvgtgqggTSFSSD-EYGV 479 S+ S+ C +GG +S+IDI+EV+N N+ G ADI GKGVG G +F+S+ YG+ Sbjct 345 SVGDGFSELCTYLGGVSSSIDINEVINTNITGSAAADIAGKGVGVANGEINFNSNGRYGL 404 Query 480 LMAIYHVVPLLDYVITGQPHELLYTNTSDLPFPEFDSIGMQSLHFGRFFNYKSDKFTFDP 539 +M IYH +PLLDY L N++D PEFD +GMQS+ + N + + Sbjct 405 IMCIYHCLPLLDYTTDMLDPAFLKVNSTDYAIPEFDRVGMQSMPLVQLMN--PLRSFANA 462 Query 540 TASVMGYVPRFIDLKTDYDEVYGAFRSTLKSWVAPLDPEYLSKWIDSTVSAGQTYYS--- 596 + V+GYVPR+ID KT D+ G F+ TL SWV + K + A S Sbjct 463 SGLVLGYVPRYIDYKTSVDQSVGGFKRTLNSWVISYGNISVLKQVTLPNDAPPIEPSEPV 522 Query 597 
-----LNYGFFKVNPSVLDSIFKVKADSSMDTDQFLSSLYLDVKAVRNFDYDGMPY 647 +N+ FFKVNP LD IF V+A +TDQFL S + D+KAVRN D DG+PY Sbjct 523 PSVAPMNFTFFKVNPDCLDPIFAVQAGDDTNTDQFLCSSFFDIKAVRNLDTDGLPY 578 >gi|575094354|emb|CDL65742.1| unnamed protein product [uncultured bacterium] Length=615 Score = 350 bits (898), Expect = 9e-108, Method: Compositional matrix adjust. Identities = 238/667 (36%), Positives = 343/667 (51%), Gaps = 77/667 (12%) Query 6 FSFGDVRNHPHRSGFDLSRRICFTSKAGELLPVYYKLVYPGDKFQIRHQLFTRTQPVNTA 65 S D++N P R+GFDLS + FT+KAGELLPV K+V PGD F I + FTRTQP+NT+ Sbjct 1 MSMADIKNRPSRNGFDLSFKKNFTAKAGELLPVMTKVVLPGDSFNINLRSFTRTQPLNTS 60 Query 66 AYTRIREYLDWYFVPLRLINKNLPQALMNMQDNPVQASG--IVSNKIVTSDIPWTLLGHD 123 A+ R+REY D+YFVP + + M N ASG + N ++ +P+ Sbjct 61 AFARMREYYDFYFVPFEQMWNKFDSCITQMNANVQHASGPTLDDNTPLSGRMPYFTSEQ- 119 Query 124 SKPLGIANLLFRMYKGDSSAQIDPVLNFFGFNSGTLGAKLAMMLRYGNFISKDYSAST-- 181 IA+ L ++A+ +P FGFN TL KL L YG++ S D +T Sbjct 120 -----IADYLNDQA---TAARKNP----FGFNRSTLTCKLLQYLGYGDYNSFDSETNTWS 167 Query 182 PGDLSFGLSKSDFDLRAYNSSYAVNILPFAVYQKIYADHFRFSQWEKNEPYTYNFDWYSG 241 L + L S F P YQKIY+D +R++QWEK P T+N D+ G Sbjct 168 AKPLLYNLELSPF--------------PLLAYQKIYSDFYRYTQWEKTNPSTFNLDYIKG 213 Query 242 GNVFNVFALGAADSTLKTYLEGNNLFTLRYANWPKDLFMGVMPNSQLGDVSVVDIT---- 297 + + G + NN F +RY N+ KD+F GV+P +Q G SVV I Sbjct 214 TSDLQMDLTGLPS-------DDNNFFDIRYCNYQKDMFHGVLPVAQYGSASVVPINGQLN 266 Query 298 ------------TDSATRSYPVSLFSSASGPNGINKLGAAIPDAPFNEATDT-PSTYNFS 344 T + P + + + G G++ + + N PS Y F Sbjct 267 VISNGDSGPIFKTSTPDPGTPGTSYVTVGGNIGVDNRSFGVSGSTLNVGKSADPSGYGFP 326 Query 345 IDFGTEKWDSSIKAKTWvgvnnvgsgvlgvtvPGSQLAASFSVLQLRMAEAVQKYREVSQ 404 ++S ++ W N + G VP +L LR AE +QK++EVS Sbjct 327 S-------NASTRSLLWENPNLIIENNQGFYVP---------ILALRQAEFLQKWKEVSV 370 Query 405 VADQTVRDQIYAHFGVSLSPALSDTCFRVGGSASNIDISEVVNNNLAGENQADIMgkgvg 464 ++ + QI H+G+ +S LS +GG A+++DI+EV+NNN+ G+N ADI GKG Sbjct 371 SGEEDYKSQIEKHWGIKVSDFLSHQARYLGGCATSLDINEVINNNITGDNAADIAGKGTF 430 Query 465 
tgqggTSFSSD-EYGVLMAIYHVVPLLDYVITGQPHELLYTNTSDLPFPEFDSIGMQSLH 523 TG G F S EYG++M IYHV+P++DYV +G H + + P PE D IGM+S+ Sbjct 431 TGNGSIRFESKGEYGIIMCIYHVLPIVDYVGSGVDHSCTLVDATSFPIPELDQIGMESVP 490 Query 524 FGRFFNYKSDKFTFDPTA-SVMGYVPRFIDLKTDYDEVYGAFRSTLKSWVAPLDPEYLSK 582 R N + T P+A + +GY PR+ID KT D G F +L++W P+ + L+ Sbjct 491 LVRAMNPVKESDT--PSADTFLGYAPRYIDWKTSVDRSVGDFADSLRTWCLPVGDKELTS 548 Query 583 WIDSTVSAGQTYY--SLNYGFFKVNPSVLDSIFKVKADSSMDTDQFLSSLYLDVKAVRNF 640 + S+ GFFKVNPS++D +F V ADS++ TD+FL S + DVK VRN Sbjct 549 ANSLNFPSNPNVEPDSIAAGFFKVNPSIVDPLFAVVADSTVKTDEFLCSSFFDVKVVRNL 608 Query 641 DYDGMPY 647 D +G+PY Sbjct 609 DVNGLPY 615 >gi|496050829|ref|WP_008775336.1| hypothetical protein [Bacteroides sp. 2_2_4] gi|229448893|gb|EEO54684.1| putative capsid protein (F protein) [Bacteroides sp. 2_2_4] Length=580 Score = 347 bits (889), Expect = 7e-107, Method: Compositional matrix adjust. Identities = 241/660 (37%), Positives = 338/660 (51%), Gaps = 96/660 (15%) Query 3 SNLFSFGDVRNHPHRSGFDLSRRICFTSKAGELLPVYYKLVYPGDKFQIRHQLFTRTQPV 62 +N+ S +RN R+GFDLS + FT+K GELLPV V PGDK+ I + FTRTQP+ Sbjct 2 ANIMSLKSLRNKTSRNGFDLSSKRNFTAKPGELLPVKCWEVLPGDKWSIDLKSFTRTQPL 61 Query 63 NTAAYTRIREYLDWYFVPLRLINKNLPQALMNMQDNPVQASGIV--SNKIVTSDIPWTLL 120 NTAA+ R+REY D+YFVP L+ L M DNP A+ + +N+ + +P Sbjct 62 NTAAFARMREYYDFYFVPYNLLWNKANTVLTQMYDNPQHATSYIPSANQALAGVMP---- 117 Query 121 GHDSKPLGIANLLFRMYKGDSSAQIDPVLNFFGFNSGTLGAKLAMMLRYGNFISKDYSAS 180 + GIA+ L + D + N+FG++ AKL L YGNF Y+ + Sbjct 118 --NVTCKGIADYL-NLVAPDVTTTNSYEKNYFGYSRSLGTAKLLEYLGYGNF----YTYA 170 Query 181 TPGDLSFGLSKSDFDLRAYNSSYAVNILPFAVYQKIYADHFRFSQWEKNEPYTYNFDWYS 240 T + ++ S +L+ +NI YQKIYADH R SQWEK P +N D+ S Sbjct 171 TSKNNTWTKSPLSSNLQ-------LNIYGVLAYQKIYADHIRDSQWEKVSPSCFNVDYLS 223 Query 241 GGNVFNVFALGAADS--TLKTYLEGN------NLFTLRYANWPKDLFMGVMPNSQLGDVS 292 G DS T+ + + G N+F LRY NW KDLF GV+P Q G Sbjct 224 G----------TVDSAMTIDSMITGQGFAPFYNMFDLRYCNWQKDLFHGVLPRQQYG--- 270 Query 293 VVDITTDSATRSYPVSLFSSASGPNGINKLGAAIPDAPFNEATDTPSTYNFSIDFGTEKW 
352 D + S +S P+G + +PF+ T N S Sbjct 271 --DTAAVNVNLSNVLSAQYMVQTPDG-----DPVGGSPFSSTGVNLQTVNGS-------- 315 Query 353 DSSIKAKTWvgvnnvgsgvlgvtvPGSQLAASFSVLQLRMAEAVQKYREVSQVADQTVRD 412 +F+VL LR AE +QK++E++Q ++ +D Sbjct 316 ------------------------------GTFTVLALRQAEFLQKWKEITQSGNKDYKD 345 Query 413 QIYAHFGVSLSPALSDTCFRVGGSASNIDISEVVNNNLAGENQADIMgkgvgtgqggTSF 472 QI H+ VS+ A S+ +GG+ +++DI+EVVNNN+ G N ADI GKGV G G SF Sbjct 346 QIEKHWNVSVGEAYSEMSLYLGGTTASLDINEVVNNNITGSNAADIAGKGVVVGNGRISF 405 Query 473 SSDE-YGVLMAIYHVVPLLDYVITGQPHELLYTNTSDLPFPEFDSIGMQSLHFGRFFNYK 531 + E YG++M IYH +PLLDY N++D PEFD +GM+S+ N Sbjct 406 DAGERYGLIMCIYHSLPLLDYTTDLVNPAFTKINSTDFAIPEFDRVGMESVPLVSLMN-- 463 Query 532 SDKFTFDPTASVMGYVPRFIDLKTDYDEVYGAFRSTLKSWVAPLDPEYLSKWI----DST 587 + +++ +S++GY PR+I KTD D GAF++TLKSWV D + + + D Sbjct 464 PLQSSYNVGSSILGYAPRYISYKTDVDSSVGAFKTTLKSWVMSYDNQSVINQLNYQDDPN 523 Query 588 VSAGQTYYSLNYGFFKVNPSVLDSIFKVKADSSMDTDQFLSSLYLDVKAVRNFDYDGMPY 647 S G +NY FKVNP+ +D +F V A +S+DTDQFL S + DVK VRN D DG+PY Sbjct 524 NSPGTL---VNYTNFKVNPNCVDPLFAVAASNSIDTDQFLCSSFFDVKVVRNLDTDGLPY 580 >gi|494822885|ref|WP_007558293.1| hypothetical protein [Bacteroides plebeius] gi|198272099|gb|EDY96368.1| putative capsid protein (F protein) [Bacteroides plebeius DSM 17135] Length=613 Score = 327 bits (839), Expect = 4e-99, Method: Compositional matrix adjust. 
Identities = 230/661 (35%), Positives = 346/661 (52%), Gaps = 72/661 (11%) Query 3 SNLFSFGDVRNHPHRSGFDLSRRICFTSKAGELLPVYYKLVYPGDKFQIRHQLFTRTQPV 62 +N+ S VRN P R+G+DL+++I FT+KAG L+PV++ V P D + F RTQP+ Sbjct 9 ANIMSMKSVRNKPTRAGYDLTQKINFTAKAGSLIPVWWTPVLPFDDLNATVKSFVRTQPL 68 Query 63 NTAAYTRIREYLDWYFVPLRLINKNLPQALMNMQDNPVQASG--IVSNKIVTSDIPWTLL 120 NTAA+ R+R Y D+YFVP R + P A+ M+ N + ASG + N ++ ++P+ Sbjct 69 NTAAFARMRGYFDFYFVPFRQMWNKFPTAITQMRTNLLHASGPVLADNVPLSDELPYFTA 128 Query 121 GHDSKPLGIANLLFRMYKGDSSAQIDPVLNFFGFNSGTLGAKLAMMLRYGNFISKDYSAS 180 +A+ + + DS Q FG+ L + L YG+F A+ Sbjct 129 EQ------VADYIVSL--ADSKNQ-------FGYYRAWLVCIILEYLGYGDFYPYIVEAA 173 Query 181 TPGDLSFGLSKSDFDLRAYNSSYAVNILPFAVYQKIYADHFRFSQWEKNEPYTYNFDWYS 240 G + + R ++ + P YQKIYAD R++QWE++ P T+N D+ S Sbjct 174 -------GGEGATWATRPMLNNLKFSPFPLFAYQKIYADFNRYTQWERSNPSTFNIDYIS 226 Query 241 GGNVFNVFALGAADS-----TLKTYLEGNNLFTLRYANWPKDLFMGVMPNSQLGDVSVVD 295 G +ADS T++ + + NLF +RY+NW +DL G +P +Q G+ S V Sbjct 227 G----------SADSLQLDFTVEGFKDSFNLFDMRYSNWQRDLLHGTIPQAQYGEASAVP 276 Query 296 ITTDSATRSYPVSLFSSASGPNGINKLGAAIPDAPFNEATDTPSTYNFSIDFGTEKWDSS 355 ++ P + +G +G+ L N S Y Sbjct 277 VSGSMQVVEGPTPP-AFTTGQDGVAFLNG-------NVTIQGSSGY-------------- 314 Query 356 IKAKTWvgvnnvgsgvlgvtvPGSQLAASF--SVLQLRMAEAVQKYREVSQVADQTVRDQ 413 ++A+T VG + + + + +SF S+L LR AEA QK++EV+ +++ Q Sbjct 315 LQAQTSVGESRILRFNNTNSGLIVEGDSSFGVSILALRRAEAAQKWKEVALASEEDYPSQ 374 Query 414 IYAHFGVSLSPALSDTCFRVGGSASNIDISEVVNNNLAGENQADIMgkgvgtgqggTSFS 473 I AH+G S++ A SD C +G ++ I+EVVNNN+ GEN ADI GKG +G G +F+ Sbjct 375 IEAHWGQSVNKAYSDMCQWLGSINIDLSINEVVNNNITGENAADIAGKGTMSGNGSINFN 434 Query 474 -SDEYGVLMAIYHVVPLLDYVITGQPH-ELLYTNTSDLPFPEFDSIGMQSLHFGRFFN-- 529 +YG++M ++HV+P LDY IT PH TN D P PEFD IGM+ + R N Sbjct 435 VGGQYGIVMCVFHVLPQLDY-ITSAPHFGTTLTNVLDFPIPEFDKIGMEQVPVIRGLNPV 493 Query 530 -YKSDKFTFDPTASVMGYVPRFIDLKTDYDEVYGAFRSTLKSWVAPLDPEYL--SKWIDS 586 K F P GY P++ + KT D+ G FR +LK+W+ P D E L + +D Sbjct 494 
KPKDGDFKVSPNL-YFGYAPQYYNWKTTLDKSMGEFRRSLKTWIIPFDDEALLAADSVDF 552 Query 587 TVSAGQTYYSLNYGFFKVNPSVLDSIFKVKADSSMDTDQFLSSLYLDVKAVRNFDYDGMP 646 + S+ GFFKV+PSVLD++F VKA+S ++TDQFL S DV VR+ D +G+P Sbjct 553 PDNPNVEADSVKAGFFKVSPSVLDNLFAVKANSDLNTDQFLCSTLFDVNVVRSLDPNGLP 612 Query 647 Y 647 Y Sbjct 613 Y 613 >gi|575094321|emb|CDL65708.1| unnamed protein product [uncultured bacterium] Length=642 Score = 247 bits (630), Expect = 1e-68, Method: Compositional matrix adjust. Identities = 191/667 (29%), Positives = 307/667 (46%), Gaps = 53/667 (8%) Query 3 SNLFSFGDVRNHPHRSGFDLSRRICFTSKAGELLPVYYKLVYPGDKFQIRHQLFTRTQPV 62 SN+ ++N P R+ FDLS R FT+K GELLP + + + PGD ++ FTRT P+ Sbjct 5 SNIMGLHGLKNKPSRNSFDLSHRNMFTAKVGELLPCFVQELNPGDSVKVSSSYFTRTAPL 64 Query 63 NTAAYTRIREYLDWYFVPLRLINKNLPQALMNMQDNPVQ------ASGIVSNKIVTSDIP 116 + A+TR+RE + ++FVP + K ++NM N AS +V N+ VT+ +P Sbjct 65 QSNAFTRLRENVQYFFVPYSALWKYFDSQVLNMTKNANGGDISRIASSLVGNQKVTTQMP 124 Query 117 WTLLGHDSKPLGIANLLFRMYKGDSSAQIDPVLNFFGFNSGTLGAKLAMMLRYGNFISKD 176 + + + + + R G S + P N G AKL +L YGNF + Sbjct 125 --CVNYKTLHAYLLKFINRSTVG-SDGSVGPEFN-RGCYRHAESAKLLQLLGYGNFPEQF 180 Query 177 YSASTPGDLSFGLSKSDFDLRAYNSSYAVNILPFAVYQKIYADHFRFSQWEKNEPYTYNF 236 + D S +F YN+S ++I Y KI DH+ + QW+ N Sbjct 181 ANFKVNND-KHNQSGQNFKDVTYNNSPYLSIFRLLAYHKICNDHYLYRQWQPYNASLCNV 239 Query 237 DWYSGGNVFNVFALGAA-----DSTLKTYLEGNNLFTLRYANWPKDLFMGVMPNSQLGDV 291 D+ + N ++ ++ A D ++K E NL +R++N P D F GV+P SQ G Sbjct 240 DYLT-PNSSSLLSIDDALLSIPDDSIKA--EKLNLLDMRFSNLPLDYFTGVLPTSQFGSE 296 Query 292 SVVDITTDSATRSYPVSLFSSASGPNGINKLGAAIPDAPFNEATDTPSTYNFSIDFGTEK 351 SVV++ +A+ S L + S +G + + + + + N +D Sbjct 297 SVVNLNLGNASGS--AVLNGTTSKDSG--RWRTTTGEWEMEQRVASSANGNLKLDNSNGT 352 Query 352 WDSSIKAKTWvgvnnvgsgvlgvtvPGSQLAASFSVLQLRMAEAVQKYREVSQVADQTVR 411 + S G + L+ + S++ LR A A QKY+E+ D + Sbjct 353 FIS------------HDHTFSGNVAINTSLSGNLSIIALRNALAAQKYKEIQLANDVDFQ 400 Query 412 DQIYAHFGVSLSPALSDTCFRVGGSASNIDISEVVNNNLAGENQADIMgkgvgtgqggTS 471 Q+ AHFG+ ++ F +GGS+S I+I+E +N NL+G+N+A G G Sbjct 401 
SQVEAHFGIKPDEKNENSLF-IGGSSSMININEQINQNLSGDNKATYGAAPQGNGSASIK 459 Query 472 FSSDEYGVLMAIYHVVPLLDYVITGQPHELLYTNTSDLPFPEFDSIGMQSL--------- 522 F++ YGV++ IY P+LD+ G L T+ SD PE DSIGMQ Sbjct 460 FTAKTYGVVIGIYRCTPVLDFAHLGIDRTLFKTDASDFVIPEMDSIGMQQTFRCEVAAPA 519 Query 523 -HFGRFFNYKSDKFTFDPTASVMGYVPRFIDLKTDYDEVYGAFRSTLKSWVAPLDPEYLS 581 + F ++ + + GY PR+ + KT YD GAF +LKSWV ++ + + Sbjct 520 PYNDEFKAFRVGDGSSPDMSETYGYAPRYSEFKTSYDRYNGAFCHSLKSWVTGINFDAIQ 579 Query 582 KWIDSTVSAGQTYYSLNY-GFFKVNPSVLDSIFKVKADSSMDTDQFLSSLYLDVKAVRNF 640 + T+ +N F P ++ ++F V + ++ D DQ + A RN Sbjct 580 N------NVWNTWAGINAPNMFACRPDIVKNLFLVSSTNNSDDDQLYVGMVNMCYATRNL 633 Query 641 DYDGMPY 647 G+PY Sbjct 634 SRYGLPY 640 >gi|547226430|ref|WP_021963493.1| putative uncharacterized protein [Prevotella sp. CAG:1185] gi|524103382|emb|CCY83994.1| putative uncharacterized protein [Prevotella sp. CAG:1185] Length=573 Score = 241 bits (616), Expect = 3e-67, Method: Compositional matrix adjust. Identities = 130/270 (48%), Positives = 173/270 (64%), Gaps = 2/270 (1%) Query 379 SQLAASFSVLQLRMAEAVQKYREVSQVADQTVRDQIYAHFGVSLSPALSDTCFRVGGSAS 438 S A SVL LR AE +QK+RE++Q + Q+ HF VS S LS C +GG S Sbjct 305 SNTTAGLSVLALRQAECLQKWREIAQSGKMDYQTQMQKHFNVSPSATLSGHCKYLGGWTS 364 Query 439 NIDISEVVNNNLAGENQADIMgkgvgtgq-ggTSFSSDEYGVLMAIYHVVPLLDYVITGQ 497 N+DISEVVN NL G+NQADI GKG GT F S E+G++M IYH +PLLD+ I Sbjct 365 NLDISEVVNTNLTGDNQADIQGKGTGTLNGNKVDFESSEHGIIMCIYHCLPLLDWSINRI 424 Query 498 PHELLYTNTSDLPFPEFDSIGMQSLHFGRFFNYKSDKFTFDPTASVMGYVPRFIDLKTDY 557 + T +D PEFDS+GMQ L+ + + DP++ MGYVPR+ DLKT Sbjct 425 ARQNFKTTFTDYAIPEFDSVGMQQLYPSEMI-FGLEDLPSDPSSINMGYVPRYADLKTSI 483 Query 558 DEVYGAFRSTLKSWVAPLDPEYLSKWIDSTVSAGQTYYSLNYGFFKVNPSVLDSIFKVKA 617 DE++G+F TL SWV+PL Y+S + + AG + ++ Y FFKVNP ++D+IF VKA Sbjct 484 DEIHGSFIDTLVSWVSPLTDSYISAYRQACKDAGFSDITMTYNFFKVNPHIVDNIFGVKA 543 Query 618 DSSMDTDQFLSSLYLDVKAVRNFDYDGMPY 647 DS+++TDQ L + Y D+KAVRNFDY+G+PY Sbjct 544 DSTINTDQLLINSYFDIKAVRNFDYNGLPY 573 Score = 176 bits (445), Expect = 6e-44, Method: 
Compositional matrix adjust. Identities = 115/294 (39%), Positives = 151/294 (51%), Gaps = 31/294 (11%) Query 3 SNLFSFGDVRNHPHRSGFDLSRRICFTSKAGELLPVYYKLVYPGDKFQIRHQLFTRTQPV 62 S++ S ++N R+GFDLS + FT+K GELLP+ K VYPGDKF IR Q FTRTQPV Sbjct 2 SSVMSLTALKNSVKRNGFDLSFKNAFTAKVGELLPIMCKEVYPGDKFNIRGQAFTRTQPV 61 Query 63 NTAAYTRIREYLDWYFVPLRLINKNLPQALMNMQDNPVQASGIVSNKIVTSDIPWTLLGH 122 N+AAY+R+REY D+YFVP RL+ P NM D P A+ +VS+ ++ PW Sbjct 62 NSAAYSRLREYYDFYFVPYRLLWNMAPTFFTNMPD-PHHAADLVSSVNLSQRHPWFTFFD 120 Query 123 DSKPLGIANLLFRMYKGDSSAQIDPVLNFFGFNSGTLGAKLAMMLRYGNFISKDY-SAST 181 + LG N L Y+ NFFGF+ L KL L YG KDY S Sbjct 121 IMEYLGNLNSLSGAYEKYQK-------NFFGFSRVELSVKLLNYLNYG--FGKDYESVKV 171 Query 182 PGDLSFGLSKSDFDLRAYNSSYAVNILPFAVYQKIYADHFRFSQWEKNEPYTYNFDW-YS 240 P D + ++ P YQKI D+FR QW+ PY YN D+ Y Sbjct 172 PSD---------------SDDIVLSPFPLLAYQKICEDYFRDDQWQSAAPYRYNLDYLYG 216 Query 241 GGNVFNVFALGAADSTLKTYLEGNNLFTLRYANWPKDLFMGVMPNSQLGDVSVV 294 + F++ + K +F L Y N+ KD F G++P +Q GDVSV Sbjct 217 KSSGFHIPMSSFTNDAFKN----PTMFDLNYCNFQKDYFTGMLPRAQYGDVSVA 266 >gi|494308783|ref|WP_007173938.1| hypothetical protein [Prevotella bergensis] gi|270333035|gb|EFA43821.1| putative capsid protein (F protein) [Prevotella bergensis DSM 17361] Length=553 Score = 154 bits (389), Expect = 7e-37, Method: Compositional matrix adjust. 
Identities = 168/644 (26%), Positives = 268/644 (42%), Gaps = 111/644 (17%) Query 11 VRNHPHRSGFDLSRRICFTSKAGELLPVYYKLVYPGDKFQIRHQLFTRTQPVNTAAYTRI 70 R + +R+ FDLS+R FT+ AG LLPV + P D +I Q F RT P+NTAA+ + Sbjct 11 TRPNRNRNAFDLSQRHLFTAHAGMLLPVLNLDLIPHDHVEINAQDFMRTLPMNTAAFASM 70 Query 71 REYLDWYFVPLRLINKNLPQALMNMQDNPVQASGIVSNKIVTSDIPWTLLGHDSKPLGIA 130 R +++FVP + Q + M D A+ + +P+ + Sbjct 71 RGVYEFFFVPYHQLWAQFDQFITGMNDFHSSANKSIQGGTSPLQVPY---------FNVD 121 Query 131 NLLFRMYKGDSSAQIDPVLNFFGFNSGTLGAKLAMMLRYGNFISKDYSASTPGDLSFGLS 190 ++ + G S + F G +L +L YG + + P ++S GL Sbjct 122 SVFNSLNTGKESGSGSTDDLQYKFKYGAF--RLLDLLGYGRKFDS-FGTAYPDNVS-GLK 177 Query 191 KSDFDLRAYNSSYAVNILPFAVYQKIYADHFRFSQWEKNEPYTYNFDWYSGGNVFNVFAL 250 N Y ++ Y KIY D++R S +E + ++NFD + GG V Sbjct 178 N--------NLDYNCSVFRILAYNKIYQDYYRNSNYENFDTDSFNFDKFKGGLV------ 223 Query 251 GAADSTLKTYLEGNNLFTLRYANWPKDLFMGVMPNSQLGDVSVVDITTDSATRSYPVSLF 310 D+ + +LF LRY N D F + + LF Sbjct 224 ---DAKVVA-----DLFKLRYRNAQTDYFTNLRQS----------------------QLF 253 Query 311 SSASGPNGINKLGAAIPDAPFNEATDTPSTYNFSIDFGTEKWDSSIKAKTWvgvnnvgsg 370 S + ++ + A D ++ ++ + NF +D + + D Sbjct 254 SFTTAFEDVDNINIAPRDYVKSDGSNF-TRVNFGVDTDSSEGD----------------- 295 Query 371 vlgvtvPGSQLAASFSVLQLRMAEAVQKYREVSQVADQTVRDQIYAHFGVSLSPALSDTC 430 FSV LR A AV K V+ A +T +DQ+ AH+GV + + Sbjct 296 --------------FSVSSLRAAFAVDKLLSVTMRAGKTFQDQMRAHYGVEIPDSRDGRV 341 Query 431 FRVGGSASNIDISEVVNNN--LAGENQAD------IMgkgvgtgqggTSFSSDEYGVLMA 482 +GG S++ +S+V + A E + + + GKG G+G+G F + E+GVLM Sbjct 342 NYLGGFDSDMQVSDVTQTSGTTATEYKPEAGYLGRVAGKGTGSGRGRIVFDAKEHGVLMC 401 Query 483 IYHVVPLLDYVITGQPHELLYTNTSDLPFPEFDSIGMQSLHFGRFFNYKSDKFTFDPTAS 542 IY +VP + Y T + + D PEF+++GMQ L+ +Y S T DP Sbjct 402 IYSLVPQIQYDCTRLDPMVDKLDRFDYFTPEFENLGMQPLN----SSYISSFCTTDPKNP 457 Query 543 VMGYVPRFIDLKTDYDEVYGAFRSTLKSWVAPLDPEYLSKWIDSTVSAGQTYYSLNYGFF 602 V+GY PR+ + KT D +G F + + LS W S T+ L F Sbjct 458 VLGYQPRYSEYKTALDVNHGQFAQS----------DALSSWSVSRFRRWTTFPQLEIADF 507 Query 603 
KVNPSVLDSIFKVKADSSMDTDQFLSSLYLDVKAVRNFDYDGMP 646 K++P L+SIF V + + D ++ V + DGMP Sbjct 508 KIDPGCLNSIFPVDYNGTEANDCVYGGCNFNIVKVSDMSVDGMP 551 >gi|575094339|emb|CDL65730.1| unnamed protein product [uncultured bacterium] Length=588 Score = 152 bits (384), Expect = 4e-36, Method: Compositional matrix adjust. Identities = 160/616 (26%), Positives = 266/616 (43%), Gaps = 105/616 (17%) Query 12 RNHPHRSGFDLSRRICFTSKAGELLPVYYKLVYPGDKFQIRHQLFTRTQPVNTAAYTRIR 71 R + ++GFD+S+R FTS G+LLPV+Y + PGDK +I LFTRTQP+ + A R+ Sbjct 11 RANLSKNGFDMSQRHPFTSSVGQLLPVFYDYLNPGDKIRISANLFTRTQPMKSTAMARLT 70 Query 72 EYLDWYFVPLRLINKNLPQALMNMQDNPVQASGIVSNKIVTSDIPWTLLGHDSKPLGIAN 131 E+++++FVP + + D +S +V + +T +P+ K ++ Sbjct 71 EHIEYFFVPFEQMFSLFGSVFYGIDD--YNSSSLVKHNNLT--MPFF------KSDAVSA 120 Query 132 LLFRMYKGDSSAQIDPVL--NFFGFNSGTLGAKLAMMLRYGNFISKDYSASTP-GDLSFG 188 L Y SS+ VL + G +L+ ML YG+ + + + P D+S Sbjct 121 ALEAAYTSFSSSINRKVLTPDMMGQPRVYGILRLSEMLGYGSLLLSNDNNLLPHADMS-- 178 Query 189 LSKSDFDLRAYNSSYAVNILPFAVYQKIYADHFRFSQWEKNEPYTYNFDWYSGGNVFNVF 248 + F YQKI+ D +R + + +YN D+ G + + Sbjct 179 ------------------VFLFTAYQKIFNDFYRLDDYTSVQHKSYNVDYAQGQPITD-- 218 Query 249 ALGAADSTLKTYLEGNNLFTLRYANWPKDLFMGVMPNSQLGDVSVVDITTDSATRSYPVS 308 N++F L Y W KD F V+PN Sbjct 219 ---------------NSMFELHYRPWKKDYFTNVIPNP---------------------- 241 Query 309 LFSSASGPNGINKLGAAIPDAPFNEATDTPSTYNFSIDFGTEKWDSSIKAKTWvgvnnvg 368 FSS + GA + D P + +++NF G++ + T + Sbjct 242 YFSSVDNKSSFG--GAGLFDRPVGLSI---TSFNFD---GSDFLQAPSDLSTMENNQPIF 293 Query 369 sgvlgvtvPGSQLAASFSVLQLRMAEAVQKYREVSQVADQTVRDQIYAHFGVSLSPALSD 428 + S +A SV LR A K ++Q A + Q AHFG + +S Sbjct 294 QELPVNLTSAS--SAGLSVSDLRYLYATDKLLRITQFAGKHYDAQTLAHFGKRVPQGVSG 351 Query 429 TCFRVGGSASNIDISEVVNNNL---AGENQADIMgkgvgtgqggT------SFSSDEYGV 479 + +GG + + IS V + +G+ ++G+ G G T SF + +GV Sbjct 352 EVYYIGGQSQPLQISSVESTATTFDSGDVVGSVLGELAGKGYSQTGNQKDFSFEAPCHGV 411 Query 480 LMAIYHVVPLLDYVITGQPHELLYTNTSDLPFPEFDSIGMQSLHFGRFFNYKSDKFTFDP 539 LMAIY VP DY+ + ++D PEFDS+GM+ F NY+ D++ Sbjct 412 
LMAIYSAVPEADYLDERIDYLNTLIQSNDFYKPEFDSLGMEP-----FPNYELDQYRMVG 466 Query 540 TASVMGYVPRFIDLKTDYDEVYGAFRSTLKSWVAPLDPEYLSKWIDSTVSAGQTYYSLNY 599 S +G+ R+ LK+ D + GAF+ TL+ WVA + DS + ++++ + Sbjct 467 NNSRLGWRYRYSGLKSKPDLISGAFKYTLRDWVAVRN--------DSRYAEDESWWQ-SA 517 Query 600 GFFKVNPSVLDSIFKV 615 F ++P+ LD+IF++ Sbjct 518 AFMYIDPAYLDNIFEL 533 >gi|575094297|emb|CDL65693.1| unnamed protein product [uncultured bacterium] Length=630 Score = 142 bits (358), Expect = 2e-32, Method: Compositional matrix adjust. Identities = 169/640 (26%), Positives = 260/640 (41%), Gaps = 92/640 (14%) Query 3 SNLFSFGDVRNHPHRSGFDLSRRICFTSKAGELLPVYYKLVYPGDKFQIRHQLFTRTQPV 62 +NLF D R+ FD+S+ + FTS G+LLPV+Y ++ PGDK I+ T+TQP+ Sbjct 4 ANLFKRPDHVAKLGRNVFDMSQTLGFTSSVGQLLPVFYDVLNPGDKISIKSLFVTKTQPM 63 Query 63 NTAAYTRIREYLDWYFVPLRLINKNLPQALMNMQDNPVQASGIVSNKIVTSDIPWT---L 119 + + ++ E +D++FVP I + D S + S K D+ T L Sbjct 64 QSDNFAKVTENVDYFFVPFEQIYSLFGSFFYQIADF---NSSLFSKKGGALDLTSTHLPL 120 Query 120 LGHDSKPLGIANLLFRMYKGDSSAQIDP--VLNFFGFNSGTLGAKLAMMLRYGNFISKDY 177 D + + + +Y D I P L+ +G + +L + N+ + D Sbjct 121 ASFDGLSYELFSSQYDIYSDDDDHIIFPNNTLDEYGVPNYFNHLRLMQLFGMSNYFTSD- 179 Query 178 SASTPGDLSFGLSKSDFDLRAYNSSYAVNILPFAVYQKIYADHFRFSQWEKNEPYTYNFD 237 AS P K +L LP A YQKI+ D++R W +P +YN D Sbjct 180 -ASQPDQF-----KPSINL----------FLPLA-YQKIFNDYYRLDDWTAPDPTSYNID 222 Query 238 WYSGGNVFNVFALGAADSTLKTYLEGNNLFTLRYANWPKDLFMGVMPNSQLGDVSVVDIT 297 + F+ AD +Y ++F LRY W KD + + N D Sbjct 223 -----SSFD------ADIIRTSYYR--SIFKLRYRPWKKDYYTNLSRNPYFNASYNAD-- 267 Query 298 TDSATRSYPVSLFSSASGPNGINKLGAAIPDAPF--NEATDTPSTYNFSID--FGTEKWD 353 A GPNG+ L + P+ + D P N + G E Sbjct 268 --------------GAYGPNGMQSLSSLATALPYDTDSVKDNPLVENLGLSKPVGDESEA 313 Query 354 SSIKAKTWvgvnnvgsgvlgvtvPGSQLAASFSVLQLRMAEAVQKYREVSQVADQTVRDQ 413 +IK + +G + Q + +V QLR A K ++Q A + Q Sbjct 314 VTIK-QGIPRSLPFYAGYDSPYLSQEQGIETLNVSQLRALYATDKLLRITQFAGKHYDAQ 372 Query 414 IYAHFGVSLSPALSDTCFRVGGSASNIDISEVV----------NNNLAGENQADIMgkgv 463 AHFG + +S + +GG + + IS + ++ + GE A V Sbjct 373 
TLAHFGKKVPQGVSGEVYYLGGQSQRLQISPITALSSGQTSDGSDTVFGEQGA--RAASV 430 Query 464 gtgqggTSFSSDEYGVLMAIYHVVPLLDYVITG--QPHELLYTNTSDLPFPEFDSIGMQS 521 GQ +F + +G+LMAIY VP +Y + + L Y+N D PE D+IGM Sbjct 431 TQGQKPFTFEAPCHGILMAIYSAVPEANYSCDAIDRINTLAYSN--DFYKPELDNIGMSP 488 Query 522 LHFGRF-------FNYKSDKFTFDPTASVMGYVPRFIDLKTDYDEVYGAFRSTLKSWVAP 574 L+ F F ++ D A +G+ R+ KT D GA TL+SW Sbjct 489 LYSYEFSVPGYTLFRNPPTPYSSDDAAQSLGWQFRYSWFKTKVDRTCGALNRTLRSWCPK 548 Query 575 LDPEYLSKWIDSTVSAGQTYYSLNYG-FFKVNPSVLDSIF 613 D YL+ + S NY + V+PS LD +F Sbjct 549 RD--YLALGLQSRPQL------FNYASLYYVSPSYLDGLF 580 >gi|496521299|ref|WP_009229582.1| capsid protein [Prevotella sp. oral taxon 317] gi|288330570|gb|EFC69154.1| putative capsid protein (F protein) [Prevotella sp. oral taxon 317 str. F0108] Length=541 Score = 120 bits (300), Expect = 2e-25, Method: Compositional matrix adjust. Identities = 177/644 (27%), Positives = 262/644 (41%), Gaps = 129/644 (20%) Query 13 NHPHRSGFDLSRRICFTSKAGELLPVYYKLVYPGDKFQIRHQLFTRTQPVNTAAYTRIRE 72 N P RS FDLS++ +T+ AG LLPV + D +I+ Q F RT P+N+AA+ +R Sbjct 15 NRP-RSAFDLSQKHLYTAPAGALLPVLSVDLMFHDHIRIQAQDFMRTMPMNSAAFISMRG 73 Query 73 YLDWYFVPLRLINKNLPQALMNMQDNPVQASGIVSNKIVTSDIPWTLLGHDSKPLGIANL 132 +++FVP + Q + +M D S +VS+ + DS P Sbjct 74 VYEFFFVPYSQLWHPYDQFITSMND---YRSSVVSSAAGDKAL-------DSVPNVKLAD 123 Query 133 LFRMYKGDSSAQIDPVLNFFGFNSGTLGAKLAMMLRYGNFISKDYSASTPGDLSFGLSKS 192 +++ + + I FG+ +L +L YG I+ S+ TP L + + + Sbjct 124 MYKFVRERTDKDI------FGYPHSNNSCRLMDLLGYGKPIT---SSKTPVPLLYTGNVN 174 Query 193 DFDLRAYNSSYAVNILPFAVYQKIYADHFRFSQWEKNEPYTYNFDWYSGGNVFNVFALGA 252 F L AYN KIY+D++R + +E + Y++N D G V Sbjct 175 LFRLLAYN--------------KIYSDYYRNTTYEGVDVYSFNIDHKKGTFV------PT 214 Query 253 ADSTLKTYLEGNNLFTLRYANWPKDLFMGVMPNSQLGDVSVVDITTDSATRSYPVSLFSS 312 AD K YL L Y N P D + + P + I +DS FSS Sbjct 215 ADE-FKKYLN------LHYRNAPLDFYTNLRP------TPLFTIGSDS---------FSS 252 Query 313 A---SGPNGINKLGAAIPDAPFNEATDTPSTYNFSI---DFGTEKWDS-SIKA-KTWvgv 364 S P G A A N A +P N S F +K S S++A KT+ Sbjct 253 
VLQLSDPTGSAGFSADGNSAKLNMA--SPDVLNVSAIRSAFALDKLLSISMRAGKTY--- 307 Query 365 NnvgsgvlgvtvPGSQLAASFSVLQLRMAEAVQKYREVSQVADQTVRDQIYAHFGVSLSP 424 Q+ A F V VS+ D Q+Y G + Sbjct 308 -------------AEQIEAHFGV-------------TVSEGRD----GQVYYLGGFDSNV 337 Query 425 ALSDTCFRVGGSASNIDISEVVNNNLAGENQADIMgkgvgtgqggTSFSSDEYGVLMAIY 484 + D G +N ++SEV N LAG I GKG G+G G F + E GVLM IY Sbjct 338 QVGDVTQTSG--TTNPNVSEVGNAKLAGY-LGKITGKGTGSGYGEIQFDAKEPGVLMCIY 394 Query 485 HVVPLLDYVITGQPHELLYTNTSDLPFPEFDSIGMQSL--HFGRFFNYKSDKFTFDPTAS 542 VVP + Y + D PEF+++GMQ + F K + + Sbjct 395 SVVPAMQYDCMRLDPFVAKQTRGDYFIPEFENLGMQPIVPAFVSLNRAKDNSY------- 447 Query 543 VMGYVPRFIDLKTDYDEVYGAFRSTLKSWVAPLDPEYLSKWIDSTVSAGQTYYSLNYGFF 602 G+ PR+ + KT +D +G F + E LS W + T + N Sbjct 448 --GWQPRYSEYKTAFDINHGQF----------ANGEPLSYWSIARARGSDTLNTFNVAAL 495 Query 603 KVNPSVLDSIFKVKADSSMDTDQFLSSLYLDVKAVRNFDYDGMP 646 K+NP LDS+F V + + TD + +++ V + DGMP Sbjct 496 KINPHWLDSVFAVNYNGTEVTDCMFGYAHFNIEKVSDMTEDGMP 539

Lambda      K        H        a         alpha
   0.319    0.135    0.407    0.792     4.96

Gapped
Lambda      K        H        a         alpha    sigma
   0.267   0.0410    0.140     1.90     42.6     43.6

Effective search space used: 4903549086528