bitscore colors: <40, 40-50, 50-80, 80-200, >200
BLASTP 2.2.30+

Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A. Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402.

Reference for composition-based statistics: Alejandro A. Schaffer, L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001), "Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements", Nucleic Acids Res. 29:2994-3005.

Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF excluding environmental samples from WGS projects
           49,011,213 sequences; 17,563,301,199 total letters

Query= Contig-22_CDS_annotation_glimmer3.pl_2_4

Length=626
                                                                    Score    E
Sequences producing significant alignments:                        (Bits)  Value

gi|575094354|emb|CDL65742.1|  unnamed protein product               367     2e-114
gi|490418709|ref|WP_004291032.1|  hypothetical protein              354     7e-110
gi|494822885|ref|WP_007558293.1|  hypothetical protein              333     2e-101
gi|496050829|ref|WP_008775336.1|  hypothetical protein              275     5e-80
gi|575094321|emb|CDL65708.1|  unnamed protein product               225     5e-61
gi|547226430|ref|WP_021963493.1|  putative uncharacterized protein  206     1e-54
gi|575094339|emb|CDL65730.1|  unnamed protein product               184     7e-47
gi|575094297|emb|CDL65693.1|  unnamed protein product               128     9e-28
gi|649555287|gb|KDS61824.1|  capsid family protein                  124     1e-26
gi|565841287|ref|WP_023924568.1|  hypothetical protein              121     1e-25

>gi|575094354|emb|CDL65742.1| unnamed protein product [uncultured bacterium]
Length=615

 Score = 367 bits (941),  Expect = 2e-114, Method: Compositional matrix adjust.
Identities = 238/652 (37%), Positives = 344/652 (53%), Gaps = 69/652 (11%) Query 6 SYGDIKNTPRRSGFDLSNKCAFTAKVGELLPVYWKFCLPADKFNISQEWFARTQPVDTSA 65 S DIKN P R+GFDLS K FTAK GELLPV K LP D FNI+ F RTQP++TSA Sbjct 2 SMADIKNRPSRNGFDLSFKKNFTAKAGELLPVMTKVVLPGDSFNINLRSFTRTQPLNTSA 61 Query 66 FTRIREYYEWFFVPLHLLYRNSNEAIMSMENQPNYAASSSA--SISFNRNLPWIDLATIN 123 F R+REYY+++FVP ++ + I M +A+ + + + +P+ I Sbjct 62 FARMREYYDFYFVPFEQMWNKFDSCITQMNANVQHASGPTLDDNTPLSGRMPYFTSEQIA 121 Query 124 TAIGNVQSSASPNNFFGVLRSEGFKKLVSYLGYGET----------SPEKYVDNLRCSAL 173 + N Q++A+ N FG RS KL+ YLGYG+ S + + NL S Sbjct 122 DYL-NDQATAARKNPFGFNRSTLTCKLLQYLGYGDYNSFDSETNTWSAKPLLYNLELSPF 180 Query 174 PLYAYQKIYQDYYRHSQWEKSKPWTYNCDFWNGEDSTPVASSLDLFSQNPND-SVFELRY 232 PL AYQKIY D+YR++QWEK+ P T+N D+ G +DL +D + F++RY Sbjct 181 PLLAYQKIYSDFYRYTQWEKTNPSTFNLDYIKGTSDL----QMDLTGLPSDDNNFFDIRY 236 Query 233 ANWNKDLWMGSLPNSQFGDVAAVSLGLDASTMKIGVTGTADVSGMMGVVYGDVNGYASDY 292 N+ KD++ G LP +Q+G + V ++G + V+ +G Sbjct 237 CNYQKDMFHGVLPVAQYGSASVVP-----------------INGQLNVISNGDSGPIFKT 279 Query 293 AAGIRDGGINGAPDNGQTATAYPS--GNLPSDYPYFYAKGSSKTPVGSIANPAHISGSDL 350 + PD G T+Y + GN+ D F GS+ VG A+P+ G Sbjct 280 S----------TPDPGTPGTSYVTVGGNIGVDNRSFGVSGSTLN-VGKSADPSGY-GFPS 327 Query 351 NAQVSGQL----------NAQF--SVLQLRAAEALQKWKEIAQANGQNYAAQVKAHFGVS 398 NA L N F +L LR AE LQKWKE++ + ++Y +Q++ H+G+ Sbjct 328 NASTRSLLWENPNLIIENNQGFYVPILALRQAEFLQKWKEVSVSGEEDYKSQIEKHWGIK 387 Query 399 TNPMQAHRSTRICGFDGSIDISAVENTNLTSDEAI-IRGKG--LGGQRINDPSNFTCTEH 455 + +H++ + G S+DI+ V N N+T D A I GKG G I S E+ Sbjct 388 VSDFLSHQARYLGGCATSLDINEVINNNITGDNAADIAGKGTFTGNGSIRFESK---GEY 444 Query 456 GIIMCIYHATPLLDYVPTGPDLQLMTTVKGESFPVPEFDSLGMESLPMLSLVNSKAIGDV 515 GIIMCIYH P++DYV +G D T V SFP+PE D +GMES+P++ +N D Sbjct 445 GIIMCIYHVLPIVDYVGSGVD-HSCTLVDATSFPIPELDQIGMESVPLVRAMNPVKESDT 503 Query 516 -VARSYAGYVPRYISWKTSTDVVRGAFTDTLKSWVAPVDLDYMKAFFARNTDDSTIAENV 574 A ++ GY PRYI WKTS D G F D+L++W PV + + + N + E Sbjct 504 PSADTFLGYAPRYIDWKTSVDRSVGDFADSLRTWCLPVGDKELTSANSLNFPSNPNVEPD 
563 Query 575 LLTYSWFKINPSVLNPIFGVAVDSSWNTDQLLCNCQFNVKVARNLSYDGMPY 626 + +FK+NPS+++P+F V DS+ TD+ LC+ F+VKV RNL +G+PY Sbjct 564 SIAAGFFKVNPSIVDPLFAVVADSTVKTDEFLCSSFFDVKVVRNLDVNGLPY 615 >gi|490418709|ref|WP_004291032.1| hypothetical protein [Bacteroides eggerthii] gi|217986636|gb|EEC52970.1| putative capsid protein (F protein) [Bacteroides eggerthii DSM 20697] Length=578 Score = 354 bits (908), Expect = 7e-110, Method: Compositional matrix adjust. Identities = 244/659 (37%), Positives = 337/659 (51%), Gaps = 114/659 (17%) Query 1 MSSLFSYGDIKNTPRRSGFDLSNKCAFTAKVGELLPVYWKFCLPADKFNISQEWFARTQP 60 M+++ S I+N P R+GFDLS K FTAK GELLPV K LP D F I+ + F RTQP Sbjct 1 MANIMSLKSIRNKPSRNGFDLSFKKNFTAKAGELLPVMVKEVLPGDTFKINLKAFTRTQP 60 Query 61 VDTSAFTRIREYYEWFFVPLHLLYRNSNEAIMSMENQPNYAASSSASISF--NRNLPWID 118 V+T+AF RIREYY++FFVP LL+ +N + M + P +A S + +F + +P++ Sbjct 61 VNTAAFARIREYYDFFFVPYDLLWNKANTVLTQMYDNPQHAVSIDPTRNFVLSGEMPYMT 120 Query 119 LATINTAIGNVQSSAS-----PNNFFGVLRSEGFKKLVSYLGYG----------ETSPEK 163 I + I N S+AS +N+FG RS+ KL+ YLGYG T+P Sbjct 121 SEAIASYI-NALSTASALADYKSNYFGYNRSKSSVKLLEYLGYGNYESFLTDDWNTAP-- 177 Query 164 YVDNLRCSALPLYAYQKIYQDYYRHSQWEKSKPWTYNCDFWNGEDSTPVASSLDLFSQNP 223 + NL + L AYQKIY D+YR SQWE+ P T+N D+ +G + F QN Sbjct 178 LMANLNHNIFGLLAYQKIYSDFYRDSQWERVSPSTFNVDYLDGSSMNLDNAYSTEFYQNY 237 Query 224 NDSVFELRYANWNKDLWMGSLPNSQFGDVAAVSLGLDASTMKIGVTGTADVSGMMGVVYG 283 N F+LRY NW KDL+ G LP+ Q+G+ A S+ T DV+G + + Sbjct 238 N--FFDLRYCNWQKDLFHGVLPHQQYGETAVASI-------------TPDVTGKLTL--- 279 Query 284 DVNGYASDYAAGIRDGGINGAPDNGQTATAYPSGNLPSDYPYFYAKGSSKTPVGSIANPA 343 S+++ + +P TA+ + NL PA Sbjct 280 ------SNFST------VGTSP---TTASGTATKNL----------------------PA 302 Query 344 HISGSDLNAQVSGQLNAQFSVLQLRAAEALQKWKEIAQANGQNYAAQVKAHFGVSTNPMQ 403 + DL S+L LR AE LQKWKEI Q+ ++Y Q++ H+GVS Sbjct 303 FDTVGDL------------SILVLRQAEFLQKWKEITQSGNKDYKDQLEKHWGVSVGDGF 350 Query 404 AHRSTRICGFDGSIDISAVENTNLT-SDEAIIRGKGLG--GQRINDPSNFTCTEHGIIMC 460 + T + G SIDI+ V NTN+T S A I GKG+G IN SN +G+IMC Sbjct 351 
SELCTYLGGVSSSIDINEVINTNITGSAAADIAGKGVGVANGEINFNSN---GRYGLIMC 407 Query 461 IYHATPLLDYVPTGPDLQLMTTVKGESFPVPEFDSLGMESLPMLSLVNSKAIGDVVARSY 520 IYH PLLDY D + V + +PEFD +GM+S+P++ L+N RS+ Sbjct 408 IYHCLPLLDYTTDMLDPAFL-KVNSTDYAIPEFDRVGMQSMPLVQLMNP-------LRSF 459 Query 521 A-------GYVPRYISWKTSTDVVRGAFTDTLKSWVAPV-DLDYMKAFFARN-----TDD 567 A GYVPRYI +KTS D G F TL SWV ++ +K N Sbjct 460 ANASGLVLGYVPRYIDYKTSVDQSVGGFKRTLNSWVISYGNISVLKQVTLPNDAPPIEPS 519 Query 568 STIAENVLLTYSWFKINPSVLNPIFGVAVDSSWNTDQLLCNCQFNVKVARNLSYDGMPY 626 + + +++FK+NP L+PIF V NTDQ LC+ F++K RNL DG+PY Sbjct 520 EPVPSVAPMNFTFFKVNPDCLDPIFAVQAGDDTNTDQFLCSSFFDIKAVRNLDTDGLPY 578 >gi|494822885|ref|WP_007558293.1| hypothetical protein [Bacteroides plebeius] gi|198272099|gb|EDY96368.1| putative capsid protein (F protein) [Bacteroides plebeius DSM 17135] Length=613 Score = 333 bits (854), Expect = 2e-101, Method: Compositional matrix adjust. Identities = 214/651 (33%), Positives = 340/651 (52%), Gaps = 70/651 (11%) Query 1 MSSLFSYGDIKNTPRRSGFDLSNKCAFTAKVGELLPVYWKFCLPADKFNISQEWFARTQP 60 M+++ S ++N P R+G+DL+ K FTAK G L+PV+W LP D N + + F RTQP Sbjct 8 MANIMSMKSVRNKPTRAGYDLTQKINFTAKAGSLIPVWWTPVLPFDDLNATVKSFVRTQP 67 Query 61 VDTSAFTRIREYYEWFFVPLHLLYRNSNEAIMSMENQPNYAASS--SASISFNRNLPWID 118 ++T+AF R+R Y++++FVP ++ AI M +A+ + ++ + LP+ Sbjct 68 LNTAAFARMRGYFDFYFVPFRQMWNKFPTAITQMRTNLLHASGPVLADNVPLSDELPYF- 126 Query 119 LATINTAIGNVQSSASPNNFFGVLRSEGFKKLVSYLGYGETSP---------------EK 163 T + S A N FG R+ ++ YLGYG+ P Sbjct 127 --TAEQVADYIVSLADSKNQFGYYRAWLVCIILEYLGYGDFYPYIVEAAGGEGATWATRP 184 Query 164 YVDNLRCSALPLYAYQKIYQDYYRHSQWEKSKPWTYNCDFWNGE-DSTPVASSLDLFSQN 222 ++NL+ S PL+AYQKIY D+ R++QWE+S P T+N D+ +G DS + +++ F + Sbjct 185 MLNNLKFSPFPLFAYQKIYADFNRYTQWERSNPSTFNIDYISGSADSLQLDFTVEGFKDS 244 Query 223 PNDSVFELRYANWNKDLWMGSLPNSQFGDVAAVSLGLDASTMKIGVTGTADVSGMMGVVY 282 N +F++RY+NW +DL G++P +Q+G+ +AV VSG M VV Sbjct 245 FN--LFDMRYSNWQRDLLHGTIPQAQYGEASAVP-----------------VSGSMQVV- 284 Query 283 
GDVNGYASDYAAGIRDGGINGAPDNGQTATAYPSGNLPSDYPYFYAKGSSKTPVGSIANP 342 +G A GQ A+ +GN+ Y + ++T VG + Sbjct 285 ---------------EGPTPPAFTTGQDGVAFLNGNVTIQGSSGYLQ--AQTSVGE-SRI 326 Query 343 AHISGSDLNAQVSGQLNAQFSVLQLRAAEALQKWKEIAQANGQNYAAQVKAHFGVSTNPM 402 + ++ V G + S+L LR AEA QKWKE+A A+ ++Y +Q++AH+G S N Sbjct 327 LRFNNTNSGLIVEGDSSFGVSILALRRAEAAQKWKEVALASEEDYPSQIEAHWGQSVNKA 386 Query 403 QAHRSTRICGFDGSIDISAVENTNLTSDEAI-IRGKG-LGGQRINDPSNFTCT-EHGIIM 459 + + + + I+ V N N+T + A I GKG + G N NF ++GI+M Sbjct 387 YSDMCQWLGSINIDLSINEVVNNNITGENAADIAGKGTMSG---NGSINFNVGGQYGIVM 443 Query 460 CIYHATPLLDYVPTGPDLQLMTTVKGESFPVPEFDSLGMESLPMLSLVNSKAIGD----V 515 C++H P LDY+ + P T FP+PEFD +GME +P++ +N D V Sbjct 444 CVFHVLPQLDYITSAPHFG-TTLTNVLDFPIPEFDKIGMEQVPVIRGLNPVKPKDGDFKV 502 Query 516 VARSYAGYVPRYISWKTSTDVVRGAFTDTLKSWVAPVDLDYMKAFFARNTDDSTIAENVL 575 Y GY P+Y +WKT+ D G F +LK+W+ P D + + A + + D+ E Sbjct 503 SPNLYFGYAPQYYNWKTTLDKSMGEFRRSLKTWIIPFDDEALLAADSVDFPDNPNVEADS 562 Query 576 LTYSWFKINPSVLNPIFGVAVDSSWNTDQLLCNCQFNVKVARNLSYDGMPY 626 + +FK++PSVL+ +F V +S NTDQ LC+ F+V V R+L +G+PY Sbjct 563 VKAGFFKVSPSVLDNLFAVKANSDLNTDQFLCSTLFDVNVVRSLDPNGLPY 613 >gi|496050829|ref|WP_008775336.1| hypothetical protein [Bacteroides sp. 2_2_4] gi|229448893|gb|EEO54684.1| putative capsid protein (F protein) [Bacteroides sp. 2_2_4] Length=580 Score = 275 bits (703), Expect = 5e-80, Method: Compositional matrix adjust. 
Identities = 218/641 (34%), Positives = 327/641 (51%), Gaps = 76/641 (12%) Query 1 MSSLFSYGDIKNTPRRSGFDLSNKCAFTAKVGELLPVYWKFCLPADKFNISQEWFARTQP 60 M+++ S ++N R+GFDLS+K FTAK GELLPV LP DK++I + F RTQP Sbjct 1 MANIMSLKSLRNKTSRNGFDLSSKRNFTAKPGELLPVKCWEVLPGDKWSIDLKSFTRTQP 60 Query 61 VDTSAFTRIREYYEWFFVPLHLLYRNSNEAIMSMENQPNYAASSSASISFNRNLPWIDLA 120 ++T+AF R+REYY+++FVP +LL+ +N + M + P +A +S S N+ L + Sbjct 61 LNTAAFARMREYYDFYFVPYNLLWNKANTVLTQMYDNPQHA--TSYIPSANQALAGVMPN 118 Query 121 TINTAIGNVQSSASPNNFFGVLRSEGFKKLVSYLGYGETSPEKYVDNLRCSALPLYAYQK 180 I + + +P+ V + ++K +Y GY + L + L Y Sbjct 119 VTCKGIADYLNLVAPD----VTTTNSYEK--NYFGYSRS--------LGTAKLLEYLG-- 162 Query 181 IYQDYYRHSQWEKSKPWTYNCDFWNGEDSTPVASSLDLFSQNPNDSVFELRYANWNKDLW 240 Y ++Y ++ K+ WT +P++S+L L + L Y + ++ Sbjct 163 -YGNFYTYAT-SKNNTWT----------KSPLSSNLQL------NIYGVLAY----QKIY 200 Query 241 MGSLPNSQFGDVAAVSLGLDASTMKIGVTGTADVSGMMGVVYGDV-NGYASDYAAGIRDG 299 + +SQ+ V+ +D + + T D S + G + N + Y +D Sbjct 201 ADHIRDSQWEKVSPSCFNVDYLSGTVDSAMTID-SMITGQGFAPFYNMFDLRYCNWQKDL 259 Query 300 GINGAPDNGQTATAYPSGNLPSDYPYFYAKGSSKTPVGSIANPAHISGSDLNAQ-VSGQL 358 P TA + NL + A+ +TP G + S + +N Q V+G Sbjct 260 FHGVLPRQQYGDTAAVNVNLSN---VLSAQYMVQTPDGDPVGGSPFSSTGVNLQTVNG-- 314 Query 359 NAQFSVLQLRAAEALQKWKEIAQANGQNYAAQVKAHFGVSTNPMQAHRSTRICGFDGSID 418 + F+VL LR AE LQKWKEI Q+ ++Y Q++ H+ VS + S + G S+D Sbjct 315 SGTFTVLALRQAEFLQKWKEITQSGNKDYKDQIEKHWNVSVGEAYSEMSLYLGGTTASLD 374 Query 419 ISAVENTNLT-SDEAIIRGKGL--GGQRINDPSNFTCTE-HGIIMCIYHATPLLDYVPTG 474 I+ V N N+T S+ A I GKG+ G RI+ F E +G+IMCIYH+ PLLDY Sbjct 375 INEVVNNNITGSNAADIAGKGVVVGNGRIS----FDAGERYGLIMCIYHSLPLLDYTT-- 428 Query 475 PDL--QLMTTVKGESFPVPEFDSLGMESLPMLSLVNSKAIGDVVARSYAGYVPRYISWKT 532 DL T + F +PEFD +GMES+P++SL+N V S GY PRYIS+KT Sbjct 429 -DLVNPAFTKINSTDFAIPEFDRVGMESVPLVSLMNPLQSSYNVGSSILGYAPRYISYKT 487 Query 533 STDVVRGAFTDTLKSWVAPVD-------LDYMKAFFARNTDDSTIAENVLLTYSWFKINP 585 D GAF TLKSWV D L+Y DD + L+ Y+ FK+NP Sbjct 488 DVDSSVGAFKTTLKSWVMSYDNQSVINQLNYQ--------DDPNNSPGTLVNYTNFKVNP 539 Query 
586 SVLNPIFGVAVDSSWNTDQLLCNCQFNVKVARNLSYDGMPY 626 + ++P+F VA +S +TDQ LC+ F+VKV RNL DG+PY Sbjct 540 NCVDPLFAVAASNSIDTDQFLCSSFFDVKVVRNLDTDGLPY 580 >gi|575094321|emb|CDL65708.1| unnamed protein product [uncultured bacterium] Length=642 Score = 225 bits (574), Expect = 5e-61, Method: Compositional matrix adjust. Identities = 205/694 (30%), Positives = 304/694 (44%), Gaps = 127/694 (18%) Query 2 SSLFSYGDIKNTPRRSGFDLSNKCAFTAKVGELLPVYWKFCLPADKFNISQEWFARTQPV 61 S++ +KN P R+ FDLS++ FTAKVGELLP + + P D +S +F RT P+ Sbjct 5 SNIMGLHGLKNKPSRNSFDLSHRNMFTAKVGELLPCFVQELNPGDSVKVSSSYFTRTAPL 64 Query 62 DTSAFTRIREYYEWFFVPLHLLYRNSNEAIMSMENQPN------YAASSSASISFNRNLP 115 ++AFTR+RE ++FFVP L++ + +++M N A+S + +P Sbjct 65 QSNAFTRLRENVQYFFVPYSALWKYFDSQVLNMTKNANGGDISRIASSLVGNQKVTTQMP 124 Query 116 WIDLAT--------INTAIGNVQSSASPNNFFGVLRSEGFKKLVSYLGYGETSPEKY--- 164 ++ T IN + S P G R KL+ LGYG PE++ Sbjct 125 CVNYKTLHAYLLKFINRSTVGSDGSVGPEFNRGCYRHAESAKLLQLLGYGNF-PEQFANF 183 Query 165 -VDNLR------------------CSALPLYAYQKIYQDYYRHSQWEKSKPWTYNCDFWN 205 V+N + S L AY KI D+Y + QW+ YN N Sbjct 184 KVNNDKHNQSGQNFKDVTYNNSPYLSIFRLLAYHKICNDHYLYRQWQP-----YNASLCN 238 Query 206 GEDSTPVASSL----DLFSQNPNDSV-------FELRYANWNKDLWMGSLPNSQFGDVAA 254 + TP +SSL D P+DS+ ++R++N D + G LP SQFG + Sbjct 239 VDYLTPNSSSLLSIDDALLSIPDDSIKAEKLNLLDMRFSNLPLDYFTGVLPTSQFGSESV 298 Query 255 VSLGLDASTMKIGVTGTADVSGMMGVVYGDVNGYASDYAAGIRDGGINGAPDNGQTATAY 314 V+L L G A S ++ NG S + R G + Q + Sbjct 299 VNLNL----------GNASGSAVL-------NGTTSKDSGRWRT--TTGEWEMEQRVASS 339 Query 315 PSGNLPSDYPYFYAKGSSKTPVGSIANPAHISGSDLNAQVSGQLNAQFSVLQLRAAEALQ 374 +GNL D T G++A +N +SG L S++ LR A A Q Sbjct 340 ANGNLKLDNSNGTFISHDHTFSGNVA---------INTSLSGNL----SIIALRNALAAQ 386 Query 375 KWKEIAQANGQNYAAQVKAHFGVSTNPMQAHRSTRICGFDGSIDISAVENTNLTSDEAII 434 K+KEI AN ++ +QV+AHFG+ + + S I G I+I+ N NL+ D Sbjct 387 KYKEIQLANDVDFQSQVEAHFGIKPDE-KNENSLFIGGSSSMININEQINQNLSGDNKAT 445 Query 435 RG---KGLGGQRINDPSNFTCTEHGIIMCIYHATPLLDYVPTGPDLQLMTTVKGESFPVP 491 G +G G I FT +G+++ IY TP+LD+ G D L T F 
+P Sbjct 446 YGAAPQGNGSASI----KFTAKTYGVVIGIYRCTPVLDFAHLGIDRTLFKT-DASDFVIP 500 Query 492 EFDSLGMESL---------PMLSLVNSKAIGDV----VARSYAGYVPRYISWKTSTDVVR 538 E DS+GM+ P + +GD ++ +Y GY PRY +KTS D Sbjct 501 EMDSIGMQQTFRCEVAAPAPYNDEFKAFRVGDGSSPDMSETY-GYAPRYSEFKTSYDRYN 559 Query 539 GAFTDTLKSWVAPVDLDYMKAFFARNTDDSTIAENVLLTYS------WFKINPSVLNPIF 592 GAF +LKSWV ++ D I NV T++ F P ++ +F Sbjct 560 GAFCHSLKSWVTGINFD-------------AIQNNVWNTWAGINAPNMFACRPDIVKNLF 606 Query 593 GVAVDSSWNTDQLLCNCQFNVKVARNLSYDGMPY 626 V+ ++ + DQL RNLS G+PY Sbjct 607 LVSSTNNSDDDQLYVGMVNMCYATRNLSRYGLPY 640 >gi|547226430|ref|WP_021963493.1| putative uncharacterized protein [Prevotella sp. CAG:1185] gi|524103382|emb|CCY83994.1| putative uncharacterized protein [Prevotella sp. CAG:1185] Length=573 Score = 206 bits (524), Expect = 1e-54, Method: Compositional matrix adjust. Identities = 115/269 (43%), Positives = 160/269 (59%), Gaps = 6/269 (2%) Query 360 AQFSVLQLRAAEALQKWKEIAQANGQNYAAQVKAHFGVSTNPMQAHRSTRICGFDGSIDI 419 A SVL LR AE LQKW+EIAQ+ +Y Q++ HF VS + + + G+ ++DI Sbjct 309 AGLSVLALRQAECLQKWREIAQSGKMDYQTQMQKHFNVSPSATLSGHCKYLGGWTSNLDI 368 Query 420 SAVENTNLTSD-EAIIRGKGLGGQRINDPSNFTCTEHGIIMCIYHATPLLDYVPTGPDLQ 478 S V NTNLT D +A I+GKG G N +F +EHGIIMCIYH PLLD+ Q Sbjct 369 SEVVNTNLTGDNQADIQGKGTGTLNGNK-VDFESSEHGIIMCIYHCLPLLDWSINRIARQ 427 Query 479 LMTTVKGESFPVPEFDSLGMESL-PMLSLVNSKAIGDVVARSYAGYVPRYISWKTSTDVV 537 T + + +PEFDS+GM+ L P + + + + GYVPRY KTS D + Sbjct 428 NFKTTFTD-YAIPEFDSVGMQQLYPSEMIFGLEDLPSDPSSINMGYVPRYADLKTSIDEI 486 Query 538 RGAFTDTLKSWVAPVDLDYMKAFFARNTDDSTIAENVLLTYSWFKINPSVLNPIFGVAVD 597 G+F DTL SWV+P+ Y+ A+ R ++ +TY++FK+NP +++ IFGV D Sbjct 487 HGSFIDTLVSWVSPLTDSYISAY--RQACKDAGFSDITMTYNFFKVNPHIVDNIFGVKAD 544 Query 598 SSWNTDQLLCNCQFNVKVARNLSYDGMPY 626 S+ NTDQLL N F++K RN Y+G+PY Sbjct 545 STINTDQLLINSYFDIKAVRNFDYNGLPY 573 Score = 194 bits (494), Expect = 1e-50, Method: Compositional matrix adjust. 
Identities = 110/270 (41%), Positives = 164/270 (61%), Gaps = 17/270 (6%) Query 1 MSSLFSYGDIKNTPRRSGFDLSNKCAFTAKVGELLPVYWKFCLPADKFNISQEWFARTQP 60 MSS+ S +KN+ +R+GFDLS K AFTAKVGELLP+ K P DKFNI + F RTQP Sbjct 1 MSSVMSLTALKNSVKRNGFDLSFKNAFTAKVGELLPIMCKEVYPGDKFNIRGQAFTRTQP 60 Query 61 VDTSAFTRIREYYEWFFVPLHLLYRNSNEAIMSMENQPNYAASSSASISFNRNLPWIDLA 120 V+++A++R+REYY+++FVP LL+ + +M + P++AA +S++ ++ PW Sbjct 61 VNSAAYSRLREYYDFYFVPYRLLWNMAPTFFTNMPD-PHHAADLVSSVNLSQRHPWFTFF 119 Query 121 TINTAIGNVQSSASP-----NNFFGVLRSEGFKKLVSYLGYGETSPEKYV------DNLR 169 I +GN+ S + NFFG R E KL++YL YG + V D++ Sbjct 120 DIMEYLGNLNSLSGAYEKYQKNFFGFSRVELSVKLLNYLNYGFGKDYESVKVPSDSDDIV 179 Query 170 CSALPLYAYQKIYQDYYRHSQWEKSKPWTYNCDFWNGEDS---TPVASSLDLFSQNPNDS 226 S PL AYQKI +DY+R QW+ + P+ YN D+ G+ S P++S + +NP + Sbjct 180 LSPFPLLAYQKICEDYFRDDQWQSAAPYRYNLDYLYGKSSGFHIPMSSFTNDAFKNP--T 237 Query 227 VFELRYANWNKDLWMGSLPNSQFGDVAAVS 256 +F+L Y N+ KD + G LP +Q+GDV+ S Sbjct 238 MFDLNYCNFQKDYFTGMLPRAQYGDVSVAS 267 >gi|575094339|emb|CDL65730.1| unnamed protein product [uncultured bacterium] Length=588 Score = 184 bits (467), Expect = 7e-47, Method: Compositional matrix adjust. 
Identities = 175/636 (28%), Positives = 268/636 (42%), Gaps = 112/636 (18%) Query 16 RSGFDLSNKCAFTAKVGELLPVYWKFCLPADKFNISQEWFARTQPVDTSAFTRIREYYEW 75 ++GFD+S + FT+ VG+LLPV++ + P DK IS F RTQP+ ++A R+ E+ E+ Sbjct 16 KNGFDMSQRHPFTSSVGQLLPVFYDYLNPGDKIRISANLFTRTQPMKSTAMARLTEHIEY 75 Query 76 FFVPLHLLYRNSNEAIMSMENQPNYAASSSASISFNRNLPWIDLATINTAIGNVQSSASP 135 FFVP ++ +++ SSS N +P+ ++ A+ +S S Sbjct 76 FFVPFEQMFSLFGSVFYGIDD----YNSSSLVKHNNLTMPFFKSDAVSAALEAAYTSFSS 131 Query 136 N--------NFFGVLRSEGFKKLVSYLGYGETSPEKYVDNLRCSALPLY---AYQKIYQD 184 + + G R G +L LGYG + L + + ++ AYQKI+ D Sbjct 132 SINRKVLTPDMMGQPRVYGILRLSEMLGYGSLLLSNDNNLLPHADMSVFLFTAYQKIFND 191 Query 185 YYRHSQWEKSKPWTYNCDFWNGEDSTPVASSLDLFSQNPNDSVFELRYANWNKDLWMGSL 244 +YR + + +YN D+ G+ T ++S+FEL Y W KD + + Sbjct 192 FYRLDDYTSVQHKSYNVDYAQGQPIT-------------DNSMFELHYRPWKKDYFTNVI 238 Query 245 PNSQFGDVAAVSLGLDASTMKIGVTGTADVSGMMGVVYGDVNGYASDYAAGIRDGGINGA 304 PN F V S G G D + + + +G SD+ A Sbjct 239 PNPYFSSVDNKS--------SFGGAGLFDRPVGLSITSFNFDG--SDFLQ---------A 279 Query 305 PDNGQTATAYPSGNLPSDYPYFYAKGSSKTPVGSIANPAHISGSDLNAQVSGQLNAQFSV 364 P + T + ++ P F +L ++ +A SV Sbjct 280 PSDLST--------MENNQPIF---------------------QELPVNLTSASSAGLSV 310 Query 365 LQLRAAEALQKWKEIAQANGQNYAAQVKAHFGVSTNPMQAHRSTRICGFDGSIDISAVEN 424 LR A K I Q G++Y AQ AHFG + I G + IS+VE+ Sbjct 311 SDLRYLYATDKLLRITQFAGKHYDAQTLAHFGKRVPQGVSGEVYYIGGQSQPLQISSVES 370 Query 425 TNLTSDEAIIRGKGLG---GQRINDPSN-----FTCTEHGIIMCIYHATPLLDYVPTGPD 476 T T D + G LG G+ + N F HG++M IY A P DY+ D Sbjct 371 TATTFDSGDVVGSVLGELAGKGYSQTGNQKDFSFEAPCHGVLMAIYSAVPEADYLDERID 430 Query 477 LQLMTTVKGESFPVPEFDSLGMESLPMLSLVNSKAIGDVVARSYAGYVPRYISWKTSTDV 536 L T ++ F PEFDSLGME P L + +G+ S G+ RY K+ D+ Sbjct 431 Y-LNTLIQSNDFYKPEFDSLGMEPFPNYELDQYRMVGN---NSRLGWRYRYSGLKSKPDL 486 Query 537 VRGAFTDTLKSWVAPVDLDYMKAFFARNTDDSTIAENVLLTYSWFK------INPSVLNP 590 + GAF TL+ WVA RN DS AE+ SW++ I+P+ L+ Sbjct 487 ISGAFKYTLRDWVA-----------VRN--DSRYAEDE----SWWQSAAFMYIDPAYLDN 529 Query 591 IFGVAVDSSWNTDQLLCNCQFN-VKVARNLSYDGMP 
625 IF ++ Q N ++ + R+L Y P Sbjct 530 IFELSFTPRLYQQQDSANVTYDGTFIDRSLVYQRDP 565 >gi|575094297|emb|CDL65693.1| unnamed protein product [uncultured bacterium] Length=630 Score = 128 bits (321), Expect = 9e-28, Method: Compositional matrix adjust. Identities = 145/583 (25%), Positives = 228/583 (39%), Gaps = 95/583 (16%) Query 16 RSGFDLSNKCAFTAKVGELLPVYWKFCLPADKFNISQEWFARTQPVDTSAFTRIREYYEW 75 R+ FD+S FT+ VG+LLPV++ P DK +I + +TQP+ + F ++ E ++ Sbjct 18 RNVFDMSQTLGFTSSVGQLLPVFYDVLNPGDKISIKSLFVTKTQPMQSDNFAKVTENVDY 77 Query 76 FFVPLHLLYRNSNEAIMSMENQPNYAASSSASISFNRNLPWIDLATINTAIGNVQSSAS- 134 FFVP E I S+ Y + S F++ +DL + + + + + Sbjct 78 FFVPF--------EQIYSLFGSFFYQIADFNSSLFSKKGGALDLTSTHLPLASFDGLSYE 129 Query 135 ------------------PNN---------FFGVLRSEGFKKLVSYLGYGETSPEKYVDN 167 PNN +F LR + +Y + P+++ + Sbjct 130 LFSSQYDIYSDDDDHIIFPNNTLDEYGVPNYFNHLRLMQLFGMSNYFTSDASQPDQFKPS 189 Query 168 LRCSALPLYAYQKIYQDYYRHSQWEKSKPWTYNCDFWNGEDSTPVASSLDLFSQNPNDSV 227 + LPL AYQKI+ DYYR W P +YN D + D+ + S+ Sbjct 190 INL-FLPL-AYQKIFNDYYRLDDWTAPDPTSYNID---------SSFDADIIRTSYYRSI 238 Query 228 FELRYANWNKDLWMGSLPNSQFGDVAAVSLGLDASTMKIGVTGTADVSGMMGVVYGDVNG 287 F+LRY W KD + N F S D + G G +S + + D + Sbjct 239 FKLRYRPWKKDYYTNLSRNPYFN----ASYNADGA---YGPNGMQSLSSLATALPYDTDS 291 Query 288 YASDYAAGIRDGGINGAPDNGQTATAYPSGNLPSDYPYFYAKGSSKTPVGSIANPAHISG 347 + + + G++ + A G +P P++ +G Sbjct 292 VKDN--PLVENLGLSKPVGDESEAVTIKQG-IPRSLPFY-------------------AG 329 Query 348 SDLNAQVSGQLNAQFSVLQLRAAEALQKWKEIAQANGQNYAAQVKAHFGVSTNPMQAHRS 407 D Q +V QLRA A K I Q G++Y AQ AHFG + Sbjct 330 YDSPYLSQEQGIETLNVSQLRALYATDKLLRITQFAGKHYDAQTLAHFGKKVPQGVSGEV 389 Query 408 TRICGFDGSIDISAVE--NTNLTSD--EAIIRGKGLGGQRIND---PSNFTCTEHGIIMC 460 + G + IS + ++ TSD + + +G + P F HGI+M Sbjct 390 YYLGGQSQRLQISPITALSSGQTSDGSDTVFGEQGARAASVTQGQKPFTFEAPCHGILMA 449 Query 461 IYHATPLLDYVPTGPDLQLMTTVKGESFPVPEFDSLGME-------SLPMLSLVNSKAI- 512 IY A P +Y D ++ T F PE D++GM S+P +L + Sbjct 450 IYSAVPEANYSCDAID-RINTLAYSNDFYKPELDNIGMSPLYSYEFSVPGYTLFRNPPTP 508 Query 513 
--GDVVARSYAGYVPRYISWKTSTDVVRGAFTDTLKSWVAPVD 553 D A+S G+ RY +KT D GA TL+SW D Sbjct 509 YSSDDAAQS-LGWQFRYSWFKTKVDRTCGALNRTLRSWCPKRD 550 >gi|649555287|gb|KDS61824.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4] gi|649560568|gb|KDS66876.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4] gi|649561020|gb|KDS67307.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4] gi|649562724|gb|KDS68908.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 6] Length=541 Score = 124 bits (310), Expect = 1e-26, Method: Compositional matrix adjust. Identities = 143/567 (25%), Positives = 216/567 (38%), Gaps = 103/567 (18%) Query 1 MSSLFSYGDIKNTPRRSGFDLSNKCAFTAKVGELLPVYWKFCLPADKFNISQEWFARTQP 60 M+++F+ +K PRR+ F+LS + T GEL+P+ K +P DKF ++ E R P Sbjct 1 MANIFNSVKLKR-PRRNVFNLSYENKLTVNAGELIPIMCKPVVPGDKFRVNTEMLVRLAP 59 Query 61 VDTSAFTRIREYYEWFFVPLHLLYRN-----------SNEAIMSMENQPNYAASSSASIS 109 + R+ + +FFVP L++ ++ + + P+ +++A S Sbjct 60 LVAPMMHRVDVFTHYFFVPNRLIWNKWEDFITKGVDGTDSPVFPTYSFPSTVDTANAHNS 119 Query 110 FNRNLPW--IDLATINTAIGNVQSSASPNNFFGVLRSEGFKKLVSYLGYGETSPEKYVDN 167 F W + L +IN V SPN GV GFK Sbjct 120 FGDGSLWDYLGLPSINQIGEAVFQVQSPN---GVKAPAGFK------------------- 157 Query 168 LRCSALPLYAYQKIYQDYYRHSQWEKSKPWTYNCDFWNGEDSTPVASSLDLFSQNPNDSV 227 SALP AY IY +YYR T + G PV SSL Sbjct 158 --VSALPFRAYHLIYNEYYRDQNLTSELEITLDS----GNYQLPVNSSL----------- 200 Query 228 FELRYANWNKDLWMGSLPNSQFGDVAAVSLGLDASTMKIGVTGTADVSGMMGVVYGDVNG 287 ++L W KD + +LP Q G V + G ++ M G Sbjct 201 WQLHRRAWEKDYFTSALPWVQRGPEVTVP-----------INGGGEIPVEMK------EG 243 Query 288 YASDYAAGIRDGGINGAPDNGQTATAYPSGNLPSDYPYFYAKGS--SKTPVGSIANPAHI 345 +A+ Q T +P S Y+ S S +GSI A I Sbjct 244 FAA------------------QKITTFPDRKPISGSEVLYSAPSVLSYGQIGSIKGQALI 285 Query 346 SGSDLNAQVSGQLNAQFSVLQLRAAEALQKWKEIAQANGQNYAAQVKAHFGVSTNPMQAH 405 + V ++ +R + ALQ+W E +G Y Q+ +HFGV ++ + Sbjct 286 EPDNF---VVNTDQMGVNINDIRTSNALQRWFERNARSGSRYIEQILSHFGVRSSDARLQ 342 Query 406 
RSTRICGFDGSIDISAV---ENTNLTSDEAIIRGKGLGGQRINDPSNFTCTEHGIIMCIY 462 R + G I +S V +T+ TS +A + G G+ +N EHG IM I Sbjct 343 RPQFLGGGRTPISVSEVLQTSSTDSTSPQANMAGHGISAG-VNHGFTRYFEEHGYIMGIM 401 Query 463 HATPLLDYVPTGP-DLQLMTTVKGESFPVPEFDSLGMESLPMLSLVNSKAIGDVVARSYA 521 P Y P D + + F PEF LG + + L +++ D Sbjct 402 SIRPRTGYQQGVPKDFRKFDNM---DFYFPEFAHLGEQEIKNEELYLNES--DAANEGTF 456 Query 522 GYVPRYISWKTSTDVVRGAFTDTLKSW 548 GY PRY +K S + V G F + W Sbjct 457 GYTPRYAEYKYSQNEVHGDFRGNMAFW 483 >gi|565841287|ref|WP_023924568.1| hypothetical protein [Prevotella nigrescens] gi|564729907|gb|ETD29851.1| hypothetical protein HMPREF1173_00033 [Prevotella nigrescens CC14M] Length=656 Score = 121 bits (304), Expect = 1e-25, Method: Compositional matrix adjust. Identities = 131/542 (24%), Positives = 211/542 (39%), Gaps = 135/542 (25%) Query 129 VQSSASPNNFFGVLRSEGFKKLVSYLGYGETSPEKYVD---------------------- 166 V S S + G L G +L+ +LGYG + VD Sbjct 201 VVSKLSSKDALGYLYKFGAFRLLHFLGYGVDNNGFIVDFNASYAAGTGEIVKNVLAKKTY 260 Query 167 ---NLRCSALPLYAYQKIYQDYYRHSQWEKSKPWTYNCDFWNGEDSTPVASSLDLFSQNP 223 +++ + L AYQ+IY D+YR+ WE ++P +N D+ +S ++ L Sbjct 261 KLPDIKANVFRLLAYQRIYNDFYRNDLWEAAQPDVFNVDWCCNNNSLDISDELVY----- 315 Query 224 NDSVFELRYANWNKDLWMGSLPNSQFGDVAAVSLGLDASTMKIGVTGTADVSGMMGVVYG 283 + +LRY +W+KD + P + + Sbjct 316 --KMCQLRYRHWSKDWVTSAYPTASY---------------------------------- 339 Query 284 DVNGYASDYAAGIRDGGINGAPDNGQTATAYPSGNLPSDYPYFYAKGS----SKTPVGSI 339 D GI PD T + + + D +GS GS+ Sbjct 340 --------------DKGIFELPDYINGNTGFATTEVKRD--VVNNRGSQLEIKSMDAGSL 383 Query 340 A--NPAHISGSDLNAQVSGQLNAQFSVLQLRAAEALQKWKEIAQANGQNYAAQVKAHFGV 397 N ++IS +D+ A + + + + RAA L +Y+ Q+ AHFG Sbjct 384 GSNNISYISPNDIRAMFALEKMLE----RTRAANGL------------DYSNQIAAHFGF 427 Query 398 STNPMQAHRSTRICGFDGSIDISAVENTNLTSDEAI---------IRGKGLGGQRINDPS 448 + + ++ I GFD I IS V T+ S + + GKG+G S Sbjct 428 KVPESRKNCASFIGGFDNQISISEVVTTSNGSVDGTASTGSVVGQVFGKGIGAMNSGHIS 487 Query 449 NFTCTEHGIIMCIYHATPLLDYVPTGPDLQLMTTVKGESFPVPEFDSLGMESLPM----L 504 + EHG+IMCIY P +DY 
D E + PEF++LGM+ + L Sbjct 488 -YDVKEHGLIMCIYSIAPQVDYDARELD-PFNRKFSREDYFQPEFENLGMQPVIQSDLCL 545 Query 505 SLVNSKAIGDVVARSYAGYVPRYISWKTSTDVVRGAFTD--TLKSWVAPVDLDYMKAFFA 562 + ++K+ + GY RY+ +KT+ D++ G F +L +W P + +Y F Sbjct 546 CINSAKSDSSDQHNNVLGYSARYLEYKTARDIIFGEFMSGGSLSAWATPKN-NYTFEFGK 604 Query 563 RNTDDSTIAENVLLTYSWFKINPSVLNPIFGVAVDSSWNTDQLLCNCQFNVKVARNLSYD 622 + D ++P VL PIF V + S +TDQ L N F+VK R + + Sbjct 605 LSLPD-------------LLVDPKVLEPIFAVKYNGSMSTDQFLVNSYFDVKAIRPMQVN 651 Query 623 GM 624 M Sbjct 652 DM 653 Score = 69.7 bits (169), Expect = 4e-09, Method: Compositional matrix adjust. Identities = 36/110 (33%), Positives = 60/110 (55%), Gaps = 3/110 (3%) Query 16 RSGFDLSNKCAFTAKVGELLPVYWKFCLPADKFNISQEWFARTQPVDTSAFTRIREYYEW 75 R+G+DLS++ F+A G LLP+ P +KF IS + R QP++T+AF R +EYY + Sbjct 15 RNGYDLSSRRIFSAPAGALLPIATWEANPGEKFRISVQDLVRAQPLNTAAFARCKEYYHF 74 Query 76 FFVPLHLLYRNSNEAIMSMENQPNYAASSSASISFN---RNLPWIDLATI 122 FFVP L+++S+ + + + FN + +P +L + Sbjct 75 FFVPYKSLWQHSDRFFTGVTEGDSAFSKPDGKTDFNFVPKTVPMFNLKDV 124 Lambda K H a alpha 0.317 0.132 0.407 0.792 4.96 Gapped Lambda K H a alpha sigma 0.267 0.0410 0.140 1.90 42.6 43.6 Effective search space used: 4727351115384
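
The gapped Lambda and K values in the footer above can be related to the scores reported for each hit. The following is a minimal sketch, assuming the standard Karlin-Altschul conversion S' = (lambda * S - ln K) / ln 2 applies to the raw scores shown in parentheses after each bit score. The function name `bit_score` and the constant names are illustrative and not part of BLAST. Because the footer values are rounded and the report uses compositional matrix adjustment, the results should only be expected to agree with the reported bit scores to within about one bit.

```python
import math

# Gapped Karlin-Altschul parameters copied from the report footer
# ("Gapped Lambda K H ..."). They are rounded as printed.
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410


def bit_score(raw_score, lam=GAPPED_LAMBDA, k=GAPPED_K):
    """Convert a raw alignment score S to bits: (lam * S - ln k) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


# (raw score, reported bit score) pairs taken from the hit headers above.
REPORTED = [(941, 367), (908, 354), (854, 333), (703, 275), (574, 225)]

if __name__ == "__main__":
    for raw, bits in REPORTED:
        # Print the reported value next to the recomputed one for comparison.
        print(f"raw={raw}  reported={bits}  recomputed={bit_score(raw):.1f}")
```

Running the script prints the reported and recomputed bit scores side by side; any larger disagreement would suggest that the parameters were copied incorrectly or that a different scoring system was in effect.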