Bit-score color ranges: <40, 40-50, 50-80, 80-200, >200
BLASTP 2.2.30+

Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A. Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402.

Reference for composition-based statistics: Alejandro A. Schaffer, L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001), "Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements", Nucleic Acids Res. 29:2994-3005.

Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF excluding environmental samples from WGS projects
           49,011,213 sequences; 17,563,301,199 total letters

Query= Contig-14_CDS_annotation_glimmer3.pl_2_7
Length=222

                                                                    Score   E
Sequences producing significant alignments:                         (Bits)  Value

gi|547226430|ref|WP_021963493.1|  putative uncharacterized protein  175     1e-47
gi|575094354|emb|CDL65742.1|      unnamed protein product           157     4e-41
gi|496050829|ref|WP_008775336.1|  hypothetical protein              141     2e-35
gi|490418709|ref|WP_004291032.1|  hypothetical protein              139     1e-34
gi|494822885|ref|WP_007558293.1|  hypothetical protein              116     2e-26
gi|517172762|ref|WP_018361580.1|  hypothetical protein              89.0    7e-17
gi|575094321|emb|CDL65708.1|      unnamed protein product           85.1    2e-15
gi|494306153|ref|WP_007173049.1|  hypothetical protein              82.8    7e-15
gi|565841287|ref|WP_023924568.1|  hypothetical protein              82.4    1e-14
gi|494308783|ref|WP_007173938.1|  hypothetical protein              79.7    7e-14

>gi|547226430|ref|WP_021963493.1| putative uncharacterized protein [Prevotella sp. CAG:1185]
gi|524103382|emb|CCY83994.1| putative uncharacterized protein [Prevotella sp. CAG:1185]
Length=573

 Score = 175 bits (444), Expect = 1e-47, Method: Compositional matrix adjust.
 Identities = 97/223 (43%), Positives = 136/223 (61%), Gaps = 5/223 (2%)

Query  3    SHRCQRVCGFDGSIDISAVENTNLSSD-EAIIRGKGIGGYRVNKPETFKTTEHGVLMCIY  61
            S C+ + G+ ++DIS V NTNL+ D +A I+GKG G NK + F+++EHG++MCIY
Sbjct  353  SGHCKYLGGWTSNLDISEVVNTNLTGDNQADIQGKGTGTLNGNKVD-FESSEHGIIMCIY  411

Query  62   HAVPLLDYAPTGPDLQFMTTVDGDSWPVPELDSVGFEEL-PSYSLLNTSDVQPIKEPRPF  120
            H +PLLD++ Q T D + +PE DSVG ++L PS + D+
Sbjct  412  HCLPLLDWSINRIARQNFKTTFTD-YAIPEFDSVGMQQLYPSEMIFGLEDLPSDPSSINM  470

Query  121  GYVPRYISWKTSVDVVRGAFIDTLKSWTAPIGEDYMKIYFDNNNVPGGAHFGF-YTWFKV  179
            GYVPRY KTS+D + G+FIDTL SW +P+ + Y+ Y G + Y +FKV
Sbjct  471  GYVPRYADLKTSIDEIHGSFIDTLVSWVSPLTDSYISAYRQACKDAGFSDITMTYNFFKV  530

Query  180  NPSVVNPIFGVVADGSWNTDQLLVNCDFDVRVARNLSYDGLPY  222
            NP +V+ IFGV AD + NTDQLL+N FD++ RN Y+GLPY
Sbjct  531  NPHIVDNIFGVKADSTINTDQLLINSYFDIKAVRNFDYNGLPY  573

>gi|575094354|emb|CDL65742.1| unnamed protein product [uncultured bacterium]
Length=615

 Score = 157 bits (397), Expect = 4e-41, Method: Compositional matrix adjust.
 Identities = 98/234 (42%), Positives = 137/234 (59%), Gaps = 25/234 (11%)

Query  3    SHRCQRVCGFDGSIDISAVENTNLSSDEAI-IRGKGIGGYRVNKPETFKTT-EHGVLMCI  60
            SH+ + + G S+DI+ V N N++ D A I GKG + N F++ E+G++MCI
Sbjct  393  SHQARYLGGCATSLDINEVINNNITGDNAADIAGKGT--FTGNGSIRFESKGEYGIIMCI  450

Query  61   YHAVPLLDYAPTGPDLQFMTTVDGDSWPVPELDSVGFEELPSYSLLNTSDVQPIKEPRP-  119
            YH +P++DY +G D T VD S+P+PELD +G E +P +N P+KE
Sbjct  451  YHVLPIVDYVGSGVD-HSCTLVDATSFPIPELDQIGMESVPLVRAMN-----PVKESDTP  504

Query  120  -----FGYVPRYISWKTSVDVVRGAFIDTLKSWTAPIGEDYM----KIYFDNN-NV-PGG  168
            GY PRYI WKTSVD G F D+L++W P+G+ + + F +N NV P
Sbjct  505  SADTFLGYAPRYIDWKTSVDRSVGDFADSLRTWCLPVGDKELTSANSLNFPSNPNVEPDS  564

Query  169  AHFGFYTWFKVNPSVVNPIFGVVADGSWNTDQLLVNCDFDVRVARNLSYDGLPY  222
            GF FKVNPS+V+P+F VVAD + TD+ L + FDV+V RNL +GLPY
Sbjct  565  IAAGF---FKVNPSIVDPLFAVVADSTVKTDEFLCSSFFDVKVVRNLDVNGLPY  615

>gi|496050829|ref|WP_008775336.1| hypothetical protein [Bacteroides sp. 2_2_4]
gi|229448893|gb|EEO54684.1| putative capsid protein (F protein) [Bacteroides sp. 2_2_4]
Length=580

 Score = 141 bits (355), Expect = 2e-35, Method: Compositional matrix adjust.
 Identities = 93/220 (42%), Positives = 122/220 (55%), Gaps = 15/220 (7%)

Query  11   GFDGSIDISAVENTNLS-SDEAIIRGKGI--GGYRVNKPETFKTTE-HGVLMCIYHAVPL  66
            G S+DI+ V N N++ S+ A I GKG+ G R+ +F E +G++MCIYH++PL
Sbjct  368  GTTASLDINEVVNNNITGSNAADIAGKGVVVGNGRI----SFDAGERYGLIMCIYHSLPL  423

Query  67   LDYAPTGPDLQFMTTVDGDSWPVPELDSVGFEELPSYSLLNTSDVQPIKEPRPFGYVPRY  126
            LDY + F T ++ + +PE D VG E +P SL+N GY PRY
Sbjct  424  LDYTTDLVNPAF-TKINSTDFAIPEFDRVGMESVPLVSLMNPLQSSYNVGSSILGYAPRY  482

Query  127  ISWKTSVDVVRGAFIDTLKSWTAPIGE----DYMKIYFDNNNVPGGAHFGFYTWFKVNPS  182
            IS+KT VD GAF TLKSW + + D NN PG YT FKVNP+
Sbjct  483  ISYKTDVDSSVGAFKTTLKSWVMSYDNQSVINQLNYQDDPNNSPG--TLVNYTNFKVNPN  540

Query  183  VVNPIFGVVADGSWNTDQLLVNCDFDVRVARNLSYDGLPY  222
            V+P+F V A S +TDQ L + FDV+V RNL DGLPY
Sbjct  541  CVDPLFAVAASNSIDTDQFLCSSFFDVKVVRNLDTDGLPY  580

>gi|490418709|ref|WP_004291032.1| hypothetical protein [Bacteroides eggerthii]
gi|217986636|gb|EEC52970.1| putative capsid protein (F protein) [Bacteroides eggerthii DSM 20697]
Length=578

 Score = 139 bits (349), Expect = 1e-34, Method: Compositional matrix adjust.
 Identities = 92/232 (40%), Positives = 120/232 (52%), Gaps = 16/232 (7%)

Query  3    SHRCQRVCGFDGSIDISAVENTNLS-SDEAIIRGKGIGGYRVNKPETFKTT-EHGVLMCI  60
            S C + G SIDI+ V NTN++ S A I GKG+G N F + +G++MCI
Sbjct  351  SELCTYLGGVSSSIDINEVINTNITGSAAADIAGKGVG--VANGEINFNSNGRYGLIMCI  408

Query  61   YHAVPLLDYAPTGPDLQFMTTVDGDSWPVPELDSVGFEELPSYSLLNTSDVQPIKEPRPF  120
            YH +PLLDY D F+ V+ + +PE D VG + +P L+N
Sbjct  409  YHCLPLLDYTTDMLDPAFLK-VNSTDYAIPEFDRVGMQSMPLVQLMNPLRSFANASGLVL  467

Query  121  GYVPRYISWKTSVDVVRGAFIDTLKSWTAPIGEDYM--KIYFDNNN--------VPGGAH  170
            GYVPRYI +KTSVD G F TL SW G + ++ N+ VP A
Sbjct  468  GYVPRYIDYKTSVDQSVGGFKRTLNSWVISYGNISVLKQVTLPNDAPPIEPSEPVPSVAP  527

Query  171  FGFYTWFKVNPSVVNPIFGVVADGSWNTDQLLVNCDFDVRVARNLSYDGLPY  222
            F T+FKVNP ++PIF V A NTDQ L + FD++ RNL DGLPY
Sbjct  528  MNF-TFFKVNPDCLDPIFAVQAGDDTNTDQFLCSSFFDIKAVRNLDTDGLPY  578

>gi|494822885|ref|WP_007558293.1| hypothetical protein [Bacteroides plebeius]
gi|198272099|gb|EDY96368.1| putative capsid protein (F protein) [Bacteroides plebeius DSM 17135]
Length=613

 Score = 116 bits (291), Expect = 2e-26, Method: Compositional matrix adjust.
 Identities = 82/235 (35%), Positives = 124/235 (53%), Gaps = 24/235 (10%)

Query  3    SHRCQRVCGFDGSIDISAVENTNLSSDEAI-IRGKGIGGYRVNKPETFKTT-EHGVLMCI  60
            S CQ + + + I+ V N N++ + A I GKG N F ++G++MC+
Sbjct  388  SDMCQWLGSINIDLSINEVVNNNITGENAADIAGKGT--MSGNGSINFNVGGQYGIVMCV  445

Query  61   YHAVPLLDYAPTGPDLQFMTTVDGD-SWPVPELDSVGFEELPSYSLLNTSDVQP------  113
            +H +P LDY + P F TT+ +P+PE D +G E++P LN V+P
Sbjct  446  FHVLPQLDYITSAP--HFGTTLTNVLDFPIPEFDKIGMEQVPVIRGLNP--VKPKDGDFK  501

Query  114  IKEPRPFGYVPRYISWKTSVDVVRGAFIDTLKSWTAPIGEDYMKI-----YFDNNNVPGG  168
            + FGY P+Y +WKT++D G F +LK+W P ++ + + DN NV
Sbjct  502  VSPNLYFGYAPQYYNWKTTLDKSMGEFRRSLKTWIIPFDDEALLAADSVDFPDNPNVEAD  561

Query  169  A-HFGFYTWFKVNPSVVNPIFGVVADGSWNTDQLLVNCDFDVRVARNLSYDGLPY  222
            + GF FKV+PSV++ +F V A+ NTDQ L + FDV V R+L +GLPY
Sbjct  562  SVKAGF---FKVSPSVLDNLFAVKANSDLNTDQFLCSTLFDVNVVRSLDPNGLPY  613

>gi|517172762|ref|WP_018361580.1| hypothetical protein [Prevotella nanceiensis]
Length=568

 Score = 89.0 bits (219), Expect = 7e-17, Method: Compositional matrix adjust.
 Identities = 66/233 (28%), Positives = 106/233 (45%), Gaps = 30/233 (13%)

Query  5    RCQRVCGFDGSIDISAVENTNLSSDEAIIRGKGIGGY--RVNKPET--------FKTTEH  54
            RC + GFD +I + V ++ ++ + GGY R T F EH
Sbjct  348  RCTYIGGFDSNIQVGDVTQSSGTTVTGT-KDTSFGGYLGRTTGKATGSGSGHIRFDAKEH  406

Query  55   GVLMCIYHAVPLLDYAPTGPDLQFMTTVDGDSWPVPELDSVGFEEL----PSYSLLNTSD  110
            G+LMCIY VP + Y D F+ ++ + VPE +++G + L SY N +
Sbjct  407  GILMCIYSLVPDVQYDSKRVD-PFVQKIERGDFFVPEFENLGMQPLFAKNISYKYNNNTA  465

Query  111  VQPIKEPRPFGYVPRYISWKTSVDVVRGAFI--DTLKSWTAPIGEDYMKIYFDNNNVPGG  168
            IK FG+ PRY +KT++D+ G F+ + L WT F+
Sbjct  466  NSRIKNLGAFGWQPRYSEYKTALDINHGQFVHQEPLSYWTVARARGESMSNFN-------  518

Query  169  AHFGFYTWFKVNPSVVNPIFGVVADGSWNTDQLLVNCDFDVRVARNLSYDGLP  221
            + FK+NP ++ +F V +G+ TDQ+ C F++ ++S DG+P
Sbjct  519  -----ISTFKINPKWLDDVFAVNYNGTELTDQVFGGCYFNIVKVSDMSIDGMP  566

>gi|575094321|emb|CDL65708.1| unnamed protein product [uncultured bacterium]
Length=642

 Score = 85.1 bits (209), Expect = 2e-15, Method: Compositional matrix adjust.
 Identities = 68/229 (30%), Positives = 95/229 (41%), Gaps = 24/229 (10%)

Query  9    VCGFDGSIDISAVENTNLSSDEAIIRG---KGIGGYRVNKPETFKTTEHGVLMCIYHAVP  65
            + G I+I+ N NLS D G +G G + F +GV++ IY P
Sbjct  421  IGGSSSMININEQINQNLSGDNKATYGAAPQGNGSASI----KFTAKTYGVVIGIYRCTP  476

Query  66   LLDYAPTGPDLQFMTTVDGDSWPVPELDSVGFEEL------------PSYSLLNTSDVQP  113
            +LD+A G D T D + +PE+DS+G ++ + D
Sbjct  477  VLDFAHLGIDRTLFKT-DASDFVIPEMDSIGMQQTFRCEVAAPAPYNDEFKAFRVGDGSS  535

Query  114  IKEPRPFGYVPRYISWKTSVDVVRGAFIDTLKSWTAPIGEDYMKIYFDNNNVPGGAHFGF  173
            +GY PRY +KTS D GAF +LKSW I D ++ NN A
Sbjct  536  PDMSETYGYAPRYSEFKTSYDRYNGAFCHSLKSWVTGINFDAIQ----NNVWNTWAGINA  591

Query  174  YTWFKVNPSVVNPIFGVVADGSWNTDQLLVNCDFDVRVARNLSYDGLPY  222
            F P +V +F V + + + DQL V RNLS GLPY
Sbjct  592  PNMFACRPDIVKNLFLVSSTNNSDDDQLYVGMVNMCYATRNLSRYGLPY  640

>gi|494306153|ref|WP_007173049.1| hypothetical protein [Prevotella bergensis]
gi|270333881|gb|EFA44667.1| putative capsid protein (F protein) [Prevotella bergensis DSM 17361]
Length=519

 Score = 82.8 bits (203), Expect = 7e-15, Method: Compositional matrix adjust.
 Identities = 67/232 (29%), Positives = 104/232 (45%), Gaps = 35/232 (15%)

Query  5    RCQRVCGFDGSIDISAVENTNLSSDEAI---------IRGKGIGGYRVNKPETFKTTEHG  55
            R + GFD + +S V T+ ++ I GKG G R F EHG
Sbjct  306  RVNYLGGFDSDLQVSDVTQTSGTTATEYKPEAGYLGRIAGKGTGSGRGRI--VFDAKEHG  363

Query  56   VLMCIYHAVPLLDYAPTGPDLQFMTTVDGDSWPVPELDSVGFEELPSYSLLNTSDVQPIK  115
            VLMCIY VP + Y T D + +D + PE +++G + LN+S +
Sbjct  364  VLMCIYSLVPQIQYDCTRLD-PMVDKLDRFDFFTPEFENLGMQP------LNSSYISSFC  416

Query  116  EPRP----FGYVPRYISWKTSVDVVRGAFI--DTLKSWTAPIGEDYMKIYFDNNNVPGGA  169
            P P GY PRY +KT++D+ G F D L SW+ + F +
Sbjct  417  TPDPKNPVLGYQPRYSEYKTALDINHGQFAQNDALSSWSVSRFRRWTT--FPQLEIAD--  472

Query  170  HFGFYTWFKVNPSVVNPIFGVVADGSWNTDQLLVNCDFDVRVARNLSYDGLP  221
            FK++P +N +F V +G+ +TD + C+F++ ++S DG+P
Sbjct  473  -------FKIDPGCLNSVFPVEFNGTESTDCVFGGCNFNIVKVSDMSVDGMP  517

>gi|565841287|ref|WP_023924568.1| hypothetical protein [Prevotella nigrescens]
gi|564729907|gb|ETD29851.1| hypothetical protein HMPREF1173_00033 [Prevotella nigrescens CC14M]
Length=656

 Score = 82.4 bits (202), Expect = 1e-14, Method: Compositional matrix adjust.
 Identities = 64/220 (29%), Positives = 101/220 (46%), Gaps = 28/220 (13%)

Query  9    VCGFDGSIDISAVENTNLSSDEAI---------IRGKGIGGYRVNKPETFKTTEHGVLMC  59
            + GFD I IS V T+ S + + GKGIG ++ EHG++MC
Sbjct  440  IGGFDNQISISEVVTTSNGSVDGTASTGSVVGQVFGKGIGAMNSGHI-SYDVKEHGLIMC  498

Query  60   IYHAVPLLDYAPTGPDLQFMTTVDGDSWPVPELDSVGFEELPSYSL---LNTSDVQPIKE  116
            IY P +DY D F + + PE +++G + + L +N++ +
Sbjct  499  IYSIAPQVDYDARELD-PFNRKFSREDYFQPEFENLGMQPVIQSDLCLCINSAKSDSSDQ  557

Query  117  PRP-FGYVPRYISWKTSVDVVRGAFID--TLKSWTAPIGEDYMKIYFDNNNVPGGAHFGF  173
            GY RY+ +KT+ D++ G F+ +L +W P ++ F ++P
Sbjct  558  HNNVLGYSARYLEYKTARDIIFGEFMSGGSLSAWATP--KNNYTFEFGKLSLPD------  609

Query  174  YTWFKVNPSVVNPIFGVVADGSWNTDQLLVNCDFDVRVAR  213
            V+P V+ PIF V +GS +TDQ LVN FDV+ R
Sbjct  610  ---LLVDPKVLEPIFAVKYNGSMSTDQFLVNSYFDVKAIR  646

>gi|494308783|ref|WP_007173938.1| hypothetical protein [Prevotella bergensis]
gi|270333035|gb|EFA43821.1| putative capsid protein (F protein) [Prevotella bergensis DSM 17361]
Length=553

 Score = 79.7 bits (195), Expect = 7e-14, Method: Compositional matrix adjust.
 Identities = 67/228 (29%), Positives = 102/228 (45%), Gaps = 27/228 (12%)

Query  5    RCQRVCGFDGSIDISAVENTNLSSDEAI---------IRGKGIGGYRVNKPETFKTTEHG  55
            R + GFD + +S V T+ ++ + GKG G R F EHG
Sbjct  340  RVNYLGGFDSDMQVSDVTQTSGTTATEYKPEAGYLGRVAGKGTGSGR--GRIVFDAKEHG  397

Query  56   VLMCIYHAVPLLDYAPTGPDLQFMTTVDGDSWPVPELDSVGFEELPSYSLLNTSDVQPIK  115
            VLMCIY VP + Y T D + +D + PE +++G + L S + + P K
Sbjct  398  VLMCIYSLVPQIQYDCTRLD-PMVDKLDRFDYFTPEFENLGMQPLNSSYISSFCTTDP-K  455

Query  116  EPRPFGYVPRYISWKTSVDVVRGAFI--DTLKSWTAPIGEDYMKIYFDNNNVPGGAHFGF  173
            P GY PRY +KT++DV G F D L SW+ + F +
Sbjct  456  NP-VLGYQPRYSEYKTALDVNHGQFAQSDALSSWSVSRFRRWTT--FPQLEIAD------  506

Query  174  YTWFKVNPSVVNPIFGVVADGSWNTDQLLVNCDFDVRVARNLSYDGLP  221
            FK++P +N IF V +G+ D + C+F++ ++S DG+P
Sbjct  507  ---FKIDPGCLNSIFPVDYNGTEANDCVYGGCNFNIVKVSDMSVDGMP  551

Lambda   K       H       a       alpha
0.320    0.140   0.447   0.792   4.96

Gapped
Lambda   K       H       a       alpha   sigma
0.267    0.0410  0.140   1.90    42.6    43.6

Effective search space used: 848296716240
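
The gapped Lambda and K values in the statistics footer relate each raw score (the number in parentheses after "bits") to its reported bit score through the standard normalization S' = (Lambda * S - ln K) / ln 2. The sketch below checks that relation against several hits in this report. The names `bit_score`, `GAPPED_LAMBDA`, `GAPPED_K`, and `reported` are illustrative and not part of BLAST, and compositional score adjustment may cause small differences for other reports.

```python
import math

# Gapped Karlin-Altschul parameters copied from the statistics footer of this report.
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410


def bit_score(raw_score, lam=GAPPED_LAMBDA, k=GAPPED_K):
    """Convert a raw alignment score to bits: S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


# Raw scores from this report paired with the bit scores printed next to them.
reported = {219: 89.0, 209: 85.1, 203: 82.8, 202: 82.4, 195: 79.7}

for raw, bits in reported.items():
    print(f"raw={raw}  computed={bit_score(raw):.1f}  reported={bits}")
```

For the higher-scoring hits the report prints whole numbers, so a computed value should be compared against the integer part of the result rather than a one-decimal rounding.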