bitscore colors: <40, 40-50, 50-80, 80-200, >200
BLASTP 2.2.30+

Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.

Reference for composition-based statistics: Alejandro A. Schaffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.

Database: All non-redundant GenBank CDS
translations+PDB+SwissProt+PIR+PRF excluding environmental samples
from WGS projects
           49,011,213 sequences; 17,563,301,199 total letters

Query= Contig-37_CDS_annotation_glimmer3.pl_2_1

Length=251
                                                                   Score     E
Sequences producing significant alignments:                        (Bits)  Value

gi|492501782|ref|WP_005867318.1| hypothetical protein               77.4    6e-13
gi|649557305|gb|KDS63784.1| capsid family protein                   74.7    7e-13
gi|547920049|ref|WP_022322420.1| capsid protein VP1                 75.5    3e-12
gi|649569140|gb|KDS75238.1| capsid family protein                   74.7    3e-12
gi|649555287|gb|KDS61824.1| capsid family protein                   73.9    7e-12
gi|609718276|emb|CDN73650.1| conserved hypothetical protein         70.1    2e-10
gi|547312923|ref|WP_022044635.1| putative uncharacterized protein   67.4    8e-10
gi|639237429|ref|WP_024568106.1| hypothetical protein               66.2    3e-09
gi|599087551|gb|AHN52701.1| major capsid protein                    55.1    5e-06
gi|565841287|ref|WP_023924568.1| hypothetical protein               55.8    9e-06

>gi|492501782|ref|WP_005867318.1| hypothetical protein [Parabacteroides distasonis]
 gi|409230408|gb|EKN23272.1| hypothetical protein HMPREF1059_03257 [Parabacteroides distasonis CL09T03C24]
Length=538

 Score = 77.4 bits (189), Expect = 6e-13, Method: Compositional matrix adjust.
 Identities = 71/244 (29%), Positives = 110/244 (45%), Gaps = 22/244 (9%)

Query  14   LNRIAVSGGTYQDWIQT---VYTNDYIERSETPVYEGGFSSEIIFQEVISNSATEN-EPL  69
            R A SG Y + I + V ++D R + P + GG + I EV+ SAT++ P
Sbjct  311  FERNARSGSRYIEQILSHFGVRSSD--ARLQRPQFLGGGRTPISVSEVLQTSATDSTSPQ  368

Query  70   GTLAGRGQNTGMKGGTVKIKIDEPSYIIGIVSITPRIDYSQGNRFDVDLDTLD--DLHKP  127
            +AG G + G+ G K +E YIIGI+SI PR Y QG D D D + P
Sbjct  369  ANMAGHGISAGVNHG-FKRYFEEHGYIIGIMSIRPRTGYQQG--VPKDFRKFDNMDFYFP  425

Query  128  ALDAIGFQDLTTNKMAWWDETITADGEKQLKSVGKQPAWLDYMTNYNKVFGNFAIKDSEM  187
            +G Q++ ++ + +G + G P + +Y + N+V G+F + +
Sbjct  426  EFAHLGEQEIKNEEVYLQQTPASNNG-----TFGYTPRYAEYKYSMNEVHGDF--RGNMA  478

Query  188  FMTLNRNYEMDENKSIADLTTYIDPEKYNYVFADTSLNAMNFWVQLGIGAKVRRKMSAKV  247
            F LNR + N + TT+++ N VFA + +W+QL K R M
Sbjct  479  FWHLNRIFSESPNLN----TTFVECNPSNRVFATAETSDDKYWIQLYQDVKALRLMPKYG  534

Query  248  IPNL  251
            P L
Sbjct  535  TPML  538

>gi|649557305|gb|KDS63784.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
 gi|649559156|gb|KDS65543.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 6]
Length=245

 Score = 74.7 bits (182), Expect = 7e-13, Method: Compositional matrix adjust.
 Identities = 68/244 (28%), Positives = 111/244 (45%), Gaps = 22/244 (9%)

Query  14   LNRIAVSGGTYQDWIQT---VYTNDYIERSETPVYEGGFSSEIIFQEVISNSATEN-EPL  69
            R A SG Y + I + V ++D R + P + GG + I EV+ S+T++ P
Sbjct  18   FERNARSGSRYIEQILSHFGVRSSD--ARLQRPQFLGGGRTPISVSEVLQTSSTDSTSPQ  75

Query  70   GTLAGRGQNTGMKGGTVKIKIDEPSYIIGIVSITPRIDYSQGNRFDVDLDTLD--DLHKP  127
            +AG G + G+ G + +E YI+GI+SI PR Y QG D D D + P
Sbjct  76   ANMAGHGISAGVNHGFTRY-FEEHGYIMGIMSIRPRTGYQQG--VPKDFRKFDNMDFYFP  132

Query  128  ALDAIGFQDLTTNKMAWWDETITADGEKQLKSVGKQPAWLDYMTNYNKVFGNFAIKDSEM  187
            +G Q++ ++ + +G + G P + +Y + N+V G+F + +
Sbjct  133  EFAHLGEQEIKNEELYLNESDAANEG-----TFGYTPRYAEYKYSQNEVHGDF--RGNMA  185

Query  188  FMTLNRNYEMDENKSIADLTTYIDPEKYNYVFADTSLNAMNFWVQLGIGAKVRRKMSAKV  247
            F LNR ++ N + TT+++ N VFA + +WVQ+ K R M
Sbjct  186  FWHLNRIFKEKPNLN----TTFVECNPSNRVFATAETSDDKYWVQIYQDIKALRLMPKYG  241

Query  248  IPNL  251
            P L
Sbjct  242  TPML  245

>gi|547920049|ref|WP_022322420.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
 gi|524592961|emb|CDD13573.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
Length=553

 Score = 75.5 bits (184), Expect = 3e-12, Method: Compositional matrix adjust.
 Identities = 62/214 (29%), Positives = 96/214 (45%), Gaps = 13/214 (6%)

Query  39   RSETPVYEGGFSSEIIFQEVISNSAT-ENEPLGTLAGRGQNTGMKGGTVKIKIDEPSYII  97
            R + P + GG I EV+ S+T E P +AG G + G+ G K +E YII
Sbjct  352  RLQRPQFLGGGRMPISVSEVLQTSSTDETSPQANMAGHGISAGINNG-FKHYFEEHGYII  410

Query  98   GIVSITPRIDYSQGNRFDVDLDTLDDLHKPALDAIGFQDLTTNKMAWWDETITADGEKQL  157
            GI+SITPR Y QG D D + P + Q++ ++ ++ D
Sbjct  411  GIMSITPRSGYQQGVPRDFTKFDNMDFYFPEFAHLSEQEIKNQEL-----FVSEDAAYNN  465

Query  158  KSVGKQPAWLDYMTNYNKVFGNFAIKDSEMFMTLNRNYEMDENKSIADLTTYIDPEKYNY  217
            + G P + +Y + ++ G+F + + F LNR +E N + TT+++ + N
Sbjct  466  GTFGYTPRYAEYKYHPSEAHGDF--RGNLSFWHLNRIFEDKPNLN----TTFVECKPSNR  519

Query  218  VFADTSLNAMNFWVQLGIGAKVRRKMSAKVIPNL  251
            VFA + FWVQ+ K R M P L
Sbjct  520  VFATSETEDDKFWVQMYQDVKALRLMPKYGTPML  553

>gi|649569140|gb|KDS75238.1| capsid family protein, partial [Parabacteroides distasonis str.
 3999B T(B) 6]
Length=390

 Score = 74.7 bits (182), Expect = 3e-12, Method: Compositional matrix adjust.
 Identities = 68/244 (28%), Positives = 111/244 (45%), Gaps = 22/244 (9%)

Query  14   LNRIAVSGGTYQDWIQT---VYTNDYIERSETPVYEGGFSSEIIFQEVISNSATEN-EPL  69
            R A SG Y + I + V ++D R + P + GG + I EV+ S+T++ P
Sbjct  163  FERNARSGSRYIEQILSHFGVRSSD--ARLQRPQFLGGGRTPISVSEVLQTSSTDSTSPQ  220

Query  70   GTLAGRGQNTGMKGGTVKIKIDEPSYIIGIVSITPRIDYSQGNRFDVDLDTLD--DLHKP  127
            +AG G + G+ G + +E YI+GI+SI PR Y QG D D D + P
Sbjct  221  ANMAGHGISAGVNHGFTRY-FEEHGYIMGIMSIRPRTGYQQG--VPKDFRKFDNMDFYFP  277

Query  128  ALDAIGFQDLTTNKMAWWDETITADGEKQLKSVGKQPAWLDYMTNYNKVFGNFAIKDSEM  187
            +G Q++ ++ + +G + G P + +Y + N+V G+F + +
Sbjct  278  EFAHLGEQEIKNEELYLNESDAANEG-----TFGYTPRYAEYKYSQNEVHGDF--RGNMA  330

Query  188  FMTLNRNYEMDENKSIADLTTYIDPEKYNYVFADTSLNAMNFWVQLGIGAKVRRKMSAKV  247
            F LNR ++ N + TT+++ N VFA + +WVQ+ K R M
Sbjct  331  FWHLNRIFKEKPNLN----TTFVECNPSNRVFATAETSDDKYWVQIYQDIKALRLMPKYG  386

Query  248  IPNL  251
            P L
Sbjct  387  TPML  390

>gi|649555287|gb|KDS61824.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
 gi|649560568|gb|KDS66876.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
 gi|649561020|gb|KDS67307.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
 gi|649562724|gb|KDS68908.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 6]
Length=541

 Score = 73.9 bits (180), Expect = 7e-12, Method: Compositional matrix adjust.
 Identities = 68/244 (28%), Positives = 111/244 (45%), Gaps = 22/244 (9%)

Query  14   LNRIAVSGGTYQDWIQT---VYTNDYIERSETPVYEGGFSSEIIFQEVISNSATEN-EPL  69
            R A SG Y + I + V ++D R + P + GG + I EV+ S+T++ P
Sbjct  314  FERNARSGSRYIEQILSHFGVRSSD--ARLQRPQFLGGGRTPISVSEVLQTSSTDSTSPQ  371

Query  70   GTLAGRGQNTGMKGGTVKIKIDEPSYIIGIVSITPRIDYSQGNRFDVDLDTLD--DLHKP  127
            +AG G + G+ G + +E YI+GI+SI PR Y QG D D D + P
Sbjct  372  ANMAGHGISAGVNHGFTRY-FEEHGYIMGIMSIRPRTGYQQG--VPKDFRKFDNMDFYFP  428

Query  128  ALDAIGFQDLTTNKMAWWDETITADGEKQLKSVGKQPAWLDYMTNYNKVFGNFAIKDSEM  187
            +G Q++ ++ + +G + G P + +Y + N+V G+F + +
Sbjct  429  EFAHLGEQEIKNEELYLNESDAANEG-----TFGYTPRYAEYKYSQNEVHGDF--RGNMA  481

Query  188  FMTLNRNYEMDENKSIADLTTYIDPEKYNYVFADTSLNAMNFWVQLGIGAKVRRKMSAKV  247
            F LNR ++ N + TT+++ N VFA + +WVQ+ K R M
Sbjct  482  FWHLNRIFKEKPNLN----TTFVECNPSNRVFATAETSDDKYWVQIYQDIKALRLMPKYG  537

Query  248  IPNL  251
            P L
Sbjct  538  TPML  541

>gi|609718276|emb|CDN73650.1| conserved hypothetical protein [Elizabethkingia anophelis]
Length=537

 Score = 70.1 bits (170), Expect = 2e-10, Method: Compositional matrix adjust.
 Identities = 71/251 (28%), Positives = 111/251 (44%), Gaps = 20/251 (8%)

Query  1    MDTLNLAKKVYDMLNRIAVSGGTYQDWIQTVY---TNDYIERSETPVYEGGFSSEIIFQE  57
            ++ L A K+ + L + A +G Y + I + + T+D R + P + GG S I+ E
Sbjct  291  VNDLRRAFKLQEWLEKNARAGSRYAESILSFFGVKTSD--GRLQRPEFLGGNKSPIMISE  348

Query  58   VISNSATENE-PLGTLAGRGQNTGMKGGTVKIKIDEPSYIIGIVSITPRIDYSQGNRFDV  116
            V+ SAT++ P G +AG G G GG + +E Y+IG++S+ P+ YSQG
Sbjct  349  VLQQSATDSTTPQGNMAGHGIGIGKDGGFSRF-FEEHGYVIGLMSVIPKTSYSQGIPRHF  407

Query  117  DLDTLDDLHKPALDAIGFQDLTTNKMAWWDETITADGEKQLKSVGKQPAWLDYMTNYNKV  176
            D P + IG Q + NK + D E G P + +Y + + V
Sbjct  408  SKSDKFDYFWPQFEHIGEQPV-YNKEIFAKNIDAFDSEAVF---GYLPRYSEYKFSPSTV  463

Query  177  FGNFAIKDSEMFMTLNRNYEMDENKSIADLTTYIDPEKYNYVFA---DTSLNAMNFWVQL  233
            G+F KD F L R ++ D+ + D + +FA DT F+ L
Sbjct  464  HGDF--KDDLYFWHLGRIFDTDKPPVLNQSFIECDKNALSRIFAVEDDTD----KFYCHL  517

Query  234  GIGAKVRRKMS  244
            +RKMS
Sbjct  518  YQKITAKRKMS  528

>gi|547312923|ref|WP_022044635.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
 gi|524208404|emb|CCZ76639.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
Length=338

 Score = 67.4 bits (163), Expect = 8e-10, Method: Compositional matrix adjust.
 Identities = 72/286 (25%), Positives = 113/286 (40%), Gaps = 44/286 (15%)

Query  4    LNLAKKVYDMLNRIAVSGGTYQDWIQTVY-TNDYIERSETPVYEGGFSSEIIFQEVISNS  62
            L L K+ + ++R+ VSGG D +T++ T P + G ++Q I+ S
Sbjct  57   LRLRTKIQNWMDRLFVSGGRVGDVFRTLWGTKSSAIYVNKPDFLG------VWQASINPS  110

Query  63   ATENEPLGTLAGRGQNTGMKGGTVKIKID------------EPSYIIGIVSITPRIDYSQ  110
            G+ +G N G V D EP + I + P YSQ
Sbjct  111  NVRAMANGSASGEDANLGQLAACVDRYCDFSGHSGIDYYAKEPGTFMLITMLVPEPAYSQ  170

Query  111  GNRFDVDLDTLDDLHKPALDAIGFQDLTTNKMA-----------------WWDETITAD-  152
            G D+ + D P L+ IGFQ + ++ + W+ T T
Sbjct  171  GLHPDLASISFGDDFNPELNGIGFQLVPRHRFSMMPRGFNFTGLDQEASPWFGHTGTGVL  230

Query  153  GEKQLKSVGKQPAWLDYMTNYNKVFGNFAIKDSEMFMTLNRN---YEMDENKSIAD----  205
            + + SVG++ AW T+Y+++ G+FA + + L R Y D+
Sbjct  231  VDPNMVSVGEEVAWSWLRTDYSRLHGDFAQNGNYQYWVLTRRFTTYFPDDGTGFYQDGEY  290

Query  206  LTTYIDPEKYNYVFADTSLNAMNFWVQLGIGAKVRRKMSAKVIPNL  251
            TYI+P + YVF D +L A NF V +SA +P L
Sbjct  291  TGTYINPLDWQYVFVDQTLMAGNFAYYGTFDLNVTSSLSANYMPYL  336

>gi|639237429|ref|WP_024568106.1| hypothetical protein [Elizabethkingia anophelis]
Length=546

 Score = 66.2 bits (160), Expect = 3e-09, Method: Compositional matrix adjust.
 Identities = 67/252 (27%), Positives = 116/252 (46%), Gaps = 22/252 (9%)

Query  1    MDTLNLAKKVYDMLNRIAVSGGTYQDWIQTVY---TNDYIERSETPVYEGGFSSEIIFQE  57
            ++ L A K+ + L + A +G Y + I + + T+D R + P + GG + I+ E
Sbjct  300  INDLRRAFKLQEWLEKNARAGSRYAESILSFFGVKTSD--GRLQRPEFLGGNKTPILISE  357

Query  58   VISNSATENE-PLGTLAGRGQNTGMKGGTVKIKIDEPSYIIGIVSITPRIDYSQG-NRFD  115
            V+ S+T++ P G +AG G + G +GG K +E Y+IG++S+ P+ YSQG R
Sbjct  358  VLQQSSTDSTTPQGNMAGHGISVGKEGGFSKF-FEEHGYVIGLMSVIPKTSYSQGIPRHF  416

Query  116  VDLDTLDDLHKPALDAIGFQDLTTNKMAWWDETITADGEKQLKS---VGKQPAWLDYMTN  172
            D D P + IG Q + +++ I A S G P + +Y +
Sbjct  417  SKFDKFDYFW-PQFEHIGEQPV-------YNKEIFAKNVGDYDSGGVFGYVPRYSEYKYS  468

Query  173  YNKVFGNFAIKDSEMFMTLNRNYEMDENKSIADLTTYIDPEKYNYVFADTSLNAMNFWVQ  232
            + + G+F KD+ F L R ++ + ++ + +FA N+ F+
Sbjct  469  PSTIHGDF--KDTLYFWHLGRIFDSSAPPKLNRDFIEVNKSGLSRIFA-VEDNSDKFYCH  525

Query  233  LGIGAKVRRKMS  244
            L +RKMS
Sbjct  526  LYQKITAKRKMS  537

>gi|599087551|gb|AHN52701.1| major capsid protein, partial [uncultured Gokushovirinae]
Length=220

 Score = 55.1 bits (131), Expect = 5e-06, Method: Compositional matrix adjust.
 Identities = 44/136 (32%), Positives = 62/136 (46%), Gaps = 2/136 (1%)

Query  1    MDTLNLAKKVYDMLNRIAVSGGTYQDWIQTVY-TNDYIERSETPVYEGGFSSEIIFQEVI  59
            ++ L A ++ +L R A G Y + IQ + R + P Y GG ++ II +V
Sbjct  78   INQLRQAFQIQKLLERDARGGTRYTEIIQAHFGVTSPDARLQRPEYLGGGTTPIIISQVP  137

Query  60   SNSATENEPLGTLAGRGQNTGMKGGTVKIKIDEPSYIIGIVSITPRIDYSQGNRFDVDLD  119
            S ++ P GTLA G T K G K E IIG+ S+ + Y QG
Sbjct  138  QTSESDGTPQGTLAAYGTATMRKAGFTK-SFTEHCVIIGLASVRADLTYQQGLERMWSRQ  196

Query  120  TLDDLHKPALDAIGFQ  135
            T D++ PAL IG Q
Sbjct  197  TRYDVYWPALAMIGEQ  212

>gi|565841287|ref|WP_023924568.1| hypothetical protein [Prevotella nigrescens]
 gi|564729907|gb|ETD29851.1| hypothetical protein HMPREF1173_00033 [Prevotella nigrescens CC14M]
Length=656

 Score = 55.8 bits (133), Expect = 9e-06, Method: Compositional matrix adjust.
 Identities = 63/248 (25%), Positives = 107/248 (43%), Gaps = 21/248 (8%)

Query  13   MLNRI-AVSGGTYQDWIQTVYTNDYIE-RSETPVYEGGFSSEIIFQEVISNS-------A  63
            ML R A +G Y + I + E R + GGF ++I EV++ S A
Sbjct  405  MLERTRAANGLDYSNQIAAHFGFKVPESRKNCASFIGGFDNQISISEVVTTSNGSVDGTA  464

Query  64   TENEPLGTLAGRGQNTGMKGGTVKIKIDEPSYIIGIVSITPRIDY--SQGNRFDVDLDTL  121
            + +G + G+G M G + + E I+ I SI P++DY + + F+ +
Sbjct  465  STGSVVGQVFGKGIG-AMNSGHISYDVKEHGLIMCIYSIAPQVDYDARELDPFNRKF-SR  522

Query  122  DDLHKPALDAIGFQDLTTNKMAWWDETITADGEKQLKSV-GKQPAWLDYMTNYNKVFGNF  180
            +D +P + +G Q + + + + +D Q +V G +L+Y T + +FG F
Sbjct  523  EDYFQPEFENLGMQPVIQSDLCLCINSAKSDSSDQHNNVLGYSARYLEYKTARDIIFGEF  582

Query  181  AIKDS-EMFMTLNRNYEMDENK-SIADLTTYIDPEKYNYVFA---DTSLNAMNFWVQLGI  235
            S + T NY + K S+ DL +DP+ +FA + S++ F V
Sbjct  583  MSGGSLSAWATPKNNYTFEFGKLSLPDLL--VDPKVLEPIFAVKYNGSMSTDQFLVNSYF  640

Query  236  GAKVRRKM  243
            K R M
Sbjct  641  DVKAIRPM  648

Lambda      K        H        a         alpha
   0.316    0.134    0.392    0.792     4.96

Gapped
Lambda      K        H        a         alpha    sigma
   0.267   0.0410    0.140     1.90     42.6     43.6

Effective search space used: 1124108458389
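
The bit scores in this report can be checked against the raw scores shown in parentheses in each "Score =" line. The sketch below applies the standard Karlin-Altschul conversion, bits = (lambda * S - ln K) / ln 2, using the gapped Lambda and K values from the statistics at the end of the report. The function name `bit_score` and the `REPORTED` list are illustrative only and not part of BLAST. It is also an assumption that these summary parameters apply unchanged to every hit, since the alignments were scored with compositional matrix adjustment.

```python
import math

# Gapped Karlin-Altschul parameters copied from the report footer.
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410


def bit_score(raw_score, lam=GAPPED_LAMBDA, k=GAPPED_K):
    """Convert a raw alignment score to bits: (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


# (raw score, reported bit score) pairs copied from the "Score =" lines.
REPORTED = [
    (189, 77.4), (182, 74.7), (184, 75.5), (180, 73.9), (170, 70.1),
    (163, 67.4), (160, 66.2), (131, 55.1), (133, 55.8),
]

if __name__ == "__main__":
    for raw, bits in REPORTED:
        print(f"raw={raw:3d}  computed={bit_score(raw):6.2f}  reported={bits}")
```

Because Lambda and K are printed to only three significant figures, the computed values can differ from the reported ones by a few hundredths of a bit. The E-values are not reproduced here, since they also depend on per-hit length and composition corrections that the report does not list.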