BLASTP 2.2.24 [Aug-08-2010]

Reference for compositional score matrix adjustment: Altschul, Stephen F.,
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.

Query= Eten_5365_orf1
         (131 letters)

Database: All non-redundant GenBank CDS
translations+PDB+SwissProt+PIR+PRF excluding environmental samples
from WGS projects
           14,777,732 sequences; 5,058,227,080 total letters

                                                                    Score     E
Sequences producing significant alignments:                         (bits)  Value

gi|325116818|emb|CBZ52371.1| hypothetical protein NCLIV_021590 [...   116   1e-24
gi|221483953|gb|EEE22257.1| conserved hypothetical protein [Toxo...   115   2e-24
gi|237836641|ref|XP_002367618.1| hypothetical protein TGME49_003...   115   2e-24
gi|156099202|ref|XP_001615603.1| hypothetical protein [Plasmodiu...    78   3e-13
gi|221058853|ref|XP_002260072.1| hypothetical protein, conserved...    78   5e-13
gi|82915322|ref|XP_729049.1| hypothetical protein [Plasmodium yo...    77   1e-12
gi|68066917|ref|XP_675430.1| hypothetical protein [Plasmodium be...    76   1e-12
gi|296005430|ref|XP_002809037.1| conserved Plasmodium protein, u...    70   1e-10
gi|156083681|ref|XP_001609324.1| hypothetical protein [Babesia b...    67   1e-09
gi|85001067|ref|XP_955252.1| hypothetical protein [Theileria ann...    66   1e-09
gi|71027883|ref|XP_763585.1| hypothetical protein [Theileria par...    61   6e-08
gi|70954438|ref|XP_746266.1| hypothetical protein [Plasmodium ch...    58   5e-07
gi|310794771|gb|EFQ30232.1| transcription initiation factor IIE ...    35   3.2
gi|328947395|ref|YP_004364732.1| alpha amylase catalytic region ...    35   4.9

>gi|325116818|emb|CBZ52371.1| hypothetical protein NCLIV_021590 [Neospora caninum Liverpool]
          Length = 417

 Score = 116 bits (290), Expect = 1e-24, Method: Compositional matrix adjust.
 Identities = 61/126 (48%), Positives = 82/126 (65%), Gaps = 5/126 (3%)

Query: 3   LGPSGDS--TCDLYDKKKCSNCYKNICCLILYPTGDAAAEDSRFKIDKEVKQLWDEVELP 60
           LG + +S CD+Y +KC+NC +N+C L+LYP G+ E +R +D EVK LWD + LP
Sbjct: 288 LGSAANSAAVCDIYAARKCANCRQNLCGLLLYPLGEEIYEQARMDLDAEVKGLWDSIVLP 347

Query: 61  PLEAILLHFNSANLLTQQKPEAGTNN--QAKQRGQEKRQRTRHLNKFRRIQNTHLFTADE 118
           PLE I+ + LT+Q A + Q K+R +EKR +R +KFRRI NTHLFTA+E
Sbjct: 348 PLEDIIKECDPNRSLTRQTAAAAAQDKTQLKRRAEEKRV-SRFRSKFRRIHNTHLFTAEE 406

Query: 119 LRAFAN 124
           LRAF N
Sbjct: 407 LRAFTN 412

>gi|221483953|gb|EEE22257.1| conserved hypothetical protein [Toxoplasma gondii GT1]
 gi|221505235|gb|EEE30889.1| conserved hypothetical protein [Toxoplasma gondii VEG]
          Length = 607

 Score = 115 bits (288), Expect = 2e-24, Method: Compositional matrix adjust.
 Identities = 57/122 (46%), Positives = 77/122 (63%), Gaps = 3/122 (2%)

Query: 9   STCDLYDKKKCSNCYKNICCLILYPTGDAAAEDSRFKIDKEVKQLWDEVELPPLEAILLH 68
           + CD+Y +KC+NC +N+C L+LYP G+ E +R +D EVK LWD + LPPLE I+
Sbjct: 486 AVCDIYATRKCANCRQNLCGLLLYPLGEEIYEQARMDLDAEVKGLWDSIVLPPLEDIIKQ 545

Query: 69  FNSANLLTQQKPEAGTN--NQAKQRGQEKRQRTRHLNKFRRIQNTHLFTADELRAFANGT 126
           + + L++Q A Q K+R KR +R +KFRRI NTHLFTA+ELRAF N
Sbjct: 546 CDPSRSLSRQNALAAAQEKTQLKRRADAKRV-SRFRSKFRRIHNTHLFTAEELRAFTNEQ 604

Query: 127 NP 128
           P
Sbjct: 605 AP 606

>gi|237836641|ref|XP_002367618.1| hypothetical protein TGME49_003360 [Toxoplasma gondii ME49]
 gi|211965282|gb|EEB00478.1| hypothetical protein TGME49_003360 [Toxoplasma gondii ME49]
          Length = 607

 Score = 115 bits (288), Expect = 2e-24, Method: Compositional matrix adjust.
 Identities = 57/122 (46%), Positives = 77/122 (63%), Gaps = 3/122 (2%)

Query: 9   STCDLYDKKKCSNCYKNICCLILYPTGDAAAEDSRFKIDKEVKQLWDEVELPPLEAILLH 68
           + CD+Y +KC+NC +N+C L+LYP G+ E +R +D EVK LWD + LPPLE I+
Sbjct: 486 AVCDIYATRKCANCRQNLCGLLLYPLGEEIYEQARMDLDAEVKGLWDSIVLPPLEDIIKQ 545

Query: 69  FNSANLLTQQKPEAGTN--NQAKQRGQEKRQRTRHLNKFRRIQNTHLFTADELRAFANGT 126
           + + L++Q A Q K+R KR +R +KFRRI NTHLFTA+ELRAF N
Sbjct: 546 CDPSRSLSRQNALAAAQEKTQLKRRADAKRV-SRFRSKFRRIHNTHLFTAEELRAFTNEQ 604

Query: 127 NP 128
           P
Sbjct: 605 AP 606

>gi|156099202|ref|XP_001615603.1| hypothetical protein [Plasmodium vivax SaI-1]
 gi|148804477|gb|EDL45876.1| hypothetical protein, conserved [Plasmodium vivax]
          Length = 548

 Score = 78.2 bits (191), Expect = 3e-13, Method: Composition-based stats.
 Identities = 39/112 (34%), Positives = 63/112 (56%), Gaps = 5/112 (4%)

Query: 9   STCDLYDKKKCSNCYKNICCLILYPTGDAAAEDSRFKIDKEVKQLWDEVELPPLEAILLH 68
           S CD+Y K KC NC+ N+ I +P E R+ I ++K LWDE+ LP L++IL
Sbjct: 435 SFCDIYAKNKCDNCFYNLKGYIFFPLSYEHIEQQRYSITNDIKNLWDEITLPNLDSILKE 494

Query: 69  FNSANLLTQQKPEAGTNNQAKQRGQEKRQRTRHLNKFRRIQNTHLFTADELR 120
           + + + ++ A +R ++ + + + K +RI NTHLFTADE++
Sbjct: 495 YK-----LKSTNKVFVHDNAPKRKNKEDKFSPVVKKMKRIYNTHLFTADEIK 541

>gi|221058853|ref|XP_002260072.1| hypothetical protein, conserved in Plasmodium species [Plasmodium knowlesi strain H]
 gi|193810145|emb|CAQ41339.1| hypothetical protein, conserved in Plasmodium species [Plasmodium knowlesi strain H]
          Length = 551

 Score = 77.8 bits (190), Expect = 5e-13, Method: Composition-based stats.
 Identities = 40/112 (35%), Positives = 61/112 (54%), Gaps = 5/112 (4%)

Query: 9   STCDLYDKKKCSNCYKNICCLILYPTGDAAAEDSRFKIDKEVKQLWDEVELPPLEAILLH 68
           S CD+Y K KC NC+ N+ I +P E R+ I ++K LWDE+ LP L+ IL
Sbjct: 438 SFCDIYAKNKCENCFYNLKGYIFFPLSYEHIEQQRYSITNDIKNLWDEITLPNLDDILKE 497

Query: 69  FNSANLLTQQKPEAGTNNQAKQRGQEKRQRTRHLNKFRRIQNTHLFTADELR 120
           + + + N+ A +R ++ + + K +RI NTHLFTADE++
Sbjct: 498 YK-----LKSTNKVFVNDNAPKRKNKEDKFAPVVKKMKRIYNTHLFTADEIK 544

>gi|82915322|ref|XP_729049.1| hypothetical protein [Plasmodium yoelii yoelii str. 17XNL]
 gi|23485871|gb|EAA20614.1| hypothetical protein [Plasmodium yoelii yoelii]
          Length = 2329

 Score = 76.6 bits (187), Expect = 1e-12, Method: Composition-based stats.
 Identities = 39/110 (35%), Positives = 61/110 (55%), Gaps = 5/110 (4%)

Query: 11   CDLYDKKKCSNCYKNICCLILYPTGDAAAEDSRFKIDKEVKQLWDEVELPPLEAILLHFN 70
            CD+Y K KC NC+ N+ I +P E+ R+ I ++K LWD + LP L+ IL +
Sbjct: 2218 CDIYSKNKCDNCFYNLKGYIFFPLSYQHIENERYSITNDIKNLWDNINLPNLDNILKEYK 2277

Query: 71   SANLLTQQKPEAGTNNQAKQRGQEKRQRTRHLNKFRRIQNTHLFTADELR 120
            + + +N+ A +R + + T + K +RI NTHLFTADE++
Sbjct: 2278 -----LKSTNKIFSNDNAPKRKNKDDKFTPIIKKMKRIYNTHLFTADEIK 2322

>gi|68066917|ref|XP_675430.1| hypothetical protein [Plasmodium berghei strain ANKA]
 gi|56494612|emb|CAH93580.1| hypothetical protein PB100065.00.0 [Plasmodium berghei]
          Length = 548

 Score = 76.3 bits (186), Expect = 1e-12, Method: Composition-based stats.
 Identities = 39/110 (35%), Positives = 61/110 (55%), Gaps = 5/110 (4%)

Query: 11  CDLYDKKKCSNCYKNICCLILYPTGDAAAEDSRFKIDKEVKQLWDEVELPPLEAILLHFN 70
           CD+Y K KC NC+ N+ I +P E+ R+ I ++K LWD + LP L+ IL +
Sbjct: 437 CDIYSKSKCDNCFYNLKGYIFFPLSYQHIENERYSITNDIKNLWDNITLPNLDNILKEYK 496

Query: 71  SANLLTQQKPEAGTNNQAKQRGQEKRQRTRHLNKFRRIQNTHLFTADELR 120
           + + +N+ A +R + + T + K +RI NTHLFTADE++
Sbjct: 497 -----LKSTNKIFSNDNAPKRKNKDDKFTPIIKKMKRIYNTHLFTADEIK 541

>gi|296005430|ref|XP_002809037.1| conserved Plasmodium protein, unknown function [Plasmodium falciparum 3D7]
 gi|225631979|emb|CAX64318.1| conserved Plasmodium protein, unknown function [Plasmodium falciparum 3D7]
          Length = 542

 Score = 70.1 bits (170), Expect = 1e-10, Method: Composition-based stats.
 Identities = 37/112 (33%), Positives = 58/112 (51%), Gaps = 5/112 (4%)

Query: 9   STCDLYDKKKCSNCYKNICCLILYPTGDAAAEDSRFKIDKEVKQLWDEVELPPLEAILLH 68
           + CD+Y K KC NC+ N+ + +P E R+ I ++K LWDE+ LP L+ IL
Sbjct: 429 NVCDIYAKTKCDNCFYNLKGYLFFPLSYEHIEKERYSITNDMKMLWDEISLPSLDNILKE 488

Query: 69  FNSANLLTQQKPEAGTNNQAKQRGQEKRQRTRHLNKFRRIQNTHLFTADELR 120
           + + N ++ + + + L K +RI NTHLFTADE++
Sbjct: 489 YK-----LKSTNNVFMNENLPKKKNKDDKLSPILKKMKRIYNTHLFTADEIK 535

>gi|156083681|ref|XP_001609324.1| hypothetical protein [Babesia bovis T2Bo]
 gi|154796575|gb|EDO05756.1| hypothetical protein BBOV_IV001590 [Babesia bovis]
          Length = 315

 Score = 66.6 bits (161), Expect = 1e-09, Method: Compositional matrix adjust.
 Identities = 42/124 (33%), Positives = 57/124 (45%), Gaps = 18/124 (14%)

Query: 11  CDLYDKKKCSNCYKNICCLILYPTGDAAAEDSRFKIDKEVKQLWDEVELPPLEAILLHFN 70
           C LY KC C N ++L+P G E R ++K LWD V LPP+E +L +N
Sbjct: 189 CSLYTVTKCVECSTNFKDVLLFPLGKEQYEQDRLNFASDIKDLWDSVTLPPIEDLLKEYN 248

Query: 71  SANL-------------LTQQKPEAGTNNQAKQRGQEKRQRTRHLNKFRRIQNTHLFTAD 117
           +++ L +GT R Q +Q + K R+I NTHLFTA
Sbjct: 249 MSHVERTFINVPRWVITLLWITDCSGTGG----RRQRNKQGIAGI-KMRKIYNTHLFTAQ 303

Query: 118 ELRA 121
           EL A
Sbjct: 304 ELSA 307

>gi|85001067|ref|XP_955252.1| hypothetical protein [Theileria annulata strain Ankara]
 gi|65303398|emb|CAI75776.1| hypothetical protein, conserved [Theileria annulata]
          Length = 1365

 Score = 66.2 bits (160), Expect = 1e-09, Method: Composition-based stats.
 Identities = 30/87 (34%), Positives = 48/87 (55%), Gaps = 1/87 (1%)

Query: 11   CDLYDKKKCSNCYKNICCLILYPTGDAAAEDSRFKIDKEVKQLWDEVELPPLEAILLHFN 70
            C +Y KC CY N+ LIL+P G E RFK+D ++K LWD V +P ++ +L +N
Sbjct: 1201 CGVYSSSKCQECYSNVEGLILFPLGKDQFEHDRFKLDHDIKNLWDSVAIPSMDQLLRDYN 1260

Query: 71   -SANLLTQQKPEAGTNNQAKQRGQEKR 96
            S ++T Q + + + +G E +
Sbjct: 1261 ISQTVITFQPVQTEKKKRKEAKGYESK 1287

>gi|71027883|ref|XP_763585.1| hypothetical protein [Theileria parva strain Muguga]
 gi|68350538|gb|EAN31302.1| hypothetical protein TP03_0557 [Theileria parva]
          Length = 447

 Score = 60.8 bits (146), Expect = 6e-08, Method: Composition-based stats.
 Identities = 26/62 (41%), Positives = 36/62 (58%)

Query: 9   STCDLYDKKKCSNCYKNICCLILYPTGDAAAEDSRFKIDKEVKQLWDEVELPPLEAILLH 68
           S C +Y KC C+ N+ L+L+P G E RFKID +VK LWD V +P + +L
Sbjct: 323 SQCGIYSSSKCQECHSNLEGLVLFPLGKDQFERDRFKIDHDVKNLWDAVVIPSTDQLLRE 382

Query: 69  FN 70
           +N
Sbjct: 383 YN 384

>gi|70954438|ref|XP_746266.1| hypothetical protein [Plasmodium chabaudi chabaudi]
 gi|56526815|emb|CAH77114.1| hypothetical protein PC103304.00.0 [Plasmodium chabaudi chabaudi]
          Length = 207

 Score = 57.8 bits (138), Expect = 5e-07, Method: Compositional matrix adjust.
 Identities = 23/56 (41%), Positives = 33/56 (58%)

Query: 11  CDLYDKKKCSNCYKNICCLILYPTGDAAAEDSRFKIDKEVKQLWDEVELPPLEAIL 66
           CD+Y K KC NC+ N+ I +P E+ R+ I ++K LWD + LP L+ IL
Sbjct: 113 CDIYSKSKCDNCFYNLKGYIFFPLSYQHIENERYSITNDIKNLWDNITLPNLDNIL 168

>gi|310794771|gb|EFQ30232.1| transcription initiation factor IIE subunit beta [Glomerella graminicola M1.001]
          Length = 295

 Score = 35.4 bits (80), Expect = 3.2, Method: Compositional matrix adjust.
 Identities = 19/69 (27%), Positives = 37/69 (53%), Gaps = 4/69 (5%)

Query: 45  KIDKEVKQLWDEVELPPLEAILLHFNSANLLTQQKPEAGTNNQAKQRGQEKRQRTRHLNK 104
           ++D E K++W+ VE+P + I+ S QKP + + K+ +K+Q+ R + +
Sbjct: 218 QVDDEFKRMWNAVEVPTTDDIVKKLVSVG----QKPASADPSTIKKVDSKKQQKKRAVRR 273

Query: 105 FRRIQNTHL 113
           + NTH+
Sbjct: 274 TGKTTNTHM 282

>gi|328947395|ref|YP_004364732.1| alpha amylase catalytic region [Treponema succinifaciens DSM 2489]
 gi|328447719|gb|AEB13435.1| alpha amylase catalytic region [Treponema succinifaciens DSM 2489]
          Length = 1214

 Score = 34.7 bits (78), Expect = 4.9, Method: Composition-based stats.
 Identities = 23/83 (27%), Positives = 36/83 (43%), Gaps = 9/83 (10%)

Query: 11  CDLYDKKKCSNCYKNICCLILYPTGDAAAEDSRFKIDKEVKQLWDEVELPPLEAILLHFN 70
           C +Y + K +C+KN+ D S+ KIDK + Q + E PP+ +
Sbjct: 79  CMIYRRDKAPDCFKNL-------LSDLDRNFSKVKIDKLLLQFME--EFPPVNVFKKEIS 129

Query: 71  SANLLTQQKPEAGTNNQAKQRGQ 93
           + L Q +AGT + R Q
Sbjct: 130 AEEFLEQNCIDAGTKMKRSNREQ 152

  Database: All non-redundant GenBank CDS
  translations+PDB+SwissProt+PIR+PRF excluding environmental samples
  from WGS projects
    Posted date:  Jul 22, 2011  4:42 PM
  Number of letters in database: 5,058,227,080
  Number of sequences in database:  14,777,732

Lambda     K      H
   0.317    0.132    0.398

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 14777732
Number of Hits to DB: 1,346,317,846
Number of extensions: 48654432
Number of successful extensions: 120826
Number of sequences better than 10.0: 27
Number of HSP's gapped: 121942
Number of HSP's successfully gapped: 27
Length of query: 131
Length of database: 5,058,227,080
Length adjustment: 96
Effective length of query: 35
Effective length of database: 3,639,564,808
Effective search space: 127384768280
Effective search space used: 127384768280
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 76 (33.9 bits)
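
The figures in the statistics footer above are related by simple arithmetic, and the sketch below re-derives a few of them as a sanity check. It assumes the standard Karlin-Altschul relations (bit score = (lambda * S - ln K) / ln 2, and E = effective search space * 2^-bits) together with the gapped Lambda and K printed in the footer. The function names are hypothetical and not part of BLAST. Because the hits were scored with compositional adjustment, per-alignment parameters may differ slightly from the footer values, so the bit scores and E-values should be expected to agree only approximately.

```python
import math

# Values copied from the statistics footer of the report.
QUERY_LEN = 131
DB_LETTERS = 5_058_227_080
DB_SEQS = 14_777_732
LENGTH_ADJUSTMENT = 96
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410


def effective_lengths(query_len, db_letters, db_seqs, adjustment):
    """Return (effective query length, effective db length, search space).

    The length adjustment is subtracted once from the query and once
    from every database sequence.
    """
    eff_query = query_len - adjustment
    eff_db = db_letters - db_seqs * adjustment
    return eff_query, eff_db, eff_query * eff_db


def bit_score(raw_score, lam=GAPPED_LAMBDA, k=GAPPED_K):
    """Convert a raw alignment score to bits (Karlin-Altschul, assumed)."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(bits, search_space):
    """Expected number of chance hits scoring at least `bits`."""
    return search_space * 2.0 ** (-bits)


eff_query, eff_db, search_space = effective_lengths(
    QUERY_LEN, DB_LETTERS, DB_SEQS, LENGTH_ADJUSTMENT
)
top_bits = bit_score(290)  # raw score of the Neospora caninum hit
top_e = expect_value(top_bits, search_space)

print(f"effective query length: {eff_query}")
print(f"effective database length: {eff_db:,}")
print(f"effective search space: {search_space}")
print(f"top hit: {top_bits:.1f} bits, E ~ {top_e:.0e}")
```

The integer quantities reproduce the footer exactly (35, 3,639,564,808 and 127384768280). The recomputed bit score and E-value for the top hit land close to the reported 116 bits and 1e-24, which is as much agreement as the rounded Lambda and K allow.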