BLASTP 2.2.24 [Aug-08-2010] 

Reference for compositional score matrix adjustment: Altschul, Stephen F., 
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.

Query= Eten_5365_orf1
         (131 letters)

Database: All non-redundant GenBank CDS
translations+PDB+SwissProt+PIR+PRF excluding environmental samples
from WGS projects 
           14,777,732 sequences; 5,058,227,080 total letters



                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

gi|325116818|emb|CBZ52371.1| hypothetical protein NCLIV_021590 [...   116     1e-24
gi|221483953|gb|EEE22257.1| conserved hypothetical protein [Toxo...   115     2e-24
gi|237836641|ref|XP_002367618.1| hypothetical protein TGME49_003...   115     2e-24
gi|156099202|ref|XP_001615603.1| hypothetical protein [Plasmodiu...    78     3e-13
gi|221058853|ref|XP_002260072.1| hypothetical protein, conserved...    78     5e-13
gi|82915322|ref|XP_729049.1| hypothetical protein [Plasmodium yo...    77     1e-12
gi|68066917|ref|XP_675430.1| hypothetical protein [Plasmodium be...    76     1e-12
gi|296005430|ref|XP_002809037.1| conserved Plasmodium protein, u...    70     1e-10
gi|156083681|ref|XP_001609324.1| hypothetical protein [Babesia b...    67     1e-09
gi|85001067|ref|XP_955252.1| hypothetical protein [Theileria ann...    66     1e-09
gi|71027883|ref|XP_763585.1| hypothetical protein [Theileria par...    61     6e-08
gi|70954438|ref|XP_746266.1| hypothetical protein [Plasmodium ch...    58     5e-07
gi|310794771|gb|EFQ30232.1| transcription initiation factor IIE ...    35     3.2  
gi|328947395|ref|YP_004364732.1| alpha amylase catalytic region ...    35     4.9  

>gi|325116818|emb|CBZ52371.1| hypothetical protein NCLIV_021590 [Neospora caninum Liverpool]
          Length = 417

 Score =  116 bits (290), Expect = 1e-24,   Method: Compositional matrix adjust.
 Identities = 61/126 (48%), Positives = 82/126 (65%), Gaps = 5/126 (3%)

Query: 3   LGPSGDS--TCDLYDKKKCSNCYKNICCLILYPTGDAAAEDSRFKIDKEVKQLWDEVELP 60
           LG + +S   CD+Y  +KC+NC +N+C L+LYP G+   E +R  +D EVK LWD + LP
Sbjct: 288 LGSAANSAAVCDIYAARKCANCRQNLCGLLLYPLGEEIYEQARMDLDAEVKGLWDSIVLP 347

Query: 61  PLEAILLHFNSANLLTQQKPEAGTNN--QAKQRGQEKRQRTRHLNKFRRIQNTHLFTADE 118
           PLE I+   +    LT+Q   A   +  Q K+R +EKR  +R  +KFRRI NTHLFTA+E
Sbjct: 348 PLEDIIKECDPNRSLTRQTAAAAAQDKTQLKRRAEEKRV-SRFRSKFRRIHNTHLFTAEE 406

Query: 119 LRAFAN 124
           LRAF N
Sbjct: 407 LRAFTN 412


>gi|221483953|gb|EEE22257.1| conserved hypothetical protein [Toxoplasma gondii GT1]
 gi|221505235|gb|EEE30889.1| conserved hypothetical protein [Toxoplasma gondii VEG]
          Length = 607

 Score =  115 bits (288), Expect = 2e-24,   Method: Compositional matrix adjust.
 Identities = 57/122 (46%), Positives = 77/122 (63%), Gaps = 3/122 (2%)

Query: 9   STCDLYDKKKCSNCYKNICCLILYPTGDAAAEDSRFKIDKEVKQLWDEVELPPLEAILLH 68
           + CD+Y  +KC+NC +N+C L+LYP G+   E +R  +D EVK LWD + LPPLE I+  
Sbjct: 486 AVCDIYATRKCANCRQNLCGLLLYPLGEEIYEQARMDLDAEVKGLWDSIVLPPLEDIIKQ 545

Query: 69  FNSANLLTQQKPEAGTN--NQAKQRGQEKRQRTRHLNKFRRIQNTHLFTADELRAFANGT 126
            + +  L++Q   A      Q K+R   KR  +R  +KFRRI NTHLFTA+ELRAF N  
Sbjct: 546 CDPSRSLSRQNALAAAQEKTQLKRRADAKRV-SRFRSKFRRIHNTHLFTAEELRAFTNEQ 604

Query: 127 NP 128
            P
Sbjct: 605 AP 606


>gi|237836641|ref|XP_002367618.1| hypothetical protein TGME49_003360 [Toxoplasma gondii ME49]
 gi|211965282|gb|EEB00478.1| hypothetical protein TGME49_003360 [Toxoplasma gondii ME49]
          Length = 607

 Score =  115 bits (288), Expect = 2e-24,   Method: Compositional matrix adjust.
 Identities = 57/122 (46%), Positives = 77/122 (63%), Gaps = 3/122 (2%)

Query: 9   STCDLYDKKKCSNCYKNICCLILYPTGDAAAEDSRFKIDKEVKQLWDEVELPPLEAILLH 68
           + CD+Y  +KC+NC +N+C L+LYP G+   E +R  +D EVK LWD + LPPLE I+  
Sbjct: 486 AVCDIYATRKCANCRQNLCGLLLYPLGEEIYEQARMDLDAEVKGLWDSIVLPPLEDIIKQ 545

Query: 69  FNSANLLTQQKPEAGTN--NQAKQRGQEKRQRTRHLNKFRRIQNTHLFTADELRAFANGT 126
            + +  L++Q   A      Q K+R   KR  +R  +KFRRI NTHLFTA+ELRAF N  
Sbjct: 546 CDPSRSLSRQNALAAAQEKTQLKRRADAKRV-SRFRSKFRRIHNTHLFTAEELRAFTNEQ 604

Query: 127 NP 128
            P
Sbjct: 605 AP 606


>gi|156099202|ref|XP_001615603.1| hypothetical protein [Plasmodium vivax SaI-1]
 gi|148804477|gb|EDL45876.1| hypothetical protein, conserved [Plasmodium vivax]
          Length = 548

 Score = 78.2 bits (191), Expect = 3e-13,   Method: Composition-based stats.
 Identities = 39/112 (34%), Positives = 63/112 (56%), Gaps = 5/112 (4%)

Query: 9   STCDLYDKKKCSNCYKNICCLILYPTGDAAAEDSRFKIDKEVKQLWDEVELPPLEAILLH 68
           S CD+Y K KC NC+ N+   I +P      E  R+ I  ++K LWDE+ LP L++IL  
Sbjct: 435 SFCDIYAKNKCDNCFYNLKGYIFFPLSYEHIEQQRYSITNDIKNLWDEITLPNLDSILKE 494

Query: 69  FNSANLLTQQKPEAGTNNQAKQRGQEKRQRTRHLNKFRRIQNTHLFTADELR 120
           +       +   +   ++ A +R  ++ + +  + K +RI NTHLFTADE++
Sbjct: 495 YK-----LKSTNKVFVHDNAPKRKNKEDKFSPVVKKMKRIYNTHLFTADEIK 541


>gi|221058853|ref|XP_002260072.1| hypothetical protein, conserved in Plasmodium species [Plasmodium
           knowlesi strain H]
 gi|193810145|emb|CAQ41339.1| hypothetical protein, conserved in Plasmodium species [Plasmodium
           knowlesi strain H]
          Length = 551

 Score = 77.8 bits (190), Expect = 5e-13,   Method: Composition-based stats.
 Identities = 40/112 (35%), Positives = 61/112 (54%), Gaps = 5/112 (4%)

Query: 9   STCDLYDKKKCSNCYKNICCLILYPTGDAAAEDSRFKIDKEVKQLWDEVELPPLEAILLH 68
           S CD+Y K KC NC+ N+   I +P      E  R+ I  ++K LWDE+ LP L+ IL  
Sbjct: 438 SFCDIYAKNKCENCFYNLKGYIFFPLSYEHIEQQRYSITNDIKNLWDEITLPNLDDILKE 497

Query: 69  FNSANLLTQQKPEAGTNNQAKQRGQEKRQRTRHLNKFRRIQNTHLFTADELR 120
           +       +   +   N+ A +R  ++ +    + K +RI NTHLFTADE++
Sbjct: 498 YK-----LKSTNKVFVNDNAPKRKNKEDKFAPVVKKMKRIYNTHLFTADEIK 544


>gi|82915322|ref|XP_729049.1| hypothetical protein [Plasmodium yoelii yoelii str. 17XNL]
 gi|23485871|gb|EAA20614.1| hypothetical protein [Plasmodium yoelii yoelii]
          Length = 2329

 Score = 76.6 bits (187), Expect = 1e-12,   Method: Composition-based stats.
 Identities = 39/110 (35%), Positives = 61/110 (55%), Gaps = 5/110 (4%)

Query: 11   CDLYDKKKCSNCYKNICCLILYPTGDAAAEDSRFKIDKEVKQLWDEVELPPLEAILLHFN 70
            CD+Y K KC NC+ N+   I +P      E+ R+ I  ++K LWD + LP L+ IL  + 
Sbjct: 2218 CDIYSKNKCDNCFYNLKGYIFFPLSYQHIENERYSITNDIKNLWDNINLPNLDNILKEYK 2277

Query: 71   SANLLTQQKPEAGTNNQAKQRGQEKRQRTRHLNKFRRIQNTHLFTADELR 120
                  +   +  +N+ A +R  +  + T  + K +RI NTHLFTADE++
Sbjct: 2278 -----LKSTNKIFSNDNAPKRKNKDDKFTPIIKKMKRIYNTHLFTADEIK 2322


>gi|68066917|ref|XP_675430.1| hypothetical protein [Plasmodium berghei strain ANKA]
 gi|56494612|emb|CAH93580.1| hypothetical protein PB100065.00.0 [Plasmodium berghei]
          Length = 548

 Score = 76.3 bits (186), Expect = 1e-12,   Method: Composition-based stats.
 Identities = 39/110 (35%), Positives = 61/110 (55%), Gaps = 5/110 (4%)

Query: 11  CDLYDKKKCSNCYKNICCLILYPTGDAAAEDSRFKIDKEVKQLWDEVELPPLEAILLHFN 70
           CD+Y K KC NC+ N+   I +P      E+ R+ I  ++K LWD + LP L+ IL  + 
Sbjct: 437 CDIYSKSKCDNCFYNLKGYIFFPLSYQHIENERYSITNDIKNLWDNITLPNLDNILKEYK 496

Query: 71  SANLLTQQKPEAGTNNQAKQRGQEKRQRTRHLNKFRRIQNTHLFTADELR 120
                 +   +  +N+ A +R  +  + T  + K +RI NTHLFTADE++
Sbjct: 497 -----LKSTNKIFSNDNAPKRKNKDDKFTPIIKKMKRIYNTHLFTADEIK 541


>gi|296005430|ref|XP_002809037.1| conserved Plasmodium protein, unknown function [Plasmodium
           falciparum 3D7]
 gi|225631979|emb|CAX64318.1| conserved Plasmodium protein, unknown function [Plasmodium
           falciparum 3D7]
          Length = 542

 Score = 70.1 bits (170), Expect = 1e-10,   Method: Composition-based stats.
 Identities = 37/112 (33%), Positives = 58/112 (51%), Gaps = 5/112 (4%)

Query: 9   STCDLYDKKKCSNCYKNICCLILYPTGDAAAEDSRFKIDKEVKQLWDEVELPPLEAILLH 68
           + CD+Y K KC NC+ N+   + +P      E  R+ I  ++K LWDE+ LP L+ IL  
Sbjct: 429 NVCDIYAKTKCDNCFYNLKGYLFFPLSYEHIEKERYSITNDMKMLWDEISLPSLDNILKE 488

Query: 69  FNSANLLTQQKPEAGTNNQAKQRGQEKRQRTRHLNKFRRIQNTHLFTADELR 120
           +       +       N    ++  +  + +  L K +RI NTHLFTADE++
Sbjct: 489 YK-----LKSTNNVFMNENLPKKKNKDDKLSPILKKMKRIYNTHLFTADEIK 535


>gi|156083681|ref|XP_001609324.1| hypothetical protein [Babesia bovis T2Bo]
 gi|154796575|gb|EDO05756.1| hypothetical protein BBOV_IV001590 [Babesia bovis]
          Length = 315

 Score = 66.6 bits (161), Expect = 1e-09,   Method: Compositional matrix adjust.
 Identities = 42/124 (33%), Positives = 57/124 (45%), Gaps = 18/124 (14%)

Query: 11  CDLYDKKKCSNCYKNICCLILYPTGDAAAEDSRFKIDKEVKQLWDEVELPPLEAILLHFN 70
           C LY   KC  C  N   ++L+P G    E  R     ++K LWD V LPP+E +L  +N
Sbjct: 189 CSLYTVTKCVECSTNFKDVLLFPLGKEQYEQDRLNFASDIKDLWDSVTLPPIEDLLKEYN 248

Query: 71  SANL-------------LTQQKPEAGTNNQAKQRGQEKRQRTRHLNKFRRIQNTHLFTAD 117
            +++             L      +GT      R Q  +Q    + K R+I NTHLFTA 
Sbjct: 249 MSHVERTFINVPRWVITLLWITDCSGTGG----RRQRNKQGIAGI-KMRKIYNTHLFTAQ 303

Query: 118 ELRA 121
           EL A
Sbjct: 304 ELSA 307


>gi|85001067|ref|XP_955252.1| hypothetical protein [Theileria annulata strain Ankara]
 gi|65303398|emb|CAI75776.1| hypothetical protein, conserved [Theileria annulata]
          Length = 1365

 Score = 66.2 bits (160), Expect = 1e-09,   Method: Composition-based stats.
 Identities = 30/87 (34%), Positives = 48/87 (55%), Gaps = 1/87 (1%)

Query: 11   CDLYDKKKCSNCYKNICCLILYPTGDAAAEDSRFKIDKEVKQLWDEVELPPLEAILLHFN 70
            C +Y   KC  CY N+  LIL+P G    E  RFK+D ++K LWD V +P ++ +L  +N
Sbjct: 1201 CGVYSSSKCQECYSNVEGLILFPLGKDQFEHDRFKLDHDIKNLWDSVAIPSMDQLLRDYN 1260

Query: 71   -SANLLTQQKPEAGTNNQAKQRGQEKR 96
             S  ++T Q  +     + + +G E +
Sbjct: 1261 ISQTVITFQPVQTEKKKRKEAKGYESK 1287


>gi|71027883|ref|XP_763585.1| hypothetical protein [Theileria parva strain Muguga]
 gi|68350538|gb|EAN31302.1| hypothetical protein TP03_0557 [Theileria parva]
          Length = 447

 Score = 60.8 bits (146), Expect = 6e-08,   Method: Composition-based stats.
 Identities = 26/62 (41%), Positives = 36/62 (58%)

Query: 9   STCDLYDKKKCSNCYKNICCLILYPTGDAAAEDSRFKIDKEVKQLWDEVELPPLEAILLH 68
           S C +Y   KC  C+ N+  L+L+P G    E  RFKID +VK LWD V +P  + +L  
Sbjct: 323 SQCGIYSSSKCQECHSNLEGLVLFPLGKDQFERDRFKIDHDVKNLWDAVVIPSTDQLLRE 382

Query: 69  FN 70
           +N
Sbjct: 383 YN 384


>gi|70954438|ref|XP_746266.1| hypothetical protein [Plasmodium chabaudi chabaudi]
 gi|56526815|emb|CAH77114.1| hypothetical protein PC103304.00.0 [Plasmodium chabaudi chabaudi]
          Length = 207

 Score = 57.8 bits (138), Expect = 5e-07,   Method: Compositional matrix adjust.
 Identities = 23/56 (41%), Positives = 33/56 (58%)

Query: 11  CDLYDKKKCSNCYKNICCLILYPTGDAAAEDSRFKIDKEVKQLWDEVELPPLEAIL 66
           CD+Y K KC NC+ N+   I +P      E+ R+ I  ++K LWD + LP L+ IL
Sbjct: 113 CDIYSKSKCDNCFYNLKGYIFFPLSYQHIENERYSITNDIKNLWDNITLPNLDNIL 168


>gi|310794771|gb|EFQ30232.1| transcription initiation factor IIE subunit beta [Glomerella
           graminicola M1.001]
          Length = 295

 Score = 35.4 bits (80), Expect = 3.2,   Method: Compositional matrix adjust.
 Identities = 19/69 (27%), Positives = 37/69 (53%), Gaps = 4/69 (5%)

Query: 45  KIDKEVKQLWDEVELPPLEAILLHFNSANLLTQQKPEAGTNNQAKQRGQEKRQRTRHLNK 104
           ++D E K++W+ VE+P  + I+    S      QKP +   +  K+   +K+Q+ R + +
Sbjct: 218 QVDDEFKRMWNAVEVPTTDDIVKKLVSVG----QKPASADPSTIKKVDSKKQQKKRAVRR 273

Query: 105 FRRIQNTHL 113
             +  NTH+
Sbjct: 274 TGKTTNTHM 282


>gi|328947395|ref|YP_004364732.1| alpha amylase catalytic region [Treponema succinifaciens DSM 2489]
 gi|328447719|gb|AEB13435.1| alpha amylase catalytic region [Treponema succinifaciens DSM 2489]
          Length = 1214

 Score = 34.7 bits (78), Expect = 4.9,   Method: Composition-based stats.
 Identities = 23/83 (27%), Positives = 36/83 (43%), Gaps = 9/83 (10%)

Query: 11  CDLYDKKKCSNCYKNICCLILYPTGDAAAEDSRFKIDKEVKQLWDEVELPPLEAILLHFN 70
           C +Y + K  +C+KN+         D     S+ KIDK + Q  +  E PP+       +
Sbjct: 79  CMIYRRDKAPDCFKNL-------LSDLDRNFSKVKIDKLLLQFME--EFPPVNVFKKEIS 129

Query: 71  SANLLTQQKPEAGTNNQAKQRGQ 93
           +   L Q   +AGT  +   R Q
Sbjct: 130 AEEFLEQNCIDAGTKMKRSNREQ 152


  Database: All non-redundant GenBank CDS
  translations+PDB+SwissProt+PIR+PRF excluding environmental samples
  from WGS projects
    Posted date:  Jul 22, 2011  4:42 PM
  Number of letters in database: 5,058,227,080
  Number of sequences in database:  14,777,732
  
Lambda     K      H
   0.317    0.132    0.398 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 14777732
Number of Hits to DB: 1,346,317,846
Number of extensions: 48654432
Number of successful extensions: 120826
Number of sequences better than 10.0: 27
Number of HSP's gapped: 121942
Number of HSP's successfully gapped: 27
Length of query: 131
Length of database: 5,058,227,080
Length adjustment: 96
Effective length of query: 35
Effective length of database: 3,639,564,808
Effective search space: 127384768280
Effective search space used: 127384768280
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 76 (33.9 bits)
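
Note on the statistics above: the footer values can be cross-checked with the standard Karlin-Altschul relations. The sketch below is illustrative and is not part of the BLAST output. It assumes that the effective lengths are obtained by subtracting the length adjustment once from the query and once per database sequence, and that the gapped Lambda and K printed above apply to the top hit's raw score (290). Because Lambda and K are rounded in the report and the hit used compositional matrix adjustment, the recomputed bit score and E-value should only be expected to agree approximately. All function and variable names are hypothetical.

```python
import math

# Values copied from the statistics footer of this report.
QUERY_LEN = 131
DB_LETTERS = 5_058_227_080
DB_SEQS = 14_777_732
LENGTH_ADJUSTMENT = 96
GAPPED_LAMBDA = 0.267   # rounded in the report
GAPPED_K = 0.0410       # rounded in the report


def effective_search_space(query_len, db_letters, db_seqs, adjustment):
    """Return (effective query length, effective db length, search space).

    Assumption: the adjustment is subtracted once from the query and
    once for every database sequence.
    """
    eff_query = query_len - adjustment
    eff_db = db_letters - db_seqs * adjustment
    return eff_query, eff_db, eff_query * eff_db


def bit_score(raw_score, lam, k):
    """Convert a raw score to bits: S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(bits, search_space):
    """E = (effective search space) * 2^(-S')."""
    return search_space * 2.0 ** (-bits)


eff_query, eff_db, space = effective_search_space(
    QUERY_LEN, DB_LETTERS, DB_SEQS, LENGTH_ADJUSTMENT
)
top_hit_bits = bit_score(290, GAPPED_LAMBDA, GAPPED_K)
top_hit_e = expect_value(top_hit_bits, space)

print(eff_query, eff_db, space)
print(f"{top_hit_bits:.1f} bits, E ~ {top_hit_e:.1e}")
```

The three integers printed first can be compared directly with the "Effective length of query", "Effective length of database", and "Effective search space" lines; the recomputed bit score and E-value can be compared with the first entry of the hit table.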