BLASTP 2.2.30+

Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.

Reference for composition-based statistics: Alejandro A. Schaffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.

Database: All non-redundant GenBank CDS
translations+PDB+SwissProt+PIR+PRF excluding environmental samples
from WGS projects
           49,011,213 sequences; 17,563,301,199 total letters

Query= Contig-29_CDS_annotation_glimmer3.pl_2_1

Length=218
                                                                     Score    E
Sequences producing significant alignments:                         (Bits)  Value

gi|575096056|emb|CDL66947.1|  unnamed protein product                364    4e-120
gi|575094544|emb|CDL65904.1|  unnamed protein product                351    3e-115
gi|575094492|emb|CDL65859.1|  unnamed protein product                341    3e-111
gi|575094572|emb|CDL65928.1|  unnamed protein product                334    1e-108
gi|575094496|emb|CDL65862.1|  unnamed protein product                315    8e-101
gi|575094431|emb|CDL65804.1|  unnamed protein product                264    3e-81
gi|575094415|emb|CDL65790.1|  unnamed protein product                251    4e-76
gi|444297919|dbj|GAC77754.1|  major capsid protein                   234    6e-73
gi|530695385|gb|AGT39938.1|  major capsid protein                    238    6e-72
gi|530695351|gb|AGT39907.1|  major capsid protein                    234    5e-70

>gi|575096056|emb|CDL66947.1| unnamed protein product [uncultured bacterium]
Length=570

 Score = 364 bits (935), Expect = 4e-120, Method: Compositional matrix adjust.
 Identities = 168/219 (77%), Positives = 191/219 (87%), Gaps = 1/219 (0%)

Query  1    VTSPDARLQRPEYLGGNRIPIVISEINQTSGT-SANSTPQGNPSGQSRTTDVHSDFKKSF  59
            VTSPDARLQR EYLGGNRIPI I+++ Q SGT SA++TPQG   G S+TTD HSDF KSF
Sbjct  352  VTSPDARLQRSEYLGGNRIPININQVIQQSGTGSASTTPQGTVVGMSQTTDTHSDFTKSF  411

Query  60   VEHGFIIGVMVARYDHTYQQGLERFWSRKGRLDYYWPVFANIGEQAVLNKEIYAQGNGTD  119
             EHGFIIGVM ARYDHTYQQG++R WSRK + DYYWPVF+NIGEQA+ NKEIYAQGN TD
Sbjct  412  TEHGFIIGVMCARYDHTYQQGIDRMWSRKDKFDYYWPVFSNIGEQAIKNKEIYAQGNATD  471

Query  120  DEVFGYQEAWADYRYKPNRVTGEMRSQAPQSLDVWHLGDDYSKLPSLSDSWVQEDSAVVN  179
            DEVFGYQEAWA+YRYKP+RVTGEMRS   QSLDVWHL DDYSKLPSLSD W++ED+  +N
Sbjct  472  DEVFGYQEAWAEYRYKPSRVTGEMRSSYAQSLDVWHLADDYSKLPSLSDEWIREDAKTLN  531

Query  180  RVIAVSEENSNQLWADIFIKNKCTRAMPMYSIPGLIDHH  218
            RV+AVS++NSNQ +ADI++KN CTR MPMYSIPGLIDHH
Sbjct  532  RVLAVSDQNSNQFFADIYVKNLCTRPMPMYSIPGLIDHH  570

>gi|575094544|emb|CDL65904.1| unnamed protein product [uncultured bacterium]
Length=551

 Score = 351 bits (901), Expect = 3e-115, Method: Compositional matrix adjust.
 Identities = 164/217 (76%), Positives = 188/217 (87%), Gaps = 1/217 (0%)

Query  1    VTSPDARLQRPEYLGGNRIPIVISEINQTSGTSANSTPQGNPSGQSRTTDVHSDFKKSFV  60
            VTSPDARLQRPEYLGGNRIPI I+++ Q S T++ S PQGNP GQS TTD ++DF KSFV
Sbjct  335  VTSPDARLQRPEYLGGNRIPININQVLQQSETTSTS-PQGNPVGQSLTTDTNADFVKSFV  393

Query  61   EHGFIIGVMVARYDHTYQQGLERFWSRKGRLDYYWPVFANIGEQAVLNKEIYAQGNGTDD  120
            EHGF+IG+MVARYDHTYQQGLERFWSRK R DYYWPVFA+IGEQAVLNKEIY  G   DD
Sbjct  394  EHGFVIGLMVARYDHTYQQGLERFWSRKDRFDYYWPVFAHIGEQAVLNKEIYTSGTAVDD  453

Query  121  EVFGYQEAWADYRYKPNRVTGEMRSQAPQSLDVWHLGDDYSKLPSLSDSWVQEDSAVVNR  180
            EVFGYQEA+ADYRYKP+RVTGEMRS APQSLDVWHL DDY+ LPSLSDSW++E ++ V+R
Sbjct  454  EVFGYQEAYADYRYKPSRVTGEMRSAAPQSLDVWHLADDYASLPSLSDSWIRESASTVDR  513

Query  181  VIAVSEENSNQLWADIFIKNKCTRAMPMYSIPGLIDH  217
            V+AVS   S QL+ DI+I+N+ TR MPMYS+PGLIDH
Sbjct  514  VLAVSSNVSAQLFCDIYIQNRSTRPMPMYSVPGLIDH  550

>gi|575094492|emb|CDL65859.1| unnamed protein product [uncultured bacterium]
Length=551

 Score = 341 bits (874), Expect = 3e-111, Method: Compositional matrix adjust.
 Identities = 160/218 (73%), Positives = 184/218 (84%), Gaps = 2/218 (1%)

Query  1    VTSPDARLQRPEYLGGNRIPIVISEINQTSGTSANSTPQGNPSGQSRTTDVHSDFKKSFV  60
            VTSPDARLQRPEYLGG+R+PI I+++ Q+S T A  TPQGN +  S TTD HS+F KSFV
Sbjct  336  VTSPDARLQRPEYLGGSRVPININQVIQSSETGA--TPQGNAAAYSLTTDSHSEFTKSFV  393

Query  61   EHGFIIGVMVARYDHTYQQGLERFWSRKGRLDYYWPVFANIGEQAVLNKEIYAQGNGTDD  120
            EHGFIIG+MVARYDH+YQQGL+RFWSRK R DYYWPVFAN+GE AV NKEI+AQG   DD
Sbjct  394  EHGFIIGLMVARYDHSYQQGLQRFWSRKDRFDYYWPVFANLGEMAVKNKEIFAQGTDVDD  453

Query  121  EVFGYQEAWADYRYKPNRVTGEMRSQAPQSLDVWHLGDDYSKLPSLSDSWVQEDSAVVNR  180
            EVFGYQEAWADYRYKP+ VTGEMRSQ  QSLD+WHL DDY  LPSLSDSW++EDS+ VNR
Sbjct  454  EVFGYQEAWADYRYKPSVVTGEMRSQYAQSLDIWHLADDYENLPSLSDSWIREDSSTVNR  513

Query  181  VIAVSEENSNQLWADIFIKNKCTRAMPMYSIPGLIDHH  218
            V+AVS+  S QL+ DI+I+   TR MP+YSIPGLIDHH
Sbjct  514  VLAVSDSVSAQLFCDIYIRCLATRPMPLYSIPGLIDHH  551

>gi|575094572|emb|CDL65928.1| unnamed protein product [uncultured bacterium]
Length=556

 Score = 334 bits (857), Expect = 1e-108, Method: Compositional matrix adjust.
 Identities = 160/218 (73%), Positives = 183/218 (84%), Gaps = 1/218 (0%)

Query  1    VTSPDARLQRPEYLGGNRIPIVISEINQTSGTSANSTPQGNPSGQSRTTDVHSDFKKSFV  60
            V SPD+RLQRPEYLGGNRIPI +++I Q S ++  S P G  +G S TTD +SDF KSFV
Sbjct  340  VVSPDSRLQRPEYLGGNRIPINVNQIIQQSQSTEQS-PLGALAGMSVTTDKNSDFIKSFV  398

Query  61   EHGFIIGVMVARYDHTYQQGLERFWSRKGRLDYYWPVFANIGEQAVLNKEIYAQGNGTDD  120
            EHG+IIG++VARYDHTYQQGL+R WSRK R D+YWPV ANIGEQAVLNKEIY  G+ TDD
Sbjct  399  EHGYIIGLVVARYDHTYQQGLDRMWSRKDRFDFYWPVLANIGEQAVLNKEIYIDGSDTDD  458

Query  121  EVFGYQEAWADYRYKPNRVTGEMRSQAPQSLDVWHLGDDYSKLPSLSDSWVQEDSAVVNR  180
            EVFGYQEAWA+YRYKPNRV GEMRS APQSLDVWHLGDDYS LP LSDSW++ED   V+R
Sbjct  459  EVFGYQEAWAEYRYKPNRVCGEMRSSAPQSLDVWHLGDDYSSLPYLSDSWIREDKTNVDR  518

Query  181  VIAVSEENSNQLWADIFIKNKCTRAMPMYSIPGLIDHH  218
            V+AV+   S+QL+ADI+I NK TR MPMYSIPGLIDHH
Sbjct  519  VLAVTSSVSDQLFADIYICNKATRPMPMYSIPGLIDHH  556

>gi|575094496|emb|CDL65862.1| unnamed protein product [uncultured bacterium]
Length=568

 Score = 315 bits (806), Expect = 8e-101, Method: Compositional matrix adjust.
 Identities = 148/218 (68%), Positives = 173/218 (79%), Gaps = 1/218 (0%)

Query  1    VTSPDARLQRPEYLGGNRIPIVISEINQTSGTSANSTPQGNPSGQSRTTDVHSDFKKSFV  60
            VT  DAR+Q PEYLGGNRIPI I+++ QTS TS + +PQGN +GQS T+D H DF KSF
Sbjct  352  VTPLDARMQVPEYLGGNRIPININQVVQTSQTS-DVSPQGNVAGQSLTSDSHGDFIKSFT  410

Query  61   EHGFIIGVMVARYDHTYQQGLERFWSRKGRLDYYWPVFANIGEQAVLNKEIYAQGNGTDD  120
            EHG +IGV VARYDHTYQQG+ + WSRK R DYYWPV ANIGEQAVLNKEIYAQG   D+
Sbjct  411  EHGMLIGVAVARYDHTYQQGVSKLWSRKTRFDYYWPVLANIGEQAVLNKEIYAQGTAQDE  470

Query  121  EVFGYQEAWADYRYKPNRVTGEMRSQAPQSLDVWHLGDDYSKLPSLSDSWVQEDSAVVNR  180
            EVFGYQEAWA+YRYKP+ VTGEMRS A  SLD WH  DDY+ LP LS  W++ED   ++R
Sbjct  471  EVFGYQEAWAEYRYKPSIVTGEMRSSARTSLDSWHFADDYNSLPKLSADWIKEDKTNIDR  530

Query  181  VIAVSEENSNQLWADIFIKNKCTRAMPMYSIPGLIDHH  218
            V+AVS   SNQ +AD +I+N+ TRA+P YSIPGLIDHH
Sbjct  531  VLAVSSSVSNQYFADFYIENETTRALPFYSIPGLIDHH  568

>gi|575094431|emb|CDL65804.1| unnamed protein product [uncultured bacterium]
Length=560

 Score = 264 bits (674), Expect = 3e-81, Method: Compositional matrix adjust.
 Identities = 129/218 (59%), Positives = 154/218 (71%), Gaps = 3/218 (1%)

Query  1    VTSPDARLQRPEYLGGNRIPIVISEINQTSGTSANSTPQGNPSGQSRTTDVHSDFKKSFV  60
            VT+ DAR+Q PEYLGG ++PI +S++ QTS  S +++PQGN +  S T    S F KSF
Sbjct  346  VTTSDARMQIPEYLGGCKVPINVSQVVQTSA-STDASPQGNTAAISVTPFSKSMFTKSFD  404

Query  61   EHGFIIGVMVARYDHTYQQGLERFWSRKGRLDYYWPVFANIGEQAVLNKEIYAQGNGTDD  120
            EHGFIIGV  AR   +YQQG+ER WSRK RLDYY+PV ANIGEQA+LNKEIYAQGN  DD
Sbjct  405  EHGFIIGVATARTAQSYQQGIERMWSRKDRLDYYFPVLANIGEQAILNKEIYAQGNAKDD  464

Query  121  EVFGYQEAWADYRYKPNRVTGEMRSQAPQSLDVWHLGDDYSKLPSLSDSWVQEDSAVVNR  180
            E FGYQEAWADYRYKPN + G  RS A QSLD WH G DY KLP+LS  W+++    + R
Sbjct  465  EAFGYQEAWADYRYKPNTICGRFRSNAQQSLDAWHYGQDYDKLPTLSTDWMEQSDIEMKR  524

Query  181  VIAVSEENSNQLWADIFIKNKCTRAMPMYSIPGLIDHH  218
             +AV  E      A+     K  R MP+YSIPGLIDH+
Sbjct  525  TLAVQTE--PDFIANFRFNCKTVRVMPLYSIPGLIDHN  560

>gi|575094415|emb|CDL65790.1| unnamed protein product [uncultured bacterium]
Length=569

 Score = 251 bits (640), Expect = 4e-76, Method: Compositional matrix adjust.
 Identities = 117/214 (55%), Positives = 150/214 (70%), Gaps = 1/214 (0%)

Query  1    VTSPDARLQRPEYLGGNRIPIVISEINQTSGTSANSTPQGNPSGQSRTTDVHSDFKKSFV  60
            V+SPDARLQR EY+GG RIPI +S++ Q+S +   S PQGN +  S TT  ++    S V
Sbjct  354  VSSPDARLQRSEYIGGERIPINVSQVIQSSASDTTS-PQGNAAAYSLTTSANTIRAYSAV  412

Query  61   EHGFIIGVMVARYDHTYQQGLERFWSRKGRLDYYWPVFANIGEQAVLNKEIYAQGNGTDD  120
            EHG+I+G+   R DH+YQQGL R W+R  R  YY P+ AN+GEQAVLN+EIYAQG   D
Sbjct  413  EHGYILGLAAIRVDHSYQQGLSRMWTRSDRFSYYHPMLANLGEQAVLNQEIYAQGTTADT  472

Query  121  EVFGYQEAWADYRYKPNRVTGEMRSQAPQSLDVWHLGDDYSKLPSLSDSWVQEDSAVVNR  180
            EVFGYQEAWADYRY+ N +TGEMRS   QSLD WH GD Y+ LP LS+ W++E    ++R
Sbjct  473  EVFGYQEAWADYRYRTNMITGEMRSTYAQSLDAWHYGDKYTDLPRLSNDWIKEGQENIDR  532

Query  181  VIAVSEENSNQLWADIFIKNKCTRAMPMYSIPGL  214
             +AV  ENS+Q   +++      R MP+YS+PGL
Sbjct  533  TLAVQSENSHQFICNLYFDQTWVRPMPIYSVPGL  566

>gi|444297919|dbj|GAC77754.1| major capsid protein [uncultured marine virus]
Length=283

 Score = 234 bits (597), Expect = 6e-73, Method: Compositional matrix adjust.
 Identities = 115/217 (53%), Positives = 147/217 (68%), Gaps = 3/217 (1%)

Query  1    VTSPDARLQRPEYLGGNRIPIVISEINQTSGTSANSTPQGNPSGQSRTTDVHSDFKKSFV  60
            V SPD+RLQRPEYLGG    + I  I QTS + A  T QG  +     +     F KSFV
Sbjct  69   VESPDSRLQRPEYLGGGSSLVQILPIAQTSQSEATGTEQGKLTAVGYHSQSGLGFTKSFV  128

Query  61   EHGFIIGVMVARYDHTYQQGLERFWSRKGRLDYYWPVFANIGEQAVLNKEIYAQGNGTDD  120
            EH  IIG++  R D TYQQG++R WSRK + D+YWP  AN+GEQ VLNKEI+ Q    DD
Sbjct  129  EHCVIIGLVNVRADLTYQQGMDRMWSRKTKYDFYWPALANLGEQTVLNKEIFTQAIAADD  188

Query  121  EVFGYQEAWADYRYKPNRVTGEMRSQAPQSLDVWHLGDDYSKLPSLSDSWVQEDSAVVNR  180
            EVFGYQE WA+YRY P+R+TG +RS A  SLD+WHL  D+  LP+L++S++QE+   V+R
Sbjct  189  EVFGYQERWAEYRYFPSRITGVLRSDAAASLDLWHLSQDFGSLPALNESFIQENPP-VDR  247

Query  181  VIAVSEENSNQLWADIFIKNKCTRAMPMYSIPGLIDH  217
            V+AV++E   +   D +     TR MPMYS+PGLIDH
Sbjct  248  VVAVTDE--PEFIFDSYFDLITTRPMPMYSVPGLIDH  282

>gi|530695385|gb|AGT39938.1| major capsid protein [Marine gokushovirus]
Length=514

 Score = 238 bits (608), Expect = 6e-72, Method: Compositional matrix adjust.
 Identities = 121/217 (56%), Positives = 150/217 (69%), Gaps = 4/217 (2%)

Query  1    VTSPDARLQRPEYLGGNRIPIVISEINQTSGTSANSTPQGNPSGQSRTTDVHSDFKKSFV  60
            VTSPDARLQRPEYLGG +  I I+ I QTS T A +TPQGN SG   T      F KSF
Sbjct  301  VTSPDARLQRPEYLGGGKDRININPIAQTSSTDA-TTPQGNLSGYGTTGFTGHRFNKSFT  359

Query  61   EHGFIIGVMVARYDHTYQQGLERFWSRKGRLDYYWPVFANIGEQAVLNKEIYAQGNGTDD  120
            EH  ++G+     D TYQQGL R +SR+ R D+YWP  A++GEQAVLNKEIYAQG   D+
Sbjct  360  EHSVVLGLACVFADLTYQQGLPRHFSRQTRWDFYWPALAHLGEQAVLNKEIYAQGTTDDN  419

Query  121  EVFGYQEAWADYRYKPNRVTGEMRSQAPQSLDVWHLGDDYSKLPSLSDSWVQEDSAVVNR  180
             VFGYQE +A+YRYKP+ +TG+MRS   QSLD+WHL  D+  LP L+ S+++E+   V+R
Sbjct  420  NVFGYQERYAEYRYKPSSITGQMRSNFAQSLDIWHLAQDFGSLPVLNSSFIEENPP-VDR  478

Query  181  VIAVSEENSNQLWADIFIKNKCTRAMPMYSIPGLIDH  217
            V AV  +N   L  D++ K KC R MP Y +PGLIDH
Sbjct  479  VTAV--QNYPNLILDMYFKLKCARPMPTYGVPGLIDH  513

>gi|530695351|gb|AGT39907.1| major capsid protein [Marine gokushovirus]
Length=539

 Score = 234 bits (597), Expect = 5e-70, Method: Compositional matrix adjust.
 Identities = 112/220 (51%), Positives = 146/220 (66%), Gaps = 4/220 (2%)

Query  1    VTSPDARLQRPEYLGGNRIPIVISEINQ--TSGTSANSTPQGNPSGQSRTTDVHSDFKKS  58
            V SPDAR+QRPEYLGG   PI+++ + Q   SG S   TP G              F  S
Sbjct  320  VISPDARMQRPEYLGGGSAPIIVNPVAQQSASGASGTDTPLGTLGAVGTGLASGHGFASS  379

Query  59   FVEHGFIIGVMVARYDHTYQQGLERFWSRKGRLDYYWPVFANIGEQAVLNKEIYAQGNGT  118
            F EHG ++G+   R D TYQQGL R +SR  R D+++PVF+++GEQ +LNKE+YA G  T
Sbjct  380  FTEHGVVVGLCSVRADLTYQQGLHRMFSRSTRYDFFFPVFSHLGEQPILNKELYATGTST  439

Query  119  DDEVFGYQEAWADYRYKPNRVTGEMRSQAPQSLDVWHLGDDYSKLPSLSDSWVQEDSAVV  178
            DD+VFGYQEAWA+YRYKP++VTG MRS A  +LD WHL  ++  LP+L+ +++ ED+  V
Sbjct  440  DDDVFGYQEAWAEYRYKPSQVTGLMRSTAAGTLDAWHLAQNFGSLPTLNSTFI-EDTPPV  498

Query  179  NRVIAV-SEENSNQLWADIFIKNKCTRAMPMYSIPGLIDH  217
            +RV+AV SE N  Q   D F      R MPMYS+PGL+DH
Sbjct  499  DRVVAVGSEANGQQFIFDAFFDINMARPMPMYSVPGLVDH  538

Lambda      K        H        a         alpha
   0.316    0.133    0.408    0.792     4.96

Gapped
Lambda      K        H        a         alpha    sigma
   0.267   0.0410    0.140     1.90     42.6     43.6

Effective search space used: 805881880428