BLASTP 2.2.30+

Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A. Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402.

Reference for composition-based statistics: Alejandro A. Schaffer, L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001), "Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements", Nucleic Acids Res. 29:2994-3005.

Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF excluding environmental samples from WGS projects
           49,011,213 sequences; 17,563,301,199 total letters

Query= Contig-24_CDS_annotation_glimmer3.pl_2_1

Length=647

                                                                  Score     E
Sequences producing significant alignments:                       (Bits)  Value

gi|547312923|ref|WP_022044635.1| putative uncharacterized protein  63.5    2e-07
gi|639237429|ref|WP_024568106.1| hypothetical protein              59.3    5e-06
gi|444298000|dbj|GAC77839.1| major capsid protein                  59.3    6e-06
gi|649569140|gb|KDS75238.1| capsid family protein                  58.2    9e-06
gi|649555287|gb|KDS61824.1| capsid family protein                  58.5    9e-06
gi|649557305|gb|KDS63784.1| capsid family protein                  56.2    2e-05
gi|492501782|ref|WP_005867318.1| hypothetical protein              57.4    2e-05
gi|444298142|dbj|GAC77768.1| major capsid protein                  56.2    3e-05
gi|609718276|emb|CDN73650.1| conserved hypothetical protein        55.1    1e-04
gi|599088023|gb|AHN52937.1| major capsid protein                   50.4    0.001

>gi|547312923|ref|WP_022044635.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
 gi|524208404|emb|CCZ76639.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
Length=338

 Score = 63.5 bits (153), Expect = 2e-07, Method: Compositional matrix adjust.
 Identities = 74/271 (27%), Positives = 110/271 (41%), Gaps = 53/271 (20%)

Query  357  CPSSPDRFSRLMPPGDSNS---------DVDF-TGVK-TIPQLAVATRLQEYKDLIGASG  405
            P SPD F ++ G S + D++ TG +P+L + T++Q + D + SG
Sbjct  15   VPYSPDLFGNIIKQGSSPAVEIEVMNALDLNISTGFSVAVPELRLRTKIQNWMDRLFVSG  74

Query  406  SRYSDWLYTFFASKIE--HVDRPKLLFSSSVMVNSQVVLNQAGQSGFEGGESAALGQMGG  463
            R D T + +K +V++P L +N V +A +G GE A LGQ+
Sbjct  75   GRVGDVFRTLWGTKSSAIYVNKPDFLGVWQASINPSNV--RAMANGSASGEDANLGQLAA  132

Query  464  SIS----FNTVLGREQTYYFKEPG--YIFDMMTIRPVYFWTGIRPDYLEYRGPDYFNPIY  517
            + F+ G + YY KEPG + M+ P Y G+ PD D FNP
Sbjct  133  CVDRYCDFSGHSGID--YYAKEPGTFMLITMLVPEPAYS-QGLHPDLASISFGDDFNPEL  189

Query  518  NDIGYQDVPLWRL-----GYG----------WKADTVSS-------LSVAKEPCYNEFRS  555
            N IG+Q VP R G+ W T + +SV +E ++ R+
Sbjct  190  NGIGFQLVPRHRFSMMPRGFNFTGLDQEASPWFGHTGTGVLVDPNMVSVGEEVAWSWLRT  249

Query  556  SYDEVLGSLQATLTPKASTPLQSYWVQQRDF  586
            Y + G A YWV R F
Sbjct  250  DYSRLHGDF-------AQNGNYQYWVLTRRF  273

>gi|639237429|ref|WP_024568106.1| hypothetical protein [Elizabethkingia anophelis]
Length=546

 Score = 59.3 bits (142), Expect = 5e-06, Method: Compositional matrix adjust.
 Identities = 69/272 (25%), Positives = 113/272 (42%), Gaps = 35/272 (13%)

Query  373  SNSDVDFTGVK--TIPQLAVATRLQEYKDLIGASGSRYSDWLYTFFASKIE--HVDRPKL  428
            SN VD TI L A +LQE+ + +GSRY++ + +FF K + RP+
Sbjct  286  SNLGVDLKTASGSTINDLRRAFKLQEWLEKNARAGSRYAESILSFFGVKTSDGRLQRPEF  345

Query  429  LFSSSVMVNSQVVLNQAGQ-----SGFEGGESAALGQMGGSISFNTVLGREQTYYFKEPG  483
            L + + VL Q+ G G ++G+ GG F F+E G
Sbjct  346  LGGNKTPILISEVLQQSSTDSTTPQGNMAGHGISVGKEGGFSKF-----------FEEHG  394

Query  484  YIFDMMTIRP-VYFWTGIRPDYLEYRGPDYFNPIYNDIGYQDVPLWRLGYGWKADTVSSL  542
            Y+ +M++ P + GI + ++ DYF P + IG Q V + D S
Sbjct  395  YVIGLMSVIPKTSYSQGIPRHFSKFDKFDYFWPQFEHIGEQPVYNKEIFAKNVGDYDSGG  454

Query  543  SVAKEPCYNEFRSSYDEVLGSLQATLTPKASTPLQSYWVQQRDFYLIGLSSNPNEVSPSM  602
            P Y+E++ S + G + TL +W R F SS P +++
Sbjct  455  VFGYVPRYSEYKYSPSTIHGDFKDTLY---------FWHLGRIFD----SSAPPKLNRDF  501

Query  603  LFTNLNTVNNPFA-SDMEDNFFVNMSYKVVVK  633
            + N + ++ FA D D F+ ++ K+ K
Sbjct  502  IEVNKSGLSRIFAVEDNSDKFYCHLYQKITAK  533

>gi|444298000|dbj|GAC77839.1| major capsid protein [uncultured marine virus]
Length=480

 Score = 59.3 bits (142), Expect = 6e-06, Method: Compositional matrix adjust.
 Identities = 57/276 (21%), Positives = 111/276 (40%), Gaps = 30/276 (11%)

Query  375  SDVDFTGVKTIPQLAVATRLQEYKDLIGASGSRYSDWL-YTFFASKIEHVDRPKLLFSSS  433
            +D+ TI + A +Q Y++ GSRY+++L Y K + RP+ + +
Sbjct  228  ADLQAATGGTINDIRRAFAIQRYQEARSRYGSRYTEYLRYLGVNPKDARLQRPEYMGGGT  287

Query  434  VMVNSQVVLNQAGQSGFEGGESAALGQMGGSISFNTVLGREQTY--YFKEPGYIFDMMTI  491
            +N VL + + G + + +G R Y Y +E GYI M+++
Sbjct  288  TQINFSEVLQTSPE--IPGEDQVSQFGVGDMYGHGIAAMRSNKYRRYIEEHGYIISMLSV  345

Query  492  RPVYFWT-GIRPDYLEYRGPDYFNPIYNDIGYQDVPLWRLGYGWKADTVSSLSVAKEPCY  550
            RP +T GI +L DY+ IG Q++ + + + + + Y
Sbjct  346  RPKTMYTNGIHRSWLRLTKEDYYQKELEHIGQQEIMNNEI---YADEGAGTETFGYNDRY  402

Query  551  NEFRSSYDEVLGSLQATLTPKASTPLQSYWVQQRDFYLIGLSSNPNEVSPSM--LFTNLN  608
            +E+R + V + L +YW R+F E P + F + +
Sbjct  403  SEYRETPSHVSAEFRGIL---------NYWHMAREF----------EAPPVLNQSFVDCD  443

Query  609  TVNNPFASDMEDNFFVNMSYKVVVKNLINKSFATRL  644
            +D ++ + +K+V + L++++ A R+
Sbjct  444  ATKRIHNEQTQDALWIMIQHKMVARRLLSRNAAPRI  479

>gi|649569140|gb|KDS75238.1| capsid family protein, partial [Parabacteroides distasonis str. 3999B T(B) 6]
Length=390

 Score = 58.2 bits (139), Expect = 9e-06, Method: Compositional matrix adjust.
 Identities = 67/274 (24%), Positives = 115/274 (42%), Gaps = 24/274 (9%)

Query  312  AHPYDVQEPKVDWNNGTGTDVNIPSKVYFSATLNVPFLAAHPMA---VCPSSPDRFS---  365
            A P+ + P+V G ++ + K F+A F P++ V S+P S
Sbjct  65   ALPWVQRGPEVTVPINGGGEIPVEMKEGFAAQKITTFPDRKPISGSEVLYSAPSVLSYGQ  124

Query  366  -------RLMPPGDSNSDVDFTGVKTIPQLAVATRLQEYKDLIGASGSRYSDWLYTFFA-  417
            L+ P + + D GV I + + LQ + + SGSRY + + + F
Sbjct  125  IGSIKGQALIEPDNFVVNTDQMGV-NINDIRTSNALQRWFERNARSGSRYIEQILSHFGV  183

Query  418  -SKIEHVDRPKLLFSSSVMVNSQVVLNQAGQSGFEGGESAALGQMGGSISFNTVLGREQT  476
            S + RP+ L ++ VL Q+ S G IS G T
Sbjct  184  RSSDARLQRPQFLGGGRTPISVSEVL----QTSSTDSTSPQANMAGHGISAGVNHGF--T  237

Query  477  YYFKEPGYIFDMMTIRP-VYFWTGIRPDYLEYRGPDYFNPIYNDIGYQDVPLWRLGYGWK  535
            YF+E GYI +M+IRP + G+ D+ ++ D++ P + +G Q++ L Y +
Sbjct  238  RYFEEHGYIMGIMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEEL-YLNE  296

Query  536  ADTVSSLSVAKEPCYNEFRSSYDEVLGSLQATLT  569
            +D + + P Y E++ S +EV G + +
Sbjct  297  SDAANEGTFGYTPRYAEYKYSQNEVHGDFRGNMA  330

>gi|649555287|gb|KDS61824.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
 gi|649560568|gb|KDS66876.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
 gi|649561020|gb|KDS67307.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
 gi|649562724|gb|KDS68908.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 6]
Length=541

 Score = 58.5 bits (140), Expect = 9e-06, Method: Compositional matrix adjust.
 Identities = 67/274 (24%), Positives = 115/274 (42%), Gaps = 24/274 (9%)

Query  312  AHPYDVQEPKVDWNNGTGTDVNIPSKVYFSATLNVPFLAAHPMA---VCPSSPDRFS---  365
            A P+ + P+V G ++ + K F+A F P++ V S+P S
Sbjct  216  ALPWVQRGPEVTVPINGGGEIPVEMKEGFAAQKITTFPDRKPISGSEVLYSAPSVLSYGQ  275

Query  366  -------RLMPPGDSNSDVDFTGVKTIPQLAVATRLQEYKDLIGASGSRYSDWLYTFFA-  417
            L+ P + + D GV I + + LQ + + SGSRY + + + F
Sbjct  276  IGSIKGQALIEPDNFVVNTDQMGV-NINDIRTSNALQRWFERNARSGSRYIEQILSHFGV  334

Query  418  -SKIEHVDRPKLLFSSSVMVNSQVVLNQAGQSGFEGGESAALGQMGGSISFNTVLGREQT  476
            S + RP+ L ++ VL Q+ S G IS G T
Sbjct  335  RSSDARLQRPQFLGGGRTPISVSEVL----QTSSTDSTSPQANMAGHGISAGVNHGF--T  388

Query  477  YYFKEPGYIFDMMTIRP-VYFWTGIRPDYLEYRGPDYFNPIYNDIGYQDVPLWRLGYGWK  535
            YF+E GYI +M+IRP + G+ D+ ++ D++ P + +G Q++ L Y +
Sbjct  389  RYFEEHGYIMGIMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEEL-YLNE  447

Query  536  ADTVSSLSVAKEPCYNEFRSSYDEVLGSLQATLT  569
            +D + + P Y E++ S +EV G + +
Sbjct  448  SDAANEGTFGYTPRYAEYKYSQNEVHGDFRGNMA  481

>gi|649557305|gb|KDS63784.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
 gi|649559156|gb|KDS65543.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 6]
Length=245

 Score = 56.2 bits (134), Expect = 2e-05, Method: Compositional matrix adjust.
 Identities = 48/188 (26%), Positives = 83/188 (44%), Gaps = 10/188 (5%)

Query  385  IPQLAVATRLQEYKDLIGASGSRYSDWLYTFFA--SKIEHVDRPKLLFSSSVMVNSQVVL  442
            I + + LQ + + SGSRY + + + F S + RP+ L ++ VL
Sbjct  5    INDIRTSNALQRWFERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSEVL  64

Query  443  NQAGQSGFEGGESAALGQMGGSISFNTVLGREQTYYFKEPGYIFDMMTIRP-VYFWTGIR  501
            Q+ S G IS G T YF+E GYI +M+IRP + G+
Sbjct  65   ----QTSSTDSTSPQANMAGHGISAGVNHGF--TRYFEEHGYIMGIMSIRPRTGYQQGVP  118

Query  502  PDYLEYRGPDYFNPIYNDIGYQDVPLWRLGYGWKADTVSSLSVAKEPCYNEFRSSYDEVL  561
            D+ ++ D++ P + +G Q++ L Y ++D + + P Y E++ S +EV
Sbjct  119  KDFRKFDNMDFYFPEFAHLGEQEIKNEEL-YLNESDAANEGTFGYTPRYAEYKYSQNEVH  177

Query  562  GSLQATLT  569
            G + +
Sbjct  178  GDFRGNMA  185

>gi|492501782|ref|WP_005867318.1| hypothetical protein [Parabacteroides distasonis]
 gi|409230408|gb|EKN23272.1| hypothetical protein HMPREF1059_03257 [Parabacteroides distasonis CL09T03C24]
Length=538

 Score = 57.4 bits (137), Expect = 2e-05, Method: Compositional matrix adjust.
 Identities = 66/276 (24%), Positives = 118/276 (43%), Gaps = 30/276 (11%)

Query  368  MPPGDSNSDVDFTGVKTIPQLAVATRLQEYKDLIGASGSRYSDWLYTFFA--SKIEHVDR  425
            + P + +VD GV +I L + LQ + + SGSRY + + + F S + R
Sbjct  282  LEPDNFQVNVDELGV-SINDLRTSNALQRWFERNARSGSRYIEQILSHFGVRSSDARLQR  340

Query  426  PKLLFSSSVMVNSQVVLNQAGQSGFEGGESAALGQMGGSISFNTVLGREQTYYFKEPGYI  485
            P+ L ++ VL Q+ S G IS G ++ YF+E GYI
Sbjct  341  PQFLGGGRTPISVSEVL----QTSATDSTSPQANMAGHGISAGVNHGFKR--YFEEHGYI  394

Query  486  FDMMTIRP-VYFWTGIRPDYLEYRGPDYFNPIYNDIGYQDVPLWRLGYGWKADTVSSLSV  544
            +M+IRP + G+ D+ ++ D++ P + +G Q++ + Y + ++ +
Sbjct  395  IGIMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEEV-YLQQTPASNNGTF  453

Query  545  AKEPCYNEFRSSYDEVLGSLQATLTPKASTPLQSYWVQQRDFYLIGLSSNPNEVSPSMLF  604
            P Y E++ S +EV G + + ++W R F S +PN + F
Sbjct  454  GYTPRYAEYKYSMNEVHGDFRGNM---------AFWHLNRIF-----SESPNL---NTTF  496

Query  605  TNLNTVNNPFAS--DMEDNFFVNMSYKVVVKNLINK  638
            N N FA+ +D +++ + V L+ K
Sbjct  497  VECNPSNRVFATAETSDDKYWIQLYQDVKALRLMPK  532

>gi|444298142|dbj|GAC77768.1| major capsid protein [uncultured marine virus]
Length=299

 Score = 56.2 bits (134), Expect = 3e-05, Method: Compositional matrix adjust.
 Identities = 41/153 (27%), Positives = 70/153 (46%), Gaps = 6/153 (4%)

Query  375  SDVDFTGVKTIPQLAVATRLQEYKDLIGASGSRYSDWL-YTFFASKIEHVDRPKLLFSSS  433
            +D+ G I L A LQ Y++ G+R++++L Y +S + RP+++ +
Sbjct  111  ADLSQAGAININDLREAFALQRYQEARNLYGARFTEYLRYLGISSSXGRLQRPEMISTGK  170

Query  434  VMVNSQVVLNQAGQSGFEGGESAALGQMGGSISFNTVLGREQTYYFKEPGYIFDMMTIRP  493
            +N VLN G SG + LG+MGG V Y+ +E G+I +M++RP
Sbjct  171  SNINFSEVLNTTGPSGV---DDHPLGEMGGH-GIAGVKSNRARYFCEEHGHIISLMSVRP  226

Query  494  -VYFWTGIRPDYLEYRGPDYFNPIYNDIGYQDV  525
            + T + DY+ IG ++V
Sbjct  227  KTIYMTTQHKQFDRESKEDYWQKELQAIGMEEV  259

>gi|609718276|emb|CDN73650.1| conserved hypothetical protein [Elizabethkingia anophelis]
Length=537

 Score = 55.1 bits (131), Expect = 1e-04, Method: Compositional matrix adjust.
 Identities = 64/262 (24%), Positives = 110/262 (42%), Gaps = 35/262 (13%)

Query  382  VKTIPQLAVATRLQEYKDLIGASGSRYSDWLYTFFASKIE--HVDRPKLLFSSSVMVNSQ  439
            V T+ L A +LQE+ + +GSRY++ + +FF K + RP+ L + +
Sbjct  288  VSTVNDLRRAFKLQEWLEKNARAGSRYAESILSFFGVKTSDGRLQRPEFLGGNKSPIMIS  347

Query  440  VVLNQAGQ-----SGFEGGESAALGQMGGSISFNTVLGREQTYYFKEPGYIFDMMTIRP-  493
            VL Q+ G G +G+ GG + +F+E GY+ +M++ P
Sbjct  348  EVLQQSATDSTTPQGNMAGHGIGIGKDGGF-----------SRFFEEHGYVIGLMSVIPK  396

Query  494  VYFWTGIRPDYLEYRGPDYFNPIYNDIGYQDVPLWRLGYGWKADTVSSLSV-AKEPCYNE  552
            + GI + + DYF P + IG Q V + + D S +V P Y+E
Sbjct  397  TSYSQGIPRHFSKSDKFDYFWPQFEHIGEQPVYNKEI-FAKNIDAFDSEAVFGYLPRYSE  455

Query  553  FRSSYDEVLGSLQATLTPKASTPLQSYWVQQRDFYLIGLSSNPNEVSPSMLFTNLNTVNN  612
            ++ S V G + L +W R F + P ++ S + + N ++
Sbjct  456  YKFSPSTVHGDFKDDLY---------FWHLGRIFD----TDKPPVLNQSFIECDKNALSR  502

Query  613  PFA-SDMEDNFFVNMSYKVVVK  633
            FA D D F+ ++ K+ K
Sbjct  503  IFAVEDDTDKFYCHLYQKITAK  524

>gi|599088023|gb|AHN52937.1| major capsid protein, partial [uncultured Gokushovirinae]
Length=213

 Score = 50.4 bits (119), Expect = 0.001, Method: Composition-based stats.
 Identities = 41/154 (27%), Positives = 68/154 (44%), Gaps = 11/154 (7%)

Query  375  SDVDFTGVKTIPQLAVATRLQEYKDLIGASGSRYSDWLYTFFASKIE--HVDRPKLLFSS  432
            +D+ TI L +A + Q + G+RY++ + F + + RP+ +
Sbjct  62   ADLGDATAATINDLRLAFQTQRLLERDARGGTRYNELIRAHFGVTVPDFRIQRPEYIGGG  121

Query  433  SVMVNSQVVLNQAGQSGFEGGESAALGQMGGSISFNTVLGREQTYYFKEPGYIFDMMTIR  492
            S MVN V N AGQSG G+ A+G + GS + TY E G I + +R
Sbjct  122  SSMVNVTPVANTAGQSGDYVGQLGAMGTVSGS--------HDWTYSAVEHGVIIGLANVR  173

Query  493  -PVYFWTGIRPDYLEYRGPDYFNPIYNDIGYQDV  525
            + + G+ + + D++ P+ IG Q V
Sbjct  174  GDITYSQGLERYWSKSTRYDFYYPVLAQIGEQAV  207

Lambda      K        H        a         alpha
   0.320    0.136    0.427    0.792     4.96

Gapped
Lambda      K        H        a         alpha    sigma
   0.267   0.0410    0.140     1.90     42.6     43.6

Effective search space used: 4903549086528