bitscore colors: <40, 40-50, 50-80, 80-200, >200
BLASTP 2.2.30+

Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A. Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402.

Reference for composition-based statistics: Alejandro A. Schaffer, L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001), "Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements", Nucleic Acids Res. 29:2994-3005.

Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF excluding environmental samples from WGS projects
           49,011,213 sequences; 17,563,301,199 total letters

Query= Contig-39_CDS_annotation_glimmer3.pl_2_1

Length=225

                                                                       Score     E
Sequences producing significant alignments:                           (Bits)  Value

gi|649557305|gb|KDS63784.1|       capsid family protein                47.4    0.002
gi|494610271|ref|WP_007368517.1|  capsid protein                       47.4    0.004
gi|649569140|gb|KDS75238.1|       capsid family protein                47.4    0.004
gi|649555287|gb|KDS61824.1|       capsid family protein                46.2    0.009
gi|444298000|dbj|GAC77839.1|      major capsid protein                 45.4    0.015
gi|547312923|ref|WP_022044635.1|  putative uncharacterized protein     43.5    0.058
gi|492501782|ref|WP_005867318.1|  hypothetical protein                 43.5    0.077
gi|444298142|dbj|GAC77768.1|      major capsid protein                 42.4    0.16
gi|663447907|emb|CDR46793.1|      CYFA0S26e00166g1_1                   42.4    0.17
gi|599088023|gb|AHN52937.1|       major capsid protein                 40.8    0.37

>gi|649557305|gb|KDS63784.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
 gi|649559156|gb|KDS65543.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 6]
Length=245

 Score = 47.4 bits (111),  Expect = 0.002, Method: Compositional matrix adjust.
 Identities = 27/95 (28%), Positives = 50/95 (53%), Gaps = 2/95 (2%)

Query  54   TYYFKEPGYIFDMMTIRP-VYFWTGIRPDYLEYRGPDYFNPIYNDIGYQDVPLWRLGYGW  112
            T YF+E GYI +M+IRP + G+ D+ ++ D++ P + +G Q++ L Y
Sbjct  92   TRYFEEHGYIMGIMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEEL-YLN  150

Query  113  KADTVSSLSVAKEPCYNEFRSSYDEVLGSLQATLT  147
            ++D + + P Y E++ S +EV G + +
Sbjct  151  ESDAANEGTFGYTPRYAEYKYSQNEVHGDFRGNMA  185

>gi|494610271|ref|WP_007368517.1| capsid protein [Prevotella multiformis]
 gi|324988543|gb|EGC20506.1| putative capsid protein (F protein) [Prevotella multiformis DSM 16608]
Length=531

 Score = 47.4 bits (111),  Expect = 0.004, Method: Compositional matrix adjust.
 Identities = 58/226 (26%), Positives = 96/226 (42%), Gaps = 40/226 (18%)

Query  15   NSQVVLNQAGQSGFEGG--ESAALGQMGGSISFNTVLGREQTYYFKEPGYIFDMMTIRPV  72
            N V+ QS F+ G ES LG +GG ++ + KE G I + ++ P
Sbjct  300  NPVVISEVVNQSEFDRGADESPCLGDLGGK-GVGSLNSSSIDFDVKEHGIIMCIYSVVPQ  358

Query  73   YFWTG--IRPDYLEYRGPDYFNPIYNDIGYQ------------DVPL-------WRLGYG  111
            + G P + R D+F P + D+GYQ D P+ RL G
Sbjct  359  TEYNGTYFDPFNRKLRREDFFQPEFADLGYQPVVTSDLISTYLDNPVPDGPEKQKRLAAG  418

Query  112  WKADTVSSLS--VAKEPCYNEFRSSYDEVLGSLQATLTPKASTPLQSYWVQQR-DFYLIG  168
            + ++ + + + + YNE+++S D V G ++ L+ SYW R DF G
Sbjct  419  YPLSSIEANNRLLGWQVRYNEYKTSRDLVFGEFESGLS-------LSYWCSPRYDFGFDG  471

Query  169  LSSNPNEV----SPSMLFTNLNTVNNPF--ASDMEDNFFVNMSYKV  208
            + + V SP+ + N + +N F ++ D+F VN + V
Sbjct  472  KAGDKKLVNSPWSPAHFYVNPSILNTIFLVSAVKADHFLVNSFFDV  517

>gi|649569140|gb|KDS75238.1| capsid family protein, partial [Parabacteroides distasonis str. 3999B T(B) 6]
Length=390

 Score = 47.4 bits (111),  Expect = 0.004, Method: Compositional matrix adjust.
 Identities = 41/166 (25%), Positives = 74/166 (45%), Gaps = 21/166 (13%)

Query  54   TYYFKEPGYIFDMMTIRP-VYFWTGIRPDYLEYRGPDYFNPIYNDIGYQDVPLWRLGYGW  112
            T YF+E GYI +M+IRP + G+ D+ ++ D++ P + +G Q++ L Y
Sbjct  237  TRYFEEHGYIMGIMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEEL-YLN  295

Query  113  KADTVSSLSVAKEPCYNEFRSSYDEVLGSLQATLTPKASTPLQSYWVQQRDFYLIGLSSN  172
            ++D + + P Y E++ S +EV G + + ++W R F
Sbjct  296  ESDAANEGTFGYTPRYAEYKYSQNEVHGDFRGNM---------AFWHLNRIF-----KEK  341

Query  173  PNEVSPSMLFTNLNTVNNPFAS--DMEDNFFVNMSYKVVVKNLINK  216
            PN + F N N FA+ +D ++V + + L+ K
Sbjct  342  PNL---NTTFVECNPSNRVFATAETSDDKYWVQIYQDIKALRLMPK  384

>gi|649555287|gb|KDS61824.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
 gi|649560568|gb|KDS66876.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
 gi|649561020|gb|KDS67307.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
 gi|649562724|gb|KDS68908.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 6]
Length=541

 Score = 46.2 bits (108),  Expect = 0.009, Method: Compositional matrix adjust.
 Identities = 27/95 (28%), Positives = 50/95 (53%), Gaps = 2/95 (2%)

Query  54   TYYFKEPGYIFDMMTIRP-VYFWTGIRPDYLEYRGPDYFNPIYNDIGYQDVPLWRLGYGW  112
            T YF+E GYI +M+IRP + G+ D+ ++ D++ P + +G Q++ L Y
Sbjct  388  TRYFEEHGYIMGIMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEEL-YLN  446

Query  113  KADTVSSLSVAKEPCYNEFRSSYDEVLGSLQATLT  147
            ++D + + P Y E++ S +EV G + +
Sbjct  447  ESDAANEGTFGYTPRYAEYKYSQNEVHGDFRGNMA  481

>gi|444298000|dbj|GAC77839.1| major capsid protein [uncultured marine virus]
Length=480

 Score = 45.4 bits (106),  Expect = 0.015, Method: Compositional matrix adjust.
 Identities = 44/227 (19%), Positives = 89/227 (39%), Gaps = 29/227 (13%)

Query  1    VDRPKLLFSSSVMVNSQVVLNQAGQSGFEGGESAALGQMGGSISFNTVLGREQTY--YFK  58
            + RP+ + + +N VL + + G + + +G R Y Y +
Sbjct  277  LQRPEYMGGGTTQINFSEVLQTSPE--IPGEDQVSQFGVGDMYGHGIAAMRSNKYRRYIE  334

Query  59   EPGYIFDMMTIRPVYFWT-GIRPDYLEYRGPDYFNPIYNDIGYQDVPLWRLGYGWKADTV  117
            E GYI M+++RP +T GI +L DY+ IG Q++ + + +
Sbjct  335  EHGYIISMLSVRPKTMYTNGIHRSWLRLTKEDYYQKELEHIGQQEIMNNEI---YADEGA  391

Query  118  SSLSVAKEPCYNEFRSSYDEVLGSLQATLTPKASTPLQSYWVQQRDFYLIGLSSNPNEVS  177
            + + Y+E+R + V + L +YW R+F E
Sbjct  392  GTETFGYNDRYSEYRETPSHVSAEFRGIL---------NYWHMAREF----------EAP  432

Query  178  PSM--LFTNLNTVNNPFASDMEDNFFVNMSYKVVVKNLINKSFATRL  222
            P + F + + +D ++ + +K+V + L++++ A R+
Sbjct  433  PVLNQSFVDCDATKRIHNEQTQDALWIMIQHKMVARRLLSRNAAPRI  479

>gi|547312923|ref|WP_022044635.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
 gi|524208404|emb|CCZ76639.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
Length=338

 Score = 43.5 bits (101),  Expect = 0.058, Method: Compositional matrix adjust.
 Identities = 53/192 (28%), Positives = 74/192 (39%), Gaps = 40/192 (21%)

Query  1    VDRPKLLFSSSVMVNSQVVLNQAGQSGFEGGESAALGQMGGSIS----FNTVLGREQTYY  56
            V++P L +N V +A +G GE A LGQ+ + F+ G + YY
Sbjct  94   VNKPDFLGVWQASINPSNV--RAMANGSASGEDANLGQLAACVDRYCDFSGHSGID--YY  149

Query  57   FKEPG--YIFDMMTIRPVYFWTGIRPDYLEYRGPDYFNPIYNDIGYQDVPLWRL-----G  109
            KEPG + M+ P Y G+ PD D FNP N IG+Q VP R G
Sbjct  150  AKEPGTFMLITMLVPEPAYS-QGLHPDLASISFGDDFNPELNGIGFQLVPRHRFSMMPRG  208

Query  110  YG----------WKADTVSS-------LSVAKEPCYNEFRSSYDEVLGSLQATLTPKAST  152
            + W T + +SV +E ++ R+ Y + G A
Sbjct  209  FNFTGLDQEASPWFGHTGTGVLVDPNMVSVGEEVAWSWLRTDYSRLHGDF-------AQN  261

Query  153  PLQSYWVQQRDF  164
            YWV R F
Sbjct  262  GNYQYWVLTRRF  273

>gi|492501782|ref|WP_005867318.1| hypothetical protein [Parabacteroides distasonis]
 gi|409230408|gb|EKN23272.1| hypothetical protein HMPREF1059_03257 [Parabacteroides distasonis CL09T03C24]
Length=538

 Score = 43.5 bits (101),  Expect = 0.077, Method: Compositional matrix adjust.
 Identities = 50/219 (23%), Positives = 92/219 (42%), Gaps = 27/219 (12%)

Query  1    VDRPKLLFSSSVMVNSQVVLNQAGQSGFEGGESAALGQMGGSISFNTVLGREQTYYFKEP  60
            + RP+ L ++ VL Q+ S G IS G ++ YF+E
Sbjct  338  LQRPQFLGGGRTPISVSEVL----QTSATDSTSPQANMAGHGISAGVNHGFKR--YFEEH  391

Query  61   GYIFDMMTIRP-VYFWTGIRPDYLEYRGPDYFNPIYNDIGYQDVPLWRLGYGWKADTVSS  119
            GYI +M+IRP + G+ D+ ++ D++ P + +G Q++ + Y + ++
Sbjct  392  GYIIGIMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEEV-YLQQTPASNN  450

Query  120  LSVAKEPCYNEFRSSYDEVLGSLQATLTPKASTPLQSYWVQQRDFYLIGLSSNPNEVSPS  179
            + P Y E++ S +EV G + + ++W R F S +PN +
Sbjct  451  GTFGYTPRYAEYKYSMNEVHGDFRGNM---------AFWHLNRIF-----SESPNL---N  493

Query  180  MLFTNLNTVNNPFAS--DMEDNFFVNMSYKVVVKNLINK  216
            F N N FA+ +D +++ + V L+ K
Sbjct  494  TTFVECNPSNRVFATAETSDDKYWIQLYQDVKALRLMPK  532

>gi|444298142|dbj|GAC77768.1| major capsid protein [uncultured marine virus]
Length=299

 Score = 42.4 bits (98),  Expect = 0.16, Method: Compositional matrix adjust.
 Identities = 28/104 (27%), Positives = 47/104 (45%), Gaps = 5/104 (5%)

Query  1    VDRPKLLFSSSVMVNSQVVLNQAGQSGFEGGESAALGQMGGSISFNTVLGREQTYYFKEP  60
            + RP+++ + +N VLN G SG + LG+MGG V Y+ +E
Sbjct  160  LQRPEMISTGKSNINFSEVLNTTGPSGV---DDHPLGEMGGH-GIAGVKSNRARYFCEEH  215

Query  61   GYIFDMMTIRP-VYFWTGIRPDYLEYRGPDYFNPIYNDIGYQDV  103
            G+I +M++RP + T + DY+ IG ++V
Sbjct  216  GHIISLMSVRPKTIYMTTQHKQFDRESKEDYWQKELQAIGMEEV  259

>gi|663447907|emb|CDR46793.1| CYFA0S26e00166g1_1 [Cyberlindnera fabianii]
Length=388

 Score = 42.4 bits (98),  Expect = 0.17, Method: Compositional matrix adjust.
 Identities = 47/213 (22%), Positives = 88/213 (41%), Gaps = 39/213 (18%)

Query  17   QVVLNQAGQSGFEGGESAALGQMGGSISFNTVLGREQTYYFKEPGYI---FDMMTIRPVY  73
            Q+++ Q G + F GE+AA+G ++ T Y KEPGY+ D +
Sbjct  80   QILMRQRG-TKFYPGENAAIG-------------KDHTIYAKEPGYVRFYLDPFHPNRKF  125

Query  74   FWTGIRPDYLEYRGPD-YFNPIYNDIGY-----------QDVPLWRLGYGWKADTVSSLS  121
            +RPD R P +F P +GY ++ L R + + + + SL
Sbjct  126  IGVALRPD---LRLPTAHFEPSIRRLGYVPITDAKKAQFEENNLSRKAHLMRPEIIKSLK  182

Query  122  VAKEPCYNEFRSSYDEVLGSLQATLTPKASTPLQSYWVQQRDFYLIGLSSNPNEVSPSML  181
            +E E S+Y + S+ A LT S + + R+ GL N + + + +
Sbjct  183  -QREAKRQELLSTYSSQITSIVADLTESDSQLAAARLLSIRNHIKAGLPLNQAQATTTSV  241

Query  182  FTNLNTVNNPFA------SDMEDNFFVNMSYKV  208
            + + +++ +D N ++++S K+
Sbjct  242  YLHDLKLDSKKGLMTTEDADASKNAYISLSNKI  274

>gi|599088023|gb|AHN52937.1| major capsid protein, partial [uncultured Gokushovirinae]
Length=213

 Score = 40.8 bits (94),  Expect = 0.37, Method: Compositional matrix adjust.
 Identities = 31/104 (30%), Positives = 48/104 (46%), Gaps = 9/104 (9%)

Query  1    VDRPKLLFSSSVMVNSQVVLNQAGQSGFEGGESAALGQMGGSISFNTVLGREQTYYFKEP  60
            + RP+ + S MVN V N AGQSG G+ A+G + GS + TY E
Sbjct  112  IQRPEYIGGGSSMVNVTPVANTAGQSGDYVGQLGAMGTVSGS--------HDWTYSAVEH  163

Query  61   GYIFDMMTIR-PVYFWTGIRPDYLEYRGPDYFNPIYNDIGYQDV  103
            G I + +R + + G+ + + D++ P+ IG Q V
Sbjct  164  GVIIGLANVRGDITYSQGLERYWSKSTRYDFYYPVLAQIGEQAV  207

Lambda      K        H        a         alpha
   0.318    0.135    0.406    0.792     4.96

Gapped
Lambda      K        H        a         alpha    sigma
   0.267    0.0410   0.140    1.90      42.6     43.6

Effective search space used: 880107843099
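
The bit scores in this report can be cross-checked against the gapped statistics in the footer. The sketch below is not part of the BLAST output. It applies the standard Karlin-Altschul conversion, bits = (Lambda * S - ln K) / ln 2, to the raw scores shown in parentheses on each "Score =" line, using the gapped Lambda and K values copied from the footer. The function names `bit_score` and `expect_value` are illustrative. The E-value estimate (search space times 2 to the power of minus the bit score) is only a rough approximation: the report uses compositional matrix adjustment and per-subject corrections, so the printed Expect values differ between hits with the same bit score.

```python
import math

# Gapped Karlin-Altschul parameters and search space, copied from the report footer.
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410
EFFECTIVE_SEARCH_SPACE = 880107843099


def bit_score(raw_score, lam=GAPPED_LAMBDA, k=GAPPED_K):
    """Convert a raw alignment score S to bits: (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(bits, search_space=EFFECTIVE_SEARCH_SPACE):
    """Rough estimate E = search_space * 2**(-bits).

    Assumption: ignores the per-subject and compositional adjustments
    that BLAST applies, so it will not reproduce the report exactly.
    """
    return search_space * 2.0 ** (-bits)


if __name__ == "__main__":
    # Raw scores taken from the "Score = ... bits (raw)" lines of the report.
    for raw in (111, 108, 106, 101, 98, 94):
        bits = bit_score(raw)
        print(f"raw={raw:3d}  bits={bits:5.1f}  E~{expect_value(bits):.2g}")
```

Rounded to one decimal place, the computed bit scores can be compared directly with the (Bits) column of the hit table. The estimated E-values should be read only as order-of-magnitude checks.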