Bit score colors: <40, 40-50, 50-80, 80-200, >200
BLASTP 2.2.30+

Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A. Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402.

Reference for composition-based statistics: Alejandro A. Schaffer, L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001), "Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements", Nucleic Acids Res. 29:2994-3005.

Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF excluding environmental samples from WGS projects
           49,011,213 sequences; 17,563,301,199 total letters

Query= Contig-17_CDS_annotation_glimmer3.pl_2_1

Length=360

                                                                    Score     E
Sequences producing significant alignments:                        (Bits)   Value

gi|492501782|ref|WP_005867318.1|  hypothetical protein              85.9    4e-15
gi|547312923|ref|WP_022044635.1|  putative uncharacterized protein  77.8    8e-13
gi|649557305|gb|KDS63784.1|       capsid family protein             75.5    2e-12
gi|547920049|ref|WP_022322420.1|  capsid protein VP1                77.8    2e-12
gi|649569140|gb|KDS75238.1|       capsid family protein             75.5    8e-12
gi|649555287|gb|KDS61824.1|       capsid family protein             75.1    1e-11
gi|494610271|ref|WP_007368517.1|  capsid protein                    66.2    1e-08
gi|599087961|gb|AHN52906.1|       major capsid protein              62.4    3e-08
gi|599088027|gb|AHN52939.1|       major capsid protein              62.8    3e-08
gi|599087475|gb|AHN52663.1|       major capsid protein              62.0    4e-08

>gi|492501782|ref|WP_005867318.1| hypothetical protein [Parabacteroides distasonis]
 gi|409230408|gb|EKN23272.1| hypothetical protein HMPREF1059_03257 [Parabacteroides distasonis CL09T03C24]
Length=538

Score = 85.9 bits (211), Expect = 4e-15, Method: Compositional matrix adjust.
Identities = 76/265 (29%), Positives = 118/265 (45%), Gaps = 16/265 (6%)

Query  97   VDVTDGTLSMDALNLSQKVYNFLNRIAVSGGSYRDWLETVYTGGNYMERCETPMFEGGVS  156
            V+V + +S++ L S + + R A SG Y + + + + + R + P F GG
Sbjct  289  VNVDELGVSINDLRTSNALQRWFERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGR  348

Query  157  QEIVFQEVISNSATEN-EPLGTLAGRGVTTGRQKGGHIRIKITEPCYIMCICSITPRIDY  215
            I EV+ SAT++ P +AG G++ G G + E YI+ I SI PR Y
Sbjct  349  TPISVSEVLQTSATDSTSPQANMAGHGISAGVNHG--FKRYFEEHGYIIGIMSIRPRTGY  406

Query  216  GQGNTWDTYLETMDDWHKPALDGIGYQDSLNGERAWWTDYLTADPDLKRTSAGKTVAWIN  275
            QG D D++ P +G Q+ N E YL P + G T +
Sbjct  407  QQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEEV-----YLQQTPASNNGTFGYTPRYAE  461

Query  276  YMTNVNRTFGNFAPGMSESFMVLNRNYSMNNSASPQIEDLTTYIDPVKFNYIFADANIDA  335
            Y ++N G+F M +F LNR + S SP + TT+++ N +FA A
Sbjct  462  YKYSMNEVHGDFRGNM--AFWHLNRIF----SESPNLN--TTFVECNPSNRVFATAETSD  513

Query  336  MNFWVQTKFEIKARRLISAKQIPNL  360
            +W+Q ++KA RL+ P L
Sbjct  514  DKYWIQLYQDVKALRLMPKYGTPML  538

>gi|547312923|ref|WP_022044635.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
 gi|524208404|emb|CCZ76639.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
Length=338

Score = 77.8 bits (190), Expect = 8e-13, Method: Compositional matrix adjust.
Identities = 87/335 (26%), Positives = 140/335 (42%), Gaps = 48/335 (14%)

Query  65   GLCLKTYNSDLYQNWINTEWIEGVDGINEASAVDV---TDGTLSMDALNLSQKVYNFLNR  121
            GL Y+ DL+ N I V+ I +A+D+ T ++++ L L K+ N+++R
Sbjct  11   GLLSVPYSPDLFGNIIKQGSSPAVE-IEVMNALDLNISTGFSVAVPELRLRTKIQNWMDR  69

Query  122  IAVSGGSYRDWLETVYTGGNYMERCETPMFEGGVSQEIVFQEVISNSATENEPLGTLAGR  181
            + VSGG D T++ + P F G V+Q I+ S G+ +G
Sbjct  70   LFVSGGRVGDVFRTLWGTKSSAIYVNKPDFLG------VWQASINPSNVRAMANGSASGE  123

Query  182  GVTTGRQKG---------GH--IRIKITEPCYIMCICSITPRIDYGQGNTWDTYLETMDD  230
            G+ GH I EP M I + P Y QG D + D
Sbjct  124  DANLGQLAACVDRYCDFSGHSGIDYYAKEPGTFMLITMLVPEPAYSQGLHPDLASISFGD  183

Query  231  WHKPALDGIGYQ----------------DSLNGERAWWTDY----LTADPDLKRTSAGKT  270
            P L+GIG+Q L+ E + W + + DP++ S G+
Sbjct  184  DFNPELNGIGFQLVPRHRFSMMPRGFNFTGLDQEASPWFGHTGTGVLVDPNM--VSVGEE  241

Query  271  VAWINYMTNVNRTFGNFAPGMSESFMVLNRNYSM----NNSASPQIEDLT-TYIDPVKFN  325
            VAW T+ +R G+FA + + VL R ++ + + Q + T TYI+P+ +
Sbjct  242  VAWSWLRTDYSRLHGDFAQNGNYQYWVLTRRFTTYFPDDGTGFYQDGEYTGTYINPLDWQ  301

Query  326  YIFADANIDAMNFWVQTKFEIKARRLISAKQIPNL  360
            Y+F D + A NF F++ +SA +P L
Sbjct  302  YVFVDQTLMAGNFAYYGTFDLNVTSSLSANYMPYL  336

>gi|649557305|gb|KDS63784.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
 gi|649559156|gb|KDS65543.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 6]
Length=245

Score = 75.5 bits (184), Expect = 2e-12, Method: Compositional matrix adjust.
Identities = 71/258 (28%), Positives = 109/258 (42%), Gaps = 16/258 (6%)

Query  104  LSMDALNLSQKVYNFLNRIAVSGGSYRDWLETVYTGGNYMERCETPMFEGGVSQEIVFQE  163
            ++++ + S + + R A SG Y + + + + + R + P F GG I E
Sbjct  3    VNINDIRTSNALQRWFERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSE  62

Query  164  VISNSATEN-EPLGTLAGRGVTTGRQKGGHIRIKITEPCYIMCICSITPRIDYGQGNTWD  222
            V+ S+T++ P +AG G++ G G E YIM I SI PR Y QG D
Sbjct  63   VLQTSSTDSTSPQANMAGHGISAGVNHG--FTRYFEEHGYIMGIMSIRPRTGYQQGVPKD  120

Query  223  TYLETMDDWHKPALDGIGYQDSLNGERAWWTDYLTADPDLKRTSAGKTVAWINYMTNVNR  282
            D++ P +G Q+ N E YL + G T + Y + N
Sbjct  121  FRKFDNMDFYFPEFAHLGEQEIKNEEL-----YLNESDAANEGTFGYTPRYAEYKYSQNE  175

Query  283  TFGNFAPGMSESFMVLNRNYSMNNSASPQIEDLTTYIDPVKFNYIFADANIDAMNFWVQT  342
            G+F M+ F LNR + P + TT+++ N +FA A +WVQ
Sbjct  176  VHGDFRGNMA--FWHLNRIFK----EKPNLN--TTFVECNPSNRVFATAETSDDKYWVQI  227

Query  343  KFEIKARRLISAKQIPNL  360
            +IKA RL+ P L
Sbjct  228  YQDIKALRLMPKYGTPML  245

>gi|547920049|ref|WP_022322420.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
 gi|524592961|emb|CDD13573.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
Length=553

Score = 77.8 bits (190), Expect = 2e-12, Method: Compositional matrix adjust.
Identities = 72/266 (27%), Positives = 118/266 (44%), Gaps = 18/266 (7%)

Query  97   VDVTDGTLSMDALNLSQKVYNFLNRIAVSGGSYRDWLETVYTGGNYMERCETPMFEGGVS  156
            V+V + ++++ L S + + R A G Y + + + + + R + P F GG
Sbjct  304  VNVDEMGININDLRTSNALQRWFERNARGGSRYIEQILSHFGVRSSDARLQRPQFLGGGR  363

Query  157  QEIVFQEVISNSAT-ENEPLGTLAGRGVTTGRQKGGHIRIKITEPCYIMCICSITPRIDY  215
            I EV+ S+T E P +AG G++ G G + E YI+ I SITPR Y
Sbjct  364  MPISVSEVLQTSSTDETSPQANMAGHGISAGINNG--FKHYFEEHGYIIGIMSITPRSGY  421

Query  216  GQGNTWD-TYLETMDDWHKPALDGIGYQDSLNGERAWWTDYLTADPDLKRTSAGKTVAWI  274
            QG D T + MD ++ P + Q+ N E +++ D + G T +
Sbjct  422  QQGVPRDFTKFDNMD-FYFPEFAHLSEQEIKNQEL-----FVSEDAAYNNGTFGYTPRYA  475

Query  275  NYMTNVNRTFGNFAPGMSESFMVLNRNYSMNNSASPQIEDLTTYIDPVKFNYIFADANID  334
            Y + + G+F +S F LNR + P + TT+++ N +FA + +
Sbjct  476  EYKYHPSEAHGDFRGNLS--FWHLNRIFE----DKPNLN--TTFVECKPSNRVFATSETE  527

Query  335  AMNFWVQTKFEIKARRLISAKQIPNL  360
            FWVQ ++KA RL+ P L
Sbjct  528  DDKFWVQMYQDVKALRLMPKYGTPML  553

>gi|649569140|gb|KDS75238.1| capsid family protein, partial [Parabacteroides distasonis str. 3999B T(B) 6]
Length=390

Score = 75.5 bits (184), Expect = 8e-12, Method: Compositional matrix adjust.
Identities = 71/258 (28%), Positives = 109/258 (42%), Gaps = 16/258 (6%)

Query  104  LSMDALNLSQKVYNFLNRIAVSGGSYRDWLETVYTGGNYMERCETPMFEGGVSQEIVFQE  163
            ++++ + S + + R A SG Y + + + + + R + P F GG I E
Sbjct  148  VNINDIRTSNALQRWFERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSE  207

Query  164  VISNSATEN-EPLGTLAGRGVTTGRQKGGHIRIKITEPCYIMCICSITPRIDYGQGNTWD  222
            V+ S+T++ P +AG G++ G G E YIM I SI PR Y QG D
Sbjct  208  VLQTSSTDSTSPQANMAGHGISAGVNHG--FTRYFEEHGYIMGIMSIRPRTGYQQGVPKD  265

Query  223  TYLETMDDWHKPALDGIGYQDSLNGERAWWTDYLTADPDLKRTSAGKTVAWINYMTNVNR  282
            D++ P +G Q+ N E YL + G T + Y + N
Sbjct  266  FRKFDNMDFYFPEFAHLGEQEIKNEEL-----YLNESDAANEGTFGYTPRYAEYKYSQNE  320

Query  283  TFGNFAPGMSESFMVLNRNYSMNNSASPQIEDLTTYIDPVKFNYIFADANIDAMNFWVQT  342
            G+F M+ F LNR + P + TT+++ N +FA A +WVQ
Sbjct  321  VHGDFRGNMA--FWHLNRIFK----EKPNLN--TTFVECNPSNRVFATAETSDDKYWVQI  372

Query  343  KFEIKARRLISAKQIPNL  360
            +IKA RL+ P L
Sbjct  373  YQDIKALRLMPKYGTPML  390

>gi|649555287|gb|KDS61824.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
 gi|649560568|gb|KDS66876.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
 gi|649561020|gb|KDS67307.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
 gi|649562724|gb|KDS68908.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 6]
Length=541

Score = 75.1 bits (183), Expect = 1e-11, Method: Compositional matrix adjust.
Identities = 71/258 (28%), Positives = 109/258 (42%), Gaps = 16/258 (6%)

Query  104  LSMDALNLSQKVYNFLNRIAVSGGSYRDWLETVYTGGNYMERCETPMFEGGVSQEIVFQE  163
            ++++ + S + + R A SG Y + + + + + R + P F GG I E
Sbjct  299  VNINDIRTSNALQRWFERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSE  358

Query  164  VISNSATEN-EPLGTLAGRGVTTGRQKGGHIRIKITEPCYIMCICSITPRIDYGQGNTWD  222
            V+ S+T++ P +AG G++ G G E YIM I SI PR Y QG D
Sbjct  359  VLQTSSTDSTSPQANMAGHGISAGVNHG--FTRYFEEHGYIMGIMSIRPRTGYQQGVPKD  416

Query  223  TYLETMDDWHKPALDGIGYQDSLNGERAWWTDYLTADPDLKRTSAGKTVAWINYMTNVNR  282
            D++ P +G Q+ N E YL + G T + Y + N
Sbjct  417  FRKFDNMDFYFPEFAHLGEQEIKNEEL-----YLNESDAANEGTFGYTPRYAEYKYSQNE  471

Query  283  TFGNFAPGMSESFMVLNRNYSMNNSASPQIEDLTTYIDPVKFNYIFADANIDAMNFWVQT  342
            G+F M+ F LNR + P + TT+++ N +FA A +WVQ
Sbjct  472  VHGDFRGNMA--FWHLNRIFK----EKPNLN--TTFVECNPSNRVFATAETSDDKYWVQI  523

Query  343  KFEIKARRLISAKQIPNL  360
            +IKA RL+ P L
Sbjct  524  YQDIKALRLMPKYGTPML  541

>gi|494610271|ref|WP_007368517.1| capsid protein [Prevotella multiformis]
 gi|324988543|gb|EGC20506.1| putative capsid protein (F protein) [Prevotella multiformis DSM 16608]
Length=531

Score = 66.2 bits (160), Expect = 1e-08, Method: Compositional matrix adjust.
Identities = 63/248 (25%), Positives = 106/248 (43%), Gaps = 34/248 (14%)

Query  144  ERCETPMFEGGVSQEIVFQEVISNS-----ATENEPLGTLAGRGVTTGRQKGGHIRIKIT  198
            R F GG +V EV++ S A E+ LG L G+GV G I +
Sbjct  287  SRAGDARFIGGFDNPVVISEVVNQSEFDRGADESPCLGDLGGKGV--GSLNSSSIDFDVK  344

Query  199  EPCYIMCICSITPRIDYGQGNTWDTYLETM--DDWHKPALDGIGYQDSLNGE--RAWWTD  254
            E IMCI S+ P+ +Y G +D + + +D+ +P +GYQ + + + +
Sbjct  345  EHGIIMCIYSVVPQTEY-NGTYFDPFNRKLRREDFFQPEFADLGYQPVVTSDLISTYLDN  403

Query  255  YLTADPD-LKRTSAGKTVAWIN--------------YMTNVNRTFGNFAPGMSESFMVLN  299
            + P+ KR +AG ++ I Y T+ + FG F G+S S+
Sbjct  404  PVPDGPEKQKRLAAGYPLSSIEANNRLLGWQVRYNEYKTSRDLVFGEFESGLSLSYWCSP  463

Query  300  R-NYSMNNSASPQI------EDLTTYIDPVKFNYIFADANIDAMNFWVQTKFEIKARRLI  352
            R ++ + A + Y++P N IF + + A +F V + F++KA R +
Sbjct  464  RYDFGFDGKAGDKKLVNSPWSPAHFYVNPSILNTIFLVSAVKADHFLVNSFFDVKAVRPM  523

Query  353  SAKQIPNL  360
            S + L
Sbjct  524  SVSGLAGL  531

>gi|599087961|gb|AHN52906.1| major capsid protein, partial [uncultured Gokushovirinae]
Length=210

Score = 62.4 bits (150), Expect = 3e-08, Method: Compositional matrix adjust.
Identities = 47/144 (33%), Positives = 65/144 (45%), Gaps = 3/144 (2%)

Query  105  SMDALNLSQKVYNFLNRIAVSGGSYRDWLETVYTGGNYMERCETPMFEGGVSQEIVFQEV  164
            +++ L + ++ L R A SG Y + ++ + G N+M+ P F GG S I V
Sbjct  68   TINQLRQAFQIQKLLERDARSGTRYSEIVKAHF-GVNFMDVTYRPEFLGGTSTPINVTSV  126

Query  165  ISNSATENEPLGTLAGRGVTTGRQKGGHIRIKITEPCYIMCICSITPRIDYGQGNTWDTY  224
            S + P GTLA G T GG TE C +M I S+ + Y QG
Sbjct  127  PQTSESGTTPQGTLAAFGTAT--INGGGFTKSFTEHCIVMGIASVRADLTYQQGLNRMFS  184

Query  225  LETMDDWHKPALDGIGYQDSLNGE  248
            T D++ PAL IG Q LN E
Sbjct  185  RSTRYDFYFPALAHIGEQSVLNKE  208

>gi|599088027|gb|AHN52939.1| major capsid protein, partial [uncultured Gokushovirinae]
Length=219

Score = 62.8 bits (151), Expect = 3e-08, Method: Compositional matrix adjust.
Identities = 47/144 (33%), Positives = 65/144 (45%), Gaps = 3/144 (2%)

Query  105  SMDALNLSQKVYNFLNRIAVSGGSYRDWLETVYTGGNYMERCETPMFEGGVSQEIVFQEV  164
            +++ L + ++ L R A SG Y + ++ + G N+M+ P F GG S I V
Sbjct  77   TINQLRQAFQIQKLLERDARSGTRYAEIVKAHF-GVNFMDVTYRPEFLGGTSTPINVTSV  135

Query  165  ISNSATENEPLGTLAGRGVTTGRQKGGHIRIKITEPCYIMCICSITPRIDYGQGNTWDTY  224
            S + P GTLA G T GG TE C +M I S+ + Y QG
Sbjct  136  PQTSESGTTPQGTLAAFGTAT--VNGGGFTKSFTEHCIVMGIASVRADLTYQQGLNRMFS  193

Query  225  LETMDDWHKPALDGIGYQDSLNGE  248
            T D++ PAL IG Q LN E
Sbjct  194  RSTRYDFYFPALAHIGEQAVLNKE  217

>gi|599087475|gb|AHN52663.1| major capsid protein, partial [uncultured Gokushovirinae]
Length=210

Score = 62.0 bits (149), Expect = 4e-08, Method: Compositional matrix adjust.
Identities = 47/144 (33%), Positives = 65/144 (45%), Gaps = 3/144 (2%)

Query  105  SMDALNLSQKVYNFLNRIAVSGGSYRDWLETVYTGGNYMERCETPMFEGGVSQEIVFQEV  164
            +++ L + ++ L R A SG Y + ++ + G N+M+ P F GG S I V
Sbjct  68   TINQLRQAFQIQKLLERDARSGTRYSEIVKAHF-GVNFMDVTYRPEFLGGTSTPINVTSV  126

Query  165  ISNSATENEPLGTLAGRGVTTGRQKGGHIRIKITEPCYIMCICSITPRIDYGQGNTWDTY  224
            S + P GTLA G T GG TE C +M I S+ + Y QG
Sbjct  127  PQTSESGTTPQGTLAAFGTAT--INGGGFTKSFTEHCILMGIASVRADLTYQQGLNRMFS  184

Query  225  LETMDDWHKPALDGIGYQDSLNGE  248
            T D++ PAL IG Q LN E
Sbjct  185  RSTRYDFYFPALAHIGEQSVLNKE  208

Lambda      K        H        a         alpha
   0.317    0.133    0.407    0.792     4.96

Gapped
Lambda      K        H        a         alpha    sigma
   0.267   0.0410    0.140     1.90     42.6     43.6

Effective search space used: 2164993027482
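The bit scores reported for each hit appear to follow from the raw scores in parentheses through the standard conversion S' = (Lambda * S - ln K) / ln 2, using the gapped Lambda (0.267) and K (0.0410) printed in the statistics footer. The Python sketch below illustrates that relationship; the function and constant names are hypothetical, and because composition-based adjustment is applied per hit, the footer parameters should be read as an approximation rather than the exact values BLAST used for every alignment.

```python
import math

# Gapped Karlin-Altschul parameters copied from the report footer.
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410


def bit_score(raw_score, lam=GAPPED_LAMBDA, k=GAPPED_K):
    """Convert a raw alignment score S to a bit score: (lambda*S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


# Raw scores taken from the "Score = ... bits (raw)" lines of three hits.
for raw in (211, 190, 160):
    print(raw, round(bit_score(raw), 1))
```

For the raw scores 211, 190 and 160 this gives approximately 85.9, 77.8 and 66.2 bits, in line with the values listed for the corresponding hits.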