EBOV Presentation and microbiology .pptx

MohamedHasan816582 15 views 51 slides Mar 06, 2025

Slide Content

Prediction and analysis of Ebola Virus protein by bioinformatics tools Yasmen Mahmoud El-Shamy 1

Supervisors : 2

INTRODUCTION 3

Ebolavirus structure, indicating the various proteins and the genes that code for them. The genome has the following organization: 3′-leader → nucleoprotein (NP) gene → viral protein (VP) 35 gene → VP40 gene → glycoprotein (GP) gene → VP30 gene → VP24 gene → polymerase (L) gene → 5′-trailer 4

The L protein of Ebola virus (EBOV): The L protein is 2212 amino acids in length and is the largest protein encoded by EBOV. It is an RNA-dependent RNA polymerase (RdRp) and forms the RdRp complex with VP30 that is responsible for viral genome transcription and replication. The L protein of EBOV (2212 amino acids, accession number Q05318) was retrieved from the UniProt Knowledgebase (UniProtKB) and used in this study. 5
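The retrieval step can be sketched in a few lines of Python. This is only an illustration, not the procedure used in the study: the URL pattern `https://rest.uniprot.org/uniprotkb/<accession>.fasta` is an assumption about the UniProt REST service, and `parse_fasta` and `fetch_sequence` are hypothetical helper names.

```python
from urllib.request import urlopen

# Assumed URL pattern for a single-entry FASTA download from UniProt.
UNIPROT_FASTA_URL = "https://rest.uniprot.org/uniprotkb/{accession}.fasta"


def parse_fasta(text):
    """Return (header, sequence) for the first record in FASTA-formatted text."""
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                break  # a second record starts here; keep only the first
            header = line[1:]
        elif header is not None:
            chunks.append(line)
    if header is None:
        raise ValueError("no FASTA header found")
    return header, "".join(chunks)


def fetch_sequence(accession, timeout=30):
    """Download one entry (network access required) and parse it."""
    url = UNIPROT_FASTA_URL.format(accession=accession)
    with urlopen(url, timeout=timeout) as response:
        return parse_fasta(response.read().decode("utf-8"))
```

Calling `fetch_sequence("Q05318")` and checking `len(sequence)` against the 2212 residues stated on the slide is a quick sanity check that the intended entry was retrieved.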

Protein structure prediction is a proteomic tool for understanding phenomena in modern molecular and cell biology, and it has important applications in biotechnology and medicine. It involves sequence similarity searches, multiple sequence alignments, identification and characterization of domains, secondary structure prediction, solvent accessibility prediction, automatic protein fold recognition, construction of three-dimensional models to atomic detail, and model validation. 6

Three approaches of protein structure prediction 7

Objective 8

PROPOSED WORK 9

The L protein of EBOV (2212 amino acids, accession number Q05318) was retrieved from the UniProt Knowledgebase (UniProtKB) and used in this study. 10

11

Analysis of the L protein includes the prediction of: 12

13

Results 14

15

Multiple Sequence Alignment 16

The predicted conserved regions of the L protein and their average entropy (Hx). 17

Region  Position   Sequence length  Average entropy (Hx)
1       1-78       78               0.0567
2       80-129     50               0.0553
3       131-196    66               0.0566
4       216-258    43               0.0609
5       260-336    77               0.0563
6       347-613    267              0.0555
7       615-691    77               0.0545
8       693-758    66               0.0518
9       760-873    114              0.0618
10      875-943    69               0.0556
11      945-1059   115              0.0531
12      1061-1141  81               0.0565
13      1143-1182  40               0.0560
14      1184-1382  199              0.0540
15      1384-1404  21               0.0507
16      1406-1561  156              0.0560
17      1563-1593  31               0.0560
18      1616-1653  38               0.0560
19      1709-1728  20               0.0627
20      1758-1773  16               0.0551
21      1775-1823  49               0.0560
22      1827-1871  45               0.0485
23      1873-1887  15               0.0560
24      1889-1913  25               0.0605
25      1920-1943  24               0.0537
26      1952-2027  76               0.0523
27      2029-2049  21               0.0560
28      2052-2084  33               0.0607
29      2086-2210  125              0.0550
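For readers unfamiliar with the entropy column, the sketch below shows one common way to obtain such values: the Shannon entropy of each alignment column, averaged over a region. It assumes the natural logarithm and treats a gap as an ordinary symbol; the slides do not state which tool or settings produced the numbers above, and the function names and toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy H = -sum(p * ln p) of one alignment column."""
    total = len(column)
    counts = Counter(column)
    return -sum((n / total) * math.log(n / total) for n in counts.values())


def alignment_entropies(aligned_sequences):
    """Per-column entropy for equal-length aligned sequences."""
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("aligned sequences must have equal length")
    return [
        column_entropy([seq[i] for seq in aligned_sequences])
        for i in range(length)
    ]


def average_entropy(entropies, start, end):
    """Mean entropy over positions start..end (1-based, inclusive)."""
    region = entropies[start - 1:end]
    return sum(region) / len(region)


# Toy alignment (not data from the study): only column 2 varies.
toy_alignment = ["MKV", "MKV", "MRV", "MKV"]
toy_entropies = alignment_entropies(toy_alignment)
toy_average = average_entropy(toy_entropies, 1, 3)
```

A fully conserved column has entropy 0, so low average values such as those in the table indicate regions where nearly all aligned sequences share the same residue.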

18

Phylogenetic analysis: Maximum Likelihood method, Neighbor-Joining (NJ) method, UPGMA method 19

20

Domain separation: Using ThreaDom, the L protein was split into three domains (1-392), (393-803) and (804-2212), with cutoff 0.56, score 1176.74 bits and E-value 0e+00. 21

In addition, three conserved domains were identified with the NCBI Conserved Domain search: Mononeg_RNA_pol (10-1089), Mononeg_mRNAcap (1105-1357) and paramyx_RNAcap (1215-2204). Moreover, two conserved domains are already annotated in UniProtKB: RdRp catalytic (625-809) and Mononegavirus-type SAM-dependent 2'-O-MTase (1805-2003). 22

23

Predicted secondary structure of the five domains of the L protein of Ebola virus. 24
Strand and Helix: secondary structure (RePROF). Exposed and Buried: solvent accessibility (RePROF). Disordered: disordered regions (Meta Disorder).

Domain                                                   Strand  Helix  Exposed  Buried  Disordered
Domain 1 (1-392)                                         12%     43%    31%      39%     2%
Domain 2 (393-803)                                       10%     40%    33%      36%     3%
Domain 3 (804-2212)                                      7%      37%    35%      38%     12%
RDRP catalytic (625-809)                                 18%     38%    26%      44%     1%
Mononegavirus-type SAM-dependent 2'-O-MTase (1805-2003)  21%     29%    29%      39%     1%
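Percentages like those above are typically derived from a per-residue state string. The sketch below assumes a three-state encoding (H = helix, E = strand, L = other), which is an assumption about the predictor's output format; `state_composition` and the toy string are hypothetical.

```python
from collections import Counter


def state_composition(states):
    """Return the percentage of each per-residue state, rounded to integers."""
    if not states:
        raise ValueError("empty state string")
    counts = Counter(states)
    total = len(states)
    return {state: round(100 * n / total) for state, n in sorted(counts.items())}


# Toy prediction string (not from the study): 4 helix, 2 strand, 4 other.
toy_composition = state_composition("HHHHEELLLL")
```

Because residues that are neither helix nor strand fall into the remaining state, the strand and helix columns of the table are not expected to sum to 100%.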

3-D Structure Prediction: 25

26

Table 1: Evaluation of 3-D structure models from selected servers for domain 1 (1-392) of the L protein. 27

Refinement    Server       RMSD   TM-score  GDT-TS  GDT-HA  QMEAN  QMEAN Z-score  MolProbity  Clash score  Ramachandran favoured  Overall quality factor  Aligned length
3Drefine      I-TASSER     0.231  0.81778   0.0309  0.9942  0.44   -7.14          3.547       39.81        80.98%                 92.6893                 342
3Drefine      LOMETS       0.166  0.74749   0.0296  1.0000  0.41   -4.73          3.094       84.70        89.46%                 62.7297                 340
3Drefine      SWISS-MODEL  0.146  0.67595   0.0237  1.0000  0.43   -4.54          2.189       13.50        89.74%                 81.7935                 316
3Drefine      GalaxyWEB    ..     ..        ..      ..      ..     ..             ..          ..           ..                     ..                      ..
3Drefine      Phyre2       0.181  0.82248   0.0092  1.0000  0.65   -1.72          2.677       28.49        95.03%                 80.7018                 171
ModRefiner    I-TASSER     1.285  0.9761    0.0314  0.0156  0.44   -6.47          3.02        72.91        89.20%                 84.0731                 341
ModRefiner    LOMETS       1.099  0.9800    0.0297  0.0153  0.42   -4.63          3.30        79.50        91.52%                 73.3681                 340
ModRefiner    SWISS-MODEL  1.259  0.9753    0.0237  0.0130  0.41   -5.08          2.65        50.31        92.37%                 74.8649                 312
ModRefiner    GalaxyWEB    0.920  0.9858    0.0260  0.0143  0.45   -3.85          2.45        46.98        95.63%                 76.1155                 322
ModRefiner    Phyre2       0.929  0.9865    0.0293  0.0154  0.44   -5.92          2.63        49.30        92.70%                 86.5672                 343
GalaxyRefine  I-TASSER     0.515  0.1085    0.0306  0.9143  0.46   -5.74          2.652       27.3         89.72%                 82.058                  342
GalaxyRefine  LOMETS       0.472  0.1044    0.0292  0.9271  0.42   -4.19          2.492       32.1         92.54%                 77.3333                 340
GalaxyRefine  SWISS-MODEL  0.408  0.67798   0.0239  0.9555  0.43   -4.43          2.044       13.4         94.74%                 74.5902                 314
GalaxyRefine  GalaxyWEB    0.316  0.77988   0.0263  0.9808  0.45   -3.55          2.226       16.6         92.54%                 78.5146                 322
GalaxyRefine  Phyre2       0.474  0.93164   0.0293  0.9316  0.44   -5.19          1.992       12.6         94.38%                 95.2681                 343

Table 2: Evaluation of 3-D structure models from selected servers for domain 2 (393-803) of the L protein. 28

Refinement    Server       RMSD   TM-score  GDT-TS  GDT-HA  QMEAN  QMEAN Z-score  MolProbity  Clash score  Ramachandran favoured  Overall quality factor  Aligned length
3Drefine      I-TASSER     0.188  0.81881   1.0000  0.9985  0.57   -5.90          3.112       24.69        85.66%                 90.4573                 383
3Drefine      LOMETS       0.149 0.0470 1.0000 1.0000 0.0140 0.60 -1.91 2.367 43.09 96.17% 67.7966 183  [12 values on the slide for 11 columns; left unaligned]
3Drefine      SWISS-MODEL  0.144  0.95026   1.0000  1.0000  0.63   -3.93          2.103       15.61        93.12%                 87.6494                 392
3Drefine      GalaxyWEB    ..     ..        ..      ..      ..     ..             ..          ..           ..                     ..                      ..
3Drefine      Phyre2       0.278  0.99327   0.9985  0.9965  0.62   -3.71          2.448       60.69        93.44%                 63.9583                 393
ModRefiner    I-TASSER     0.766  0.9913    0.0181  0.0105  0.55   -5.45          2.97        66.19        90.37%                 80.9145                 386
ModRefiner    LOMETS       0.768  0.9913    0.0188  0.0113  0.54   -4.24          3.06        74.30        92.73%                 66.998                  385
ModRefiner    SWISS-MODEL  0.550  0.9954    0.0190  0.0111  0.62   -4.14          2.62        51.19        95.68%                 67.8                    393
ModRefiner    GalaxyWEB    0.620  0.9943    0.0196  0.0107  0.56   -5.55          2.69        47.92        94.11%                 68.9379                 378
ModRefiner    Phyre2       1.377  0.9808    0.0186  0.0114  0.59   -5.69          2.81        62.64        93.09%                 71.134                  392
GalaxyRefine  I-TASSER     0.442  0.82056   0.0185  0.9472  0.59   -4.08          2.310       19.6         91.75%                 87.6494                 383
GalaxyRefine  LOMETS       0.430  0.82132   0.0192  0.9550  0.57   -4.28          2.248       22.9         94.11%                 82.0359                 384
GalaxyRefine  SWISS-MODEL  0.349  0.95544   0.0190  0.9785  0.63   -3.78          1.882       13.9         96.46%                 82.1285                 393
GalaxyRefine  GalaxyWEB    0.292  0.84226   0.0196  0.9863  0.58   -4.39          2.064       13.5         93.52%                 80.0821                 378
GalaxyRefine  Phyre2       ..     ..        ..      ..      ..     ..             ..          ..           ..                     ..                      ..

Table 3: Evaluation of 3-D structure models from selected servers for domain 3 (804-2212) of the L protein. 29

Refinement    Server       RMSD   TM-score  GDT-TS  GDT-HA  QMEAN  QMEAN Z-score  MolProbity  Clash score  Ramachandran favoured  Overall quality factor  Aligned length
3Drefine      I-TASSER     0.226  0.49377   0.9996  0.9933  0.41   -8.93          3.539       36.42        77.83%                 86.5096                 855
3Drefine      LOMETS       0.245  0.57134   1.0000  0.9911  0.39   -10.58         3.862       167.87       82.87%                 45.5587                 1009
3Drefine      SWISS-MODEL  0.154  0.77260   1.0000  0.9995  0.38   -6.64          2.729       19.87        85.61%                 84.9964                 1115
3Drefine      GalaxyWEB    ..     ..        ..      ..      ..     ..             ..          ..           ..                     ..                      ..
3Drefine      Phyre2       0.834  0.93152   0.9881  0.9790  0.39   -7.52          3.252       89.30        85.27%                 57.4324                 1133
ModRefiner    I-TASSER     1.146  0.9913    0.0160  0.0101  0.40   -9.92          3.53        107.10       79.39%                 61.8808                 856
ModRefiner    LOMETS       6.930  0.9487    0.0176  0.0107  0.37   -9.68          3.77        175.35       83.08%                 45.977                  987
ModRefiner    SWISS-MODEL  0.536  0.9980    0.0159  0.0093  0.37   -7.53          3.05        67.43        85.54%                 61.9614                 1115
ModRefiner    GalaxyWEB    ..     ..        ..      ..      ..     ..             ..          ..           ..                     ..                      ..
ModRefiner    Phyre2       6.660  0.8122    0.0153  0.0095  0.38   -8.99          3.51        94.86        84.07%                 56.0345                 1104
GalaxyRefine  I-TASSER     0.472  0.1349    0.0160  0.9296  0.42   -7.01          2.793       29.4         86.57%                 71.8545                 848
GalaxyRefine  LOMETS       0.635  0.1444    0.0172  0.8522  0.38   -9.42          3.478       77.7         85.29%                 59.8837                 1008
GalaxyRefine  SWISS-MODEL  0.410  0.77195   0.0160  0.9525  0.38   -5.83          2.239       17.0         91.34%                 74.0377                 1115
GalaxyRefine  GalaxyWEB    ..     ..        ..      ..      ..     ..             ..          ..           ..                     ..                      ..
GalaxyRefine  Phyre2       ..     ..        ..      ..      ..     ..             ..          ..           ..                     ..                      ..

Table 4: Evaluation of 3-D structure models from selected servers for the RDRP catalytic domain (625-809) of the L protein. 30

Refinement    Server       RMSD   TM-score  GDT-TS  GDT-HA  QMEAN  QMEAN Z-score  MolProbity  Clash score  Ramachandran favoured  Overall quality factor  Aligned length
3Drefine      I-TASSER     0.158  0.94036   1.0000  1.0000  0.66   -2.01          2.678       15.29        90.16%                 96.0452                 184
3Drefine      LOMETS       0.149  0.81601   1.0000  1.0000  0.60   -1.91          2.367       43.09        96.17%                 67.7966                 183
3Drefine      SWISS-MODEL  0.159  0.94739   1.0000  1.0000  0.61   -2.00          2.212       11.65        94.44%                 86.7089                 181
3Drefine      GalaxyWEB    ..     ..        ..      ..      ..     ..             ..          ..           ..                     ..                      ..
3Drefine      Phyre2       0.159  0.95057   1.0000  1.0000  0.65   -1.72          2.080       28.49        95.03%                 80.7018                 182
ModRefiner    I-TASSER     0.539  0.9931    0.0181  0.0135  0.65   -2.43          2.09        36.50        97.81%                 84.8837                 184
ModRefiner    LOMETS       0.500  0.9907    0.0188  0.0137  0.61   -2.22          2.34        39.97        96.17%                 73.0994                 183
ModRefiner    SWISS-MODEL  0.958  0.9798    0.0177  0.0129  0.64   -2.35          1.95        28.95        99.44%                 82.8025                 181
ModRefiner    GalaxyWEB    0.397  0.9941    0.0186  0.0131  0.64   -2.94          2.18        30.94        96.72%                 74.4318                 166
ModRefiner    Phyre2       0.484  0.9916    0.0177  0.0127  0.63   -2.22          2.10        36.93        97.79%                 75                      182
GalaxyRefine  I-TASSER     0.410  0.93666   0.0181  0.9527  0.66   -1.14          1.516       8.8          97.81%                 95                      184
GalaxyRefine  LOMETS       0.405  0.80686   0.0183  0.9581  0.61   -1.91          1.642       13.6         98.91%                 80                      183
GalaxyRefine  SWISS-MODEL  0.435  0.94677   0.0179  0.9492  0.61   -2.69          1.524       6.9          97.22%                 97.7612                 181
GalaxyRefine  GalaxyWEB    0.323  0.76512   0.0186  0.9838  0.65   -1.97          1.611       12.6         98.36%                 86.0606                 167
GalaxyRefine  Phyre2       0.414  0.95157   0.0181  0.9563  0.66   -1.97          1.571       11.4         98.90%                 93.6709                 182

Table 5: Evaluation of 3-D structure models from selected servers for the Mononegavirus-type SAM-dependent 2'-O-MTase domain (1805-2003) of the L protein. 31

Refinement    Server       RMSD   TM-score  GDT-TS  GDT-HA  QMEAN  QMEAN Z-score  MolProbity  Clash score  Ramachandran favoured  Overall quality factor  Aligned length
3Drefine      I-TASSER     0.164  0.71961   1.0000  1.0000  0.54   -5.52          2.533       19.43        82.23%                 88.4817                 162
3Drefine      LOMETS       0.144  0.70206   1.0000  1.0000  0.47   -4.32          3.050       70.22        90.86%                 61.2565                 162
3Drefine      SWISS-MODEL  0.172  0.67937   1.0000  1.0000  0.38   -4.73          2.671       32.17        89.47%                 76.7956                 156
3Drefine      GalaxyWEB    ..     ..        ..      ..      ..     ..             ..          ..           ..                     ..                      ..
3Drefine      Phyre2       0.185  0.65253   1.0000  0.9975  0.43   -5.79          2.624       67.42        89.34%                 61.828                  161
ModRefiner    I-TASSER     0.388  0.9946    0.2097  0.1181  0.54   -5.69          3.12        65.83        89.34%                 82.8877                 162
ModRefiner    LOMETS       1.075  0.9841    0.1917  0.1069  0.49   -4.13          2.81        56.43        95.43%                 81.1518                 192
ModRefiner    SWISS-MODEL  0.521  0.9901    0.1514  0.0750  0.38   -4.46          2.75        63.07        95.79%                 66.2921                 156
ModRefiner    GalaxyWEB    0.430  0.9938    0.1736  0.0972  0.45   -4.70          2.52        59.87        96.45%                 68.254                  162
ModRefiner    Phyre2       0.752  0.9827    0.1792  0.0986  0.44   -5.91          2.54        56.74        95.94%                 74.0331                 161
GalaxyRefine  I-TASSER     0.386  0.72060   0.2042  0.9548  0.54   -5.25          2.365       17.3         88.32%                 75.9563                 162
GalaxyRefine  LOMETS       0.463  0.70486   0.1917  0.9372  0.51   -4.08          2.096       19.1         95.43%                 81.4208                 192
GalaxyRefine  SWISS-MODEL  0.446  0.68502   0.1514  0.9414  0.40   -4.17          2.162       17.6         93.68%                 80.814                  156
GalaxyRefine  GalaxyWEB    0.302  0.68684   0.1792  0.9862  0.45   -3.90          2.193       22.2         95.43%                 87.6344                 162
GalaxyRefine  Phyre2       0.451  0.65247   0.1750  0.9309  0.44   -4.62          2.003       13.0         94.42%                 72.9885                 161

The TM-score of the selected refined model was larger than 0.5, which generally indicates a model with the correct fold. For 3-D structure prediction, the best model was obtained from I-TASSER, which was ranked as the best method in the server section of the recent CASP15 (2022) experiment. The I-TASSER models had low RMSD values, which indicates high model accuracy. 32
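To make the two scores concrete, the sketch below computes RMSD and a TM-score-style value from per-residue distances between aligned residue pairs. It uses the standard TM-score form with d0 = 1.24 * (L - 15)^(1/3) - 1.8, where L is the target length. It is a simplification: the distances are assumed to come from a superposition that has already been done, whereas the real TM-score is maximized over superpositions, and the function names are hypothetical.

```python
import math


def rmsd(distances):
    """Root-mean-square deviation over aligned residue pairs (in angstroms)."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))


def tm_score(distances, target_length):
    """TM-score-style value for one fixed superposition.

    distances: distance (angstroms) for each aligned residue pair.
    target_length: number of residues in the target; unaligned residues
    contribute nothing, so partial coverage lowers the score.
    """
    if target_length < 19:
        # For very short chains this d0 formula is not positive; this
        # sketch simply rejects them.
        raise ValueError("target_length must be at least 19 in this sketch")
    d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


# Toy inputs (not from the study).
perfect = tm_score([0.0] * 100, 100)      # every residue aligned exactly
half_covered = tm_score([0.0] * 50, 100)  # only half of the target aligned
```

Unlike RMSD, which is dominated by the worst-fitting residues and ignores coverage, the TM-score is bounded between 0 and 1 and normalized by target length, which is why the two columns in the tables can rank the same models differently.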

33

Table 6: Motif analysis of the EBOV L protein using the Pfam server. 34

Domain                                                   Pfam             Position   Independent E-value  Pfam ID  Description
Domain 1 (1-392)                                         Mononeg_RNA_pol  10-391     5.9e-80              PF00946  Mononegavirales RNA dependent RNA polymerase
Domain 2 (393-803)                                       Mononeg_RNA_pol  1-410      3.2e-164             PF00946  Mononegavirales RNA dependent RNA polymerase
Domain 3 (804-2212)                                      Mononeg_RNA_pol  1-286      9.4e-92              PF00946  Mononegavirales RNA dependent RNA polymerase
Domain 3 (804-2212)                                      Mononeg_mRNAcap  302-554    1.6e-89              PF14318  Mononegavirales mRNA-capping region V
Domain 3 (804-2212)                                      FtsJ             1098-1204  0.038                PF01728  FtsJ-like methyltransferase
RDRP catalytic (625-809)                                 Mononeg_RNA_pol  3-185      6.7e-80              PF00946  Mononegavirales RNA dependent RNA polymerase
RDRP catalytic (625-809)                                 FBO_C            88-140     0.083                PF19270  F-box only protein C-terminal region
RDRP catalytic (625-809)                                 Ribosomal_L18p   102-179    0.14                 PF00861  Ribosomal L18 of archaea, bacteria, mitoch. and chloroplast
Mononegavirus-type SAM-dependent 2'-O-MTase (1805-2003)  FtsJ             85-198     0.0037               PF01728  FtsJ-like methyltransferase
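The positions in Table 6 appear to be numbered from the start of each domain sequence rather than from residue 1 of the full L protein (for example, Domain 2 spans 393-803 but its hit is listed as 1-410). Under that assumption, which the slides do not state explicitly, a hit can be mapped back to full-length coordinates as sketched here; `to_full_length` is a hypothetical helper name.

```python
def to_full_length(domain_start, rel_start, rel_end):
    """Map domain-relative positions (1-based) to full-length coordinates.

    Assumes each domain was submitted as its own sequence, so that
    position 1 of a hit corresponds to residue domain_start of the L protein.
    """
    offset = domain_start - 1
    return rel_start + offset, rel_end + offset


# Hits taken from Table 6, converted under the assumption above.
domain2_pol = to_full_length(393, 1, 410)       # Mononeg_RNA_pol in Domain 2
domain3_ftsj = to_full_length(804, 1098, 1204)  # FtsJ in Domain 3
mtase_ftsj = to_full_length(1805, 85, 198)      # FtsJ in the 2'-O-MTase domain
```

With this mapping the two FtsJ hits land at 1901-2007 and 1889-2002, overlapping each other and the UniProtKB 2'-O-MTase annotation (1805-2003), which is consistent with the domain-relative reading.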

Table 7: Post-translational modification site prediction using the Motif Scan server. 35
Category labels on the slide: RNA Associated, Protein Domain, Posttranslational Modifications. Matching positions are listed per signature; domains without a listed match (".." or blank on the slide) are omitted.

ASN_GLYCOSYLATION
  Domain 1 (1-392): 249-252 and 328-331
  Domain 2 (393-803): 133-136, 155-158 and 293-296
  Domain 3 (804-2212): 156-159, 261-264, 376-379, 446-449, 590-593, 634-637, 863-866, 976-979, 1092-1095, 1132-1135 and 1341-1344
  RDRP catalytic (625-809): 61-64
  Mononegavirus-type SAM-dependent 2'-O-MTase (1805-2003): 91-94 and 131-134
CK2_PHOSPHO_SITE
  Domain 1 (1-392): 107-110, 232-235 and 330-333
  Domain 2 (393-803): 95-98, 100-103, 159-162, 183-186, 205-208, 222-225, 239-242, 333-336, 364-367 and 372-375
  Domain 3 (804-2212): 39-42, 44-47, 158-161, 284-287, 383-386, 393-396, 431-434, 448-451, 495-498, 505-508, 562-565, 617-620, 636-639, 642-645, 810-813, 876-879, 946-949, 974-977, 980-983, 993-996, 1029-1032, 1059-1062 and 1163-1166
  RDRP catalytic (625-809): 7-10, 101-104, 132-135 and 140-143
  Mononegavirus-type SAM-dependent 2'-O-MTase (1805-2003): 28-31, 58-61 and 162-165
MYRISTYL
  Domain 1 (1-392): 30-35, 167-172 and 204-209
  Domain 2 (393-803): 32-37, 192-197 and 314-319
  Domain 3 (804-2212): 32-37, 155-160, 429-434, 501-506, 585-590, 660-665, 666-671, 732-737, 739-744, 859-864, 883-888, 1005-1010 and 1342-1347
  RDRP catalytic (625-809): 82-87
  Mononegavirus-type SAM-dependent 2'-O-MTase (1805-2003): 4-9
PKC_PHOSPHO_SITE
  Domain 1 (1-392): 159-161, 245-247 and 313-315
  Domain 2 (393-803): 39-41, 79-81, 120-122, 159-161, 205-207, 231-233 and 308-310
  Domain 3 (804-2212): 12-14, 65-67, 192-194, 237-239, 505-507, 680-682, 703-705, 710-712, 801-803, 824-826, 847-849, 876-878, 884-886, 946-948, 955-957, 1023-1025, 1071-1073, 1191-1193, 1207-1209, 1218-1220, 1274-1276 and 1316-1318
  RDRP catalytic (625-809): 76-78
  Mononegavirus-type SAM-dependent 2'-O-MTase (1805-2003): 22-24, 70-72 and 190-192
Paramyx_RNA_pol
  Domain 1 (1-392): 10-391
  Domain 2 (393-803): 1-410
  Domain 3 (804-2212): 1-397
  RDRP catalytic (625-809): 1-185
Phage_fiber
  Domain 1 (1-392): 220-233
tRNA-synt_1e
  Domain 1 (1-392): 372-391
RDRP_MONONEGAVIRALES
  Domain 2 (393-803): 233-363
  RDRP catalytic (625-809): 1-131
BIG1
  (no matching positions listed)
NHL
  Domain 2 (393-803): 350-360
  RDRP catalytic (625-809): 118-128
PAZ
  Domain 2 (393-803): 284-302
  RDRP catalytic (625-809): 52-70
RDRP_SSRNA_NEG_NONSEG
  Domain 2 (393-803): 233-410
  RDRP catalytic (625-809): 1-185
AMIDATION
  Domain 3 (804-2212): 237-240 and 599-602
  RDRP catalytic (625-809): 177-180
CAMP_PHOSPHO_SITE
  Domain 3 (804-2212): 272-275, 681-684 and 897-900
LEUCINE_ZIPPER
  Domain 3 (804-2212): 678-699
TYR_PHOSPHO_SITE
  Domain 3 (804-2212): 572-578, 682-690, 992-1000 and 1135-1141
  Mononegavirus-type SAM-dependent 2'-O-MTase (1805-2003): 134-140
PFTA
  Domain 3 (804-2212): 1343-1380
PPASE_TENSIN
  Domain 3 (804-2212): 678-1057
PMBR
  Domain 3 (804-2212): 1107-1120
  Mononegavirus-type SAM-dependent 2'-O-MTase (1805-2003): 106-119
FtsJ
  Domain 3 (804-2212): 1006-1206

The best predicted three-dimensional structure model, obtained using the I-TASSER server. The cartoon view shows known and predicted motifs: Mononeg_RNA_pol (10-391) is highlighted in blue, Mononeg_RNA_pol (1-410) in green, Mononeg_RNA_pol (1-286) in red, Mononeg_mRNAcap (302-555) in orange, FtsJ (1010-1204) in pink, Mononeg_RNA_pol (2-185) in yellow, Ribosomal_L18p (88-184) in white, FtsJ (6-199) in purple, and the rest of the predicted model in gray. 36

Identification, classification and analysis of domain architectures 37

Domain                                                   Region     Superfamily                                           Family                     E-value
Domain 3 (804-2212)                                      1157-1199  S-adenosyl-L-methionine-dependent methyltransferases  ..                         1.23e-03
Mononegavirus-type SAM-dependent 2'-O-MTase (1805-2003)  156-198    S-adenosyl-L-methionine-dependent methyltransferases  RNA methyltransferase FtsJ  0.080
Mononegavirus-type SAM-dependent 2'-O-MTase (1805-2003)  133-157    Translational machinery components                    ..                         0.078

Integrative map of predicted results for domain 1 (1-392) 38

Integrative map of predicted results for domain 2 (393-803) 39

Integrative map of predicted results for domain 3 (804-2212) 40

Integrative map of predicted results for RDRP catalytic 41

Integrative map of predicted results for Mononegavirus-type SAM-dependent 2'-O-MTase. 42

43

Conclusion 44

In this work, different bioinformatics servers (such as I-TASSER, Phyre2, GalaxyWEB, SWISS-MODEL and LOMETS) were used to predict and analyze the L protein and to discover protein binding motifs related to its biological functions. The study included the prediction of family conserved regions, secondary structure, solvent accessibility, tertiary structure, interaction motifs, post-translational modification sites and disordered regions. 45

Three domains (1-392), (393-803) and (804-2212) were predicted, and the best model was obtained from the I-TASSER server after refinement and energy minimization, judged by the C-score, RMSD and TM-score values. In addition, three conserved domains were identified with the NCBI Conserved Domain search. Moreover, two conserved domains are already annotated in UniProtKB: RdRp catalytic and Mononegavirus-type SAM-dependent 2'-O-MTase. 46

Secondary structure prediction of the L protein: Domain 1 is composed of 43% alpha helix and 12% beta strand, Domain 2 of 40% alpha helix and 10% beta strand, and Domain 3 of 37% alpha helix and 7% beta strand. RDRP catalytic is composed of 38% alpha helix and 18% beta strand, and Mononegavirus-type SAM-dependent 2'-O-MTase of 29% alpha helix and 21% beta strand. 3-D structure prediction: the best model was obtained from I-TASSER, which was ranked as the best method in the server section of the recent CASP15 (2022) experiment. Using the Motif Scan server, twenty post-translational modification site prediction signatures were retrieved across the L protein. Homology modeling has many applications in the drug discovery process. Because drugs interact with receptors that consist mainly of proteins, determining protein 3D structure, and thus homology modeling, is important in drug discovery. Accordingly, protein interactions have been clarified using 3D structures built by homology modeling, which contributes to the identification of novel drug candidates. Homology modeling plays an important role in making drug discovery faster, easier, cheaper, and more practical. 47

Future work 48

49

ACKNOWLEDGMENT 50

THANK YOU 51