Statistical modeling in pharmaceutical research and development

njoecreations99 1,264 views 29 slides Jul 19, 2024

About This Presentation

Descriptive vs mechanistic modeling
Statistical parameter estimation
Confidence region


Slide Content

STATISTICAL MODELING IN PHARMACEUTICAL RESEARCH AND
DEVELOPMENT
• DESCRIPTIVE VS MECHANISTIC MODEL
• STATISTICAL PARAMETER ESTIMATION
• CONFIDENCE REGIONS
submitted to;
DR. ROMA MATHEW
submitted by;
NEHA JOSHY
COMPUTER AIDED DRUG DEVELOPMENT
1

INTRODUCTION TO STATISTICAL
MODELING
INTRODUCTION
STATISTICAL ANALYSIS
STATISTICAL MODELING
APPLICATION
ADVANTAGES & DISADVANTAGES
2

The major challenge that the pharmaceutical industry is facing in the discovery and
development of new drugs is to reduce costs and time needed from discovery to
market, while at the same time raising standards of quality.
The standard way to discover new drugs is essentially by trial and error.
The development of models in the pharmaceutical industry is certainly one of the
significant breakthroughs proposed to face the challenges of cost, speed, and quality.

The concept of adapting just another new technology is known as modeling.
INTRODUCTION
3

Statistical analysis is the science of collecting, exploring and presenting large amounts of data to
discover underlying patterns and trends.
Statistics are applied every day in research and industry.
For example:
Manufacturers use statistics to weave threads into beautiful fabrics, to bring lift to
the airline industry and to help guitarists make beautiful music.
Researchers keep children healthy by using statistics to analyze data from the
production of viral vaccines, which ensures consistency and safety.
Communication companies use statistics to optimize network resources, improve
service and reduce customer churn by gaining greater insight into subscriber
requirements.
Statistical analysis
4

STATISTICAL MODELING
An introduction to statistical modeling is pivotal for any data analyst to make sense of
the data and make scientific predictions.
In its essence, statistical modeling is a process using statistical models to analyze a set
of data.
Statistical models are mathematical representations of the observed data.
Statistical modeling methods are a powerful tool in understanding the consolidated
data and making generalized predictions using this data.
A statistical model could be in the form of a mathematical equation or a visual
representation of the information.
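
As a small illustration of a statistical model written as a mathematical equation, the sketch below fits a straight line, response = intercept + slope × dose, by ordinary least squares. The dose and response values and the helper name `fit_line` are invented for illustration and do not come from the presentation.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = covariance of x and y divided by the variance of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical dose (mg) and response values, invented for illustration only.
dose = [10, 20, 30, 40, 50]
response = [12.1, 21.8, 32.3, 41.9, 52.0]

intercept, slope = fit_line(dose, response)

# The fitted equation is the model; it can be used to predict unseen doses.
predicted_at_35 = intercept + slope * 35
print(f"response = {intercept:.2f} + {slope:.3f} * dose")
print(f"predicted response at 35 mg: {predicted_at_35:.1f}")
```

The fitted equation summarizes the observed data and gives a generalized prediction for a dose that was never measured, which is the sense in which the slide calls a model a mathematical representation of the data.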
5

TYPES OF STATISTICAL MODELING
The different types of statistical models are essentially the statistical methods used for
computation.
Some of them are:
Linear regression
Logistic regression
Cluster analysis
Factor analysis
Analysis of variance (ANOVA)
Chi-squared test
Time series
Experimental design
6

Statistical modeling plays an important role in all types of data analysis, making it
relevant to various fields of science and industry. This especially holds in the data
analytics field, where analysts rely heavily on statistical methods and techniques to
interpret and draw conclusions from any given dataset.
Statistical modeling in pharmaceutical research and development
Statistical models are being introduced into the pharmaceutical industry to
determine the efficacy of drugs for particular individuals, ensuring that individuals
are given the right drugs for optimal response.
Statistical techniques are used to filter biomarkers from the data, using which
models are developed to predict the groups in which the drugs are most effective.
APPLICATIONS OF Statistical MODELING
7

Statistical modeling in R
Owing to the extensive usage of statistical modeling in data science, convenient tools
are embedded within the R programming language. R allows analysts to run various
statistical models and is built specifically for statistical analysis and data mining. It can
also enable the analyst to create software and applications that allow for reliable
statistical analysis. Its graphical capabilities are also useful for data clustering,
time-series analysis, linear modeling, etc.
Statistical modeling in Excel
Excel can be used conveniently for statistical analysis of basic data. It may not be ideal
for huge sets of data, where R and Python work seamlessly. Microsoft Excel provides
several add-in tools under the Data tab. Enabling the Data Analysis tool in Excel opens a
wide range of convenient statistical analysis options, including descriptive analysis,
ANOVA, average, regression, and sampling.
8

DISADVANTAGES:
• Risk of misinterpretation, dependence on assumptions, complexity, and the need for expertise.
• Improper statistical methods can produce erroneous conclusions, which may lead to unethical practices.
• The development of new statistical methods is often prioritized in research, creating a "publish or perish" culture that may bias studies towards favoring new methods over existing ones.
• This emphasis on novelty can hinder meaningful comparisons between different statistical approaches, making it challenging for end-users to determine the most suitable method for their research questions.
ADVANTAGES:
• Provides insights
• Supports decision-making
• Quantifies uncertainty
• Detects patterns
• Provides a structured approach for collecting, organizing, analyzing, and interpreting data
ADVANTAGES & DISADVANTAGES
9

DESCRIPTIVE VS MECHANISTIC MODEL
DESCRIPTIVE MODEL
MECHANISTIC MODEL
PURPOSE OF MODELING
DESCRIPTIVE VS MECHANISTIC MODELING
10

DESCRIPTIVE MODEL
In this type of model, the purpose is to provide a reasonable description of the data in some
appropriate way, without any attempt at understanding the underlying phenomenon, that is, the
data-generating mechanism. The family of models is therefore selected based on its adequacy to
represent the data structure.
In this instance, the order of the model is chosen based on its ability to describe the data
arrangement.
This type of model is very useful for discriminating between alternative hypotheses but cannot be
used for capturing the fundamental characteristics of a mechanism.
11

MECHANISTIC MODEL
Mechanistic models are incredibly precise and are structured to mirror equivalent variables in the
real system they’re representing.
The dynamics of the model’s behavior are also informed by the mechanism of the real system, with
inputs and outputs related by mathematical equations.
The models are useful for running experiments, making unique predictions, and closely inspecting
the inner workings of a mechanism.
12

The purpose of modeling is to translate the known properties, as well as some new hypotheses, into a
mathematical representation.
The family of models is selected depending on the main purpose of the exercise.
If the purpose is just to provide a reasonable description of the data without any
attempt at understanding the underlying phenomenon, that is, the data-generating
mechanism, then the family of models is selected based on its adequacy to represent
the data structure.
Purpose of modeling
13

MECHANISTIC/MATHEMATICAL MODELING | DESCRIPTIVE/STATISTICAL MODELING
Study of concepts (in space & time): quantity (discrete & continuous), structure (geometric figures), patterns | To decide on a suitable course of action, it deals with (in space & time) the collection and analysis of data and extracting information
Mechanistic or perspective models (primary causation) of how the system changes | Descriptive or phenomenological models (primary correlation; traditional approach)
Some precision (limited data are used) | Precise (exact data are used)
Realism: explicitly considers the processes that produce given observations or changes in the system | Little realism: makes no claim about the nature of the underlying mechanisms that produce the behavior of the system
Studies the dynamics of interacting populations using probabilistic models | Attempts to estimate the probabilistic future behavior of a system based on its past behavior
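
The contrast between the two approaches can be sketched on a toy data set. The example below is an illustration under stated assumptions, not a method from the presentation: the concentration values are generated from a first-order elimination law, C(t) = C0·exp(−k·t), which stands in for a "known mechanism", and the helper `least_squares` is a hypothetical name. The descriptive model is a straight line chosen only because it roughly follows the points; the mechanistic model uses the assumed elimination law.

```python
import math


def least_squares(x, y):
    """Return (intercept, slope) of the ordinary least-squares line through (x, y)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope


# Hypothetical plasma concentrations generated from C(t) = 100 * exp(-0.3 t);
# the numbers are invented for illustration only.
times = [0, 1, 2, 4, 6]
conc = [100 * math.exp(-0.3 * t) for t in times]

# Descriptive model: a straight line fitted directly to concentration vs time.
line_intercept, line_slope = least_squares(times, conc)

# Mechanistic model: first-order elimination, C(t) = C0 * exp(-k t).
# Taking logs gives ln C = ln C0 - k t, so C0 and k come from a linear fit.
log_intercept, log_slope = least_squares(times, [math.log(c) for c in conc])
c0, k = math.exp(log_intercept), -log_slope

# Predict at a time point outside the observed range.
t_new = 12
line_prediction = line_intercept + line_slope * t_new
mechanistic_prediction = c0 * math.exp(-k * t_new)

print(f"descriptive (line) prediction at t={t_new}: {line_prediction:.1f}")
print(f"mechanistic prediction at t={t_new}: {mechanistic_prediction:.2f}")
print(f"estimated elimination rate constant k: {k:.3f}")
```

Within the observed range both models describe the data to some degree, but outside it the straight line predicts a negative concentration, while the mechanistic model stays physically meaningful and yields an interpretable parameter (k).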
14

STATISTICAL PARAMETER ESTIMATION
STATISTICAL PARAMETERS
MEASURES OF CENTRAL TENDENCY
VARIANCE
STANDARD DEVIATION
15

The various statistical parameters are,
1. Measures of central tendency
2. Dispersion (also called Variability, Scatter, Spread)
3. Coefficient of Dispersion (COD)
4. Variance
5. Standard Deviation (SD) σ
6. Residual
7. Factor analysis
8. Absolute Error (AE)
9. Mean Absolute Error (MAE)
10. Percentage Error of Estimate (PE)
Statistical parameters
16

Measures of central tendency are usually called averages.
They give us an idea about the concentration of the values in the central part of the
distribution.
The following are the three measures of central tendency that are in common use:
(i) Arithmetic mean,
(ii) Median,
(iii) Mode
measures of central tendency
17

MEAN: the average of the data
MEDIAN: the middle value of the data
MODE: the most commonly occurring value
measures of central tendency
Mean (Average)
The mean locates the centre of the distribution.
It is also known as the arithmetic mean and is the most common measure of central tendency.
The mean is simply the sum of the values divided by the total number of items in the set.
It is affected by extreme values.
$\bar{X} = \dfrac{\sum X}{n}$
18

Median:
• The median is determined by sorting the data
set from lowest to highest values and taking
the data point in the middle of the sequence.
• Middle value in the ordered sequence:
  if n is odd, the middle value of the sequence;
  if n is even, the average of the two middle values.
• Not Affected by Extreme Values
measures of central tendency
Mode:
• Measure of Central Tendency
• The mode is the most frequently occurring
value in the data set.
• May Be No Mode or Several Modes
• Mode is readily comprehensible and easy to
calculate.
• Mode is not at all affected by extreme values.
• Mode can be conveniently located even if the
frequency distribution has class intervals of
unequal magnitude
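
The three measures can be computed with Python's standard `statistics` module. The sketch below uses invented values with one deliberate outlier to show that the mean is affected by extreme values while the median and mode are not.

```python
import statistics

# Hypothetical measurements; the last value is a deliberate outlier.
values = [2, 3, 3, 5, 7, 100]

mean = statistics.mean(values)      # sum of the values / number of values
median = statistics.median(values)  # even n: average of the two middle values
mode = statistics.mode(values)      # most frequently occurring value

# Dropping the outlier changes the mean a lot but the median only slightly.
trimmed = values[:-1]
mean_without_outlier = statistics.mean(trimmed)
median_without_outlier = statistics.median(trimmed)

# A data set may have several modes; multimode returns all of them.
modes = statistics.multimode([1, 1, 2, 2, 3])

print("with outlier:    mean =", mean, " median =", median, " mode =", mode)
print("without outlier: mean =", mean_without_outlier,
      " median =", median_without_outlier)
print("modes of [1, 1, 2, 2, 3]:", modes)
```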
19

Variance:
• It is the expectation of the squared deviation of a random variable from its mean; informally,
it measures how far a set of numbers is spread out from the mean.
• It is calculated by taking the differences between each number in the set and the
mean, squaring the differences (to make them positive) and dividing the sum of the
squares by the number of values in the set.
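
The calculation described above can be followed step by step in a short sketch. The values are invented for illustration. Dividing by the number of values gives the population variance; as an addition not stated on the slide, the sample variance divides by n − 1 instead, and both are shown for comparison with the standard `statistics` functions.

```python
import statistics

# Hypothetical data set, invented for illustration.
values = [4, 8, 6, 5, 7]
n = len(values)

# Step 1: the mean of the set.
mean = sum(values) / n

# Step 2: squared differences between each value and the mean.
squared_deviations = [(v - mean) ** 2 for v in values]

# Step 3: divide the sum of squares by n (population variance) ...
population_variance = sum(squared_deviations) / n
# ... or by n - 1 (sample variance, used when the data are a sample).
sample_variance = sum(squared_deviations) / (n - 1)

print("population variance:", population_variance,
      "| statistics.pvariance:", statistics.pvariance(values))
print("sample variance:", sample_variance,
      "| statistics.variance:", statistics.variance(values))
```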
20

Standard Deviation (SD) : σ
It is a measure used to quantify the amount of variation or dispersion of a set of data
values.
It is a number that tells how the measurements for a group are spread out from the average
(mean) or expected value.
A low standard deviation means most of the numbers are very close to the average while
a high value indicates the data to be spread out.
The SD provides the user with a numerical measure of the scatter of the data.
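
A brief sketch of low versus high standard deviation, using two invented data sets that share the same mean. The standard deviation is the square root of the variance; `statistics.pstdev` computes the population form.

```python
import statistics

# Two hypothetical data sets with the same mean (50) but different spread.
tight = [49, 50, 51, 50, 50]
wide = [10, 90, 50, 30, 70]

# Population standard deviation = square root of the population variance.
sd_tight = statistics.pstdev(tight)
sd_wide = statistics.pstdev(wide)

print("means:", statistics.mean(tight), statistics.mean(wide))
print(f"SD of tight set: {sd_tight:.2f}")
print(f"SD of wide set:  {sd_wide:.2f}")
```

Both sets have the same average, so only the standard deviation reveals that the second set is far more scattered.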
21
Dispersion:
It is the extent to which the data are scattered around the mean (average) value of the data.

Residuals:
A residual is the difference between the observed value of the dependent variable (y) and the
predicted value (y').
Each data point has one residual.
$R = y - y'$ (observed Y value minus predicted Y value)
Absolute Error (AE):
It is the magnitude of the difference between the exact value and the approximation.
The relative error is the absolute error divided by the magnitude of the exact value.
$AE = |X_{\text{measured}} - X_{\text{actual}}|$
Mean Absolute Error (MAE):
It is a quantity used to measure how close forecasts or predictions are to the eventual
outcomes.
It is the average of the absolute errors.
The simplest measure of forecast accuracy is MAE.
With MAE alone, the relative size of the error is not always obvious.
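
The definitions above can be written out directly. The observed and predicted values below are invented for illustration, and the variable names are assumptions of this sketch.

```python
# Hypothetical observed values and model predictions, invented for illustration.
observed = [3.0, 5.0, 7.5, 9.0]
predicted = [2.5, 5.5, 7.0, 10.0]

# Residual: observed value minus predicted value (one per data point).
residuals = [y - y_hat for y, y_hat in zip(observed, predicted)]

# Absolute error: magnitude of each difference, so the sign is dropped.
absolute_errors = [abs(r) for r in residuals]

# Relative error: absolute error divided by the magnitude of the exact value.
relative_errors = [ae / abs(y) for ae, y in zip(absolute_errors, observed)]

# Mean absolute error: the average of the absolute errors.
mae = sum(absolute_errors) / len(absolute_errors)

print("residuals:", residuals)
print("absolute errors:", absolute_errors)
print("MAE:", mae)
```

Residuals keep their sign, so positive and negative values can cancel when averaged; MAE avoids this by averaging magnitudes.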
22

CONFIDENCE REGIONS
INTRODUCTION
CONFIDENCE INTERVALS
GRAPH OF CONFIDENCE REGION
23

A confidence interval is a range of values, calculated from observed data, that is likely to
contain the true value of an unknown parameter.
It is linked to a confidence level, which describes how often intervals constructed in the same
way would contain the fixed (deterministic) parameter.
For a mean, the interval is a pair of values around the sample mean between which the parameter
is expected to lie at the stated confidence level.
Confidence intervals show the degree of uncertainty or certainty in a sampling method.
They are most commonly constructed using confidence levels of 95% or 99%.
Confidence intervals
24

A 95% confidence interval is a range constructed so that you can be 95% confident it contains the
parameter being estimated: about 95% of similarly constructed intervals will contain that parameter.
Statisticians use confidence intervals to measure the uncertainty in an estimate obtained from a sample.
Confidence Interval Formula
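The formula to find the confidence interval is:
$CI = \bar{X} \pm Z \dfrac{S}{\sqrt{n}}$
$\bar{X}$ is the sample mean.
Z is the critical value for the chosen confidence level, that is, the number of standard errors on either side of the sample mean (about 1.96 for 95%).
S is the standard deviation in the sample.
n is the size of the sample.
The value after the ± symbol is known as the margin of error.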
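
A minimal sketch of a confidence interval for a mean, assuming the usual normal-approximation form: sample mean ± Z·S/√n. The sample values, the function names `confidence_interval` and `coverage`, and the simulation settings are assumptions made for illustration. For small samples a t critical value is normally used in place of Z; that refinement is left out here to stay with the Z-based form.

```python
import math
import random
import statistics
from statistics import NormalDist


def confidence_interval(sample, level=0.95):
    """Return (lower, upper) for the mean using mean +/- Z * S / sqrt(n)."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)               # sample standard deviation
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% level
    margin = z * s / math.sqrt(n)              # margin of error
    return mean - margin, mean + margin


def coverage(true_mean, true_sd, n, trials, level=0.95, seed=0):
    """Fraction of simulated intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        low, high = confidence_interval(sample, level)
        if low <= true_mean <= high:
            hits += 1
    return hits / trials


# Hypothetical assay results, invented for illustration only.
sample = [98.2, 101.5, 99.8, 100.4, 102.1, 97.9, 100.0, 99.3, 101.2, 100.6]

low95, high95 = confidence_interval(sample, 0.95)
low99, high99 = confidence_interval(sample, 0.99)
print(f"95% CI: ({low95:.2f}, {high95:.2f})")
print(f"99% CI: ({low99:.2f}, {high99:.2f})")

# Repeated sampling: roughly 95% of the intervals should contain the true mean.
print("simulated coverage:", coverage(true_mean=50, true_sd=5, n=30, trials=2000))
```

The simulation illustrates what the confidence level refers to: the proportion of similarly constructed intervals that contain the parameter, not a statement about any single interval. A higher level (99%) gives a wider interval than a lower one (95%).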
Confidence intervals
25

Confidence intervals
26

Previous year questions
1. Describe various statistical modeling approaches in pharmaceutical research and development (Jan 2020) 10
2. Descriptive vs mechanistic modeling (Aug 2018) (May 2022) 5
3. Explain statistical modeling in pharmaceutical research and development (Sep 2022) 5
4. State the statistical parameters and non-linearity in pharmaceutical research (Dec 2023) 5
27

Thank you