The ASTA team
Usual setup:
Assume that the mean is shifted by \(c\times\sigma\).
What is the probability of NOT getting an immediate alarm?
This is also called a type II error.
library(qcc)
data(pistonrings)
diam <- pistonrings$diameter
phaseI <- matrix(diam[1:125],25,byrow=TRUE)
phaseII <- matrix(diam[126:200],15,byrow=TRUE)
xbcc <- qcc(phaseI, std.dev = "UWAVE-SD", type = "xbar", plot = FALSE,
newdata = phaseII, title = "qq chart: pistonrings")
oc.curves(xbcc)
For the actual sample size (n=5) and in case of a process shift c=2, the error probability is around 7%.
If we increase sample size to n=10 then we are almost sure to immediate detection of a shift by \(2\sigma\).
The CUSUM chart is generally better than the xbar chart for detecting small shifts in the mean of a process.
Consider the standardized residuals \[z_i=\frac{\sqrt{n}(x_i-\hat{\mu}_0)}{\hat{\sigma}}\] where
For an in-control proces the cumulative sum (CUSUM) of residuals should vary around \(zero\).
The CUSUM chart is developed to see, if there is a drift away from zero. To that end we define 2 processes controlling for downward(D) respectively upward(U) drift: \begin{align*} D(i)&=\max\{0,D(i-1)-k-z_i\}\\ U(i)&=\max\{0,U(i-1)+z_i-k\} \end{align*}where \(k\) is a positive number.
The process is considered out of control if \(D\) or \(U\) exceeds a limit \(h\).
In qcc
\(k\) and \(h\) is specified by the arguments:
se.shift
is \(k\), which has a default value of 1.decision.interval
is \(h\), which has a default value of 5.h <- cusum(phaseI, newdata = phaseII, title = "CUSUM chart: pistonrings")
The chart includes a plot of
The Exponentially Weighted Moving Average (EWMA) is a statistic for monitoring the process, which averages the data in a way that gives most weight to recent data.
The EWMA is formally defined by \[M_t=\lambda x_t+(1-\lambda)M_{t-1},\ \ t=1,2,\ldots,T\] where
The influence of \(x_t\) on \(M_{t+s}\) is of the order \((1-\lambda)^s\), i.e. exponentially decreasing, which explains the term “Exponentially Weighted”.
Estimated variance for the EWMA process
\[ \begin{aligned} \text{UCL:} \quad &M_0+k s_M\\ \text{CL :} \quad &M_0\\ \text{LCL:} \quad &M_0-k s_M \end{aligned} \]
Conventional choise of \(k\) is 3.
h <- ewma(phaseI, newdata = phaseII, title = "EWMA chart: pistonrings")
lambda
.nsigmas
.An outlier would deviate from the line and/or the x-mean, so we calculate
We expect that both \(Zy_x\) and \(Zx\) should be within the limits \(\pm2\). In order to get an overall teststatistic, we calculate \[T^2=\frac{m-1}{m-2}Zy_x^2+Zx^2\] where \(m\) is sample size.
Alternatively, one might interchange the role of \(x\) and \(y\). But that does not matter. Actually it holds that \[T^2=\frac{m-1}{m-2}Zy_x^2+Zx^2=\frac{m-1}{m-2}Zx_y^2+Zy^2\]
load(url("https://asta.math.aau.dk/datasets?file=T2Example.RData"))
head(X$X1, 3)
## [,1] [,2] [,3] [,4]
## [1,] 72 84 79 49
## [2,] 56 87 33 42
## [3,] 55 73 22 60
head(X$X2, 3)
## [,1] [,2] [,3] [,4]
## [1,] 23 30 28 10
## [2,] 14 31 8 9
## [3,] 13 22 6 16
X
is a listX$X1
is a matrix, where the rows are samples of variable X1
Similarly X$X2
has samples for variable X2
h <- mqcc(X, type = "T2")
When the parameters are estimated from phase I samples, then the reference distribution is not chi-square, but rather a scaled F-distribution.
ellipseChart(h, show.id = TRUE)
Observation 10 deviates a lot from the regression line.
The acceptance area is an ellipse.
Set-up:
We term this a \((N,n,c)\) sampling plan. Possible reasons for acceptance sampling:
The correct sampling distribution of \(X\) is the so called hypergeometric distribution with parameters \((N,n,p)\).
If \(N>>n\), then the sampling distribution of \(X\) is well approximated by the simpler binomial distribution with parameters \((n,p)\).
If \(N>>n\) and \(p\) is smal, then the sampling distribution of \(X\) is well approximated by the much simpler poisson distribution with parameter \(np\).
For a given sampling plan \((N,n,c)\) the probability of accepting the lot depends on
We can use the function OC2c
in the package AcceptanceSampling
to determine these probabilities.
Sampling plan: \((1000,100,2)\)
library(AcceptanceSampling)
OCbin <- OC2c(100, 2) #default binomial
OCpoi <- OC2c(100, 2, type = "poisson")
OChyp <- OC2c(100, 2, type = "hypergeom", N = 1000)
par(mfrow=c(2,2)) #division of plot window
plot(1:10, type = "n", axes = FALSE, xlab = "", ylab = "")
text(4, 5, "(N,n,c)\n(1000,100,2)", cex = 2)
xl <- c(0, 0.1)
plot(OChyp, xlim = xl, main = "hyper")
plot(OCbin, xlim = xl, main = "binom")
plot(OCpoi, xlim = xl, main = "poiss")
Suppose that \(N\) is fixed. If we specify 2 points on the OC curne, then this determines \((n,c)\).
PRP: Producer Risk Point with coordinates (p1,q1): p1 is a fraction of defectives, that the producer finds acceptable, e.g. p1=0.01. The corresponding probability q1 of accept should then be high, e.g. q1=0.95.
CRP: Consumer Risk Point with coordinates (p2,q2): p2\(>\)p1 is a fraction of defectives, that the consumer finds unacceptable, e.g. p2=0.05. The corresponding probability q2 of accept should then be low,e.g. q2=0.01.
find.plan(PRP = c(.01,.95), CRP = c(.05,.01))[1:2]
## $n
## [1] 259
##
## $c
## [1] 5
Default assumption is binomial sampling.
plan <- find.plan(c(.01,.95), c(.05,.01), type = "hyp", N = 200)[1:2]
plan
## $n
## [1] 121
##
## $c
## [1] 2
OChyp <- OC2c(plan$n, plan$c, type = "hyp", N = 200, pd = c(.01,.05))
attr(OChyp, "paccept")
## [1] 1.000000000 0.009609746
We cannot have an exact match of the required values \((0.95,0.01)\) since \(n\) and \(c\) must be integers.
Let \(0\leq c_1 < r_1\) be integers.
Furthermore, \(c_2\) is an integer such that \(c_1<c_2\).
This is known as a double sampling plan.
Determining the OC curve of a double sampling plan requires input of \(n=c(n_1,n_2)\), \(c=c(c_1,c_2)\) and \(r=c(r_1,r_2)\), where \(r_2=c_2+1\).
x <- OC2c(c(125,125), c(1,4), c(4,5), pd = seq(0,0.1,0.001))
x
## Acceptance Sampling Plan (binomial)
##
## Sample 1 Sample 2
## Sample size(s) 125 125
## Acc. Number(s) 1 4
## Rej. Number(s) 4 5
plot(x)