The ASTA team
Usual setup:
\(m\) samples with sample size \(n\).
Process mean \(\mu\) and standard deviation \(\sigma\).
Sample means have standard deviation \(\frac{\sigma}{\sqrt{n}}\).
By default we use the \(3\sigma\) rule.
Assume that the mean is shifted by \(c\times\sigma\).
What is the probability of NOT getting an immediate alarm?
This is also called a type II error.
For the current sample size (\(n=5\)) and a process shift of \(c=2\), the probability of missing an immediate alarm is around 7%.
If we increase the sample size to \(n=10\), then we are almost certain to detect a \(2\sigma\) shift immediately.
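These probabilities can be checked directly from the normal distribution: under a shift of \(c\sigma\), the standardized sample mean is centered at \(c\sqrt{n}\), and "no immediate alarm" means it falls inside the \(\pm 3\) limits. A minimal Python sketch (assuming scipy is available; the function name is ours):

```python
from math import sqrt
from scipy.stats import norm

def beta_no_alarm(c, n, L=3):
    """P(no immediate alarm) when the mean shifts by c*sigma:
    on the standardized scale the sample mean is N(c*sqrt(n), 1),
    and no alarm means it stays inside +/- L."""
    shift = c * sqrt(n)  # the shift measured in sigma/sqrt(n) units
    return norm.cdf(L - shift) - norm.cdf(-L - shift)

beta5 = beta_no_alarm(c=2, n=5)    # about 0.07, the ~7% quoted above
beta10 = beta_no_alarm(c=2, n=10)  # well below 0.1%
```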
The CUSUM chart is generally better than the xbar chart for detecting small shifts in the mean of a process.
Consider the standardized residuals \[z_i=\frac{\sqrt{n}(x_i-\hat{\mu}_0)}{\hat{\sigma}}\] where
\(\hat{\mu}_0\) is an estimate of the process mean.
\(\hat{\sigma}\) is an estimate of the process standard deviation.
For an in-control process the cumulative sum (CUSUM) of residuals should vary around zero.
The CUSUM chart is designed to detect a drift away from zero. To that end we define two processes controlling for downward (\(D\)) and upward (\(U\)) drift, respectively: \[\begin{align*} D(i)&=\max\{0,D(i-1)-k-z_i\}\\ U(i)&=\max\{0,U(i-1)+z_i-k\} \end{align*}\] where \(k\) is a positive number.
Interpretation of \[\begin{align*} D(i)&=\max\{0,D(i-1)-k-z_i\}\\ U(i)&=\max\{0,U(i-1)+z_i-k\} \end{align*}\]
If \(z_i< -k\), i.e. \(z_i\) is more than \(k\) below zero, then \(D\) is increased. If this happens a number of times in a row, then \(D\) grows “big”.
If \(z_i>k\), i.e. \(z_i\) is more than \(k\) above zero, then \(U\) is increased. If this happens a number of times in a row, then \(U\) grows “big”.
The process is considered out of control if \(D\) or \(U\) exceeds a limit \(h\).
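The recursions above are easy to implement directly. A hedged Python sketch (illustrative function name and data; \(k=1\) and \(h=5\) as in the slides):

```python
def cusum(z, k=1.0, h=5.0):
    """Run the D/U recursions on standardized residuals z and
    return both paths plus the index of the first alarm (or None)."""
    D = U = 0.0
    D_path, U_path, alarm = [], [], None
    for i, zi in enumerate(z):
        D = max(0.0, D - k - zi)   # grows when z_i < -k (downward drift)
        U = max(0.0, U + zi - k)   # grows when z_i >  k (upward drift)
        D_path.append(D)
        U_path.append(U)
        if alarm is None and (D > h or U > h):
            alarm = i
    return D_path, U_path, alarm

# Three in-control-looking residuals keep both sums at zero; a
# sustained upward shift then makes U grow until it crosses h.
D_path, U_path, alarm = cusum([0.1, -0.2, 0.0] + [2.0] * 10)
```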
In qcc
\(k\) and \(h\) are specified by the arguments:
se.shift
is \(k\), which has a default value of 1.
decision.interval
is \(h\), which has a default value of 5.
The chart includes a plot of the cumulative sums \(U(i)\) and \(-D(i)\) together with the decision boundaries \(\pm h\).
The Exponentially Weighted Moving Average (EWMA) is a statistic for monitoring the process, which averages the data in a way that gives most weight to recent data.
The EWMA is formally defined by \[M_t=\lambda x_t+(1-\lambda)M_{t-1},\ \ t=1,2,\ldots,T\] where
\(M_0\) is the mean of some historical data.
\(x_t\) is the measurement at time \(t\).
\(T\) is the length of the sampling period.
\(0 < \lambda \leq 1\) is a smoothing parameter, where \(\lambda=1\) corresponds to “no memory”.
The influence of \(x_t\) on \(M_{t+s}\) is of the order \((1-\lambda)^s\), i.e. exponentially decreasing, which explains the term “Exponentially Weighted”.
The estimated variance of the EWMA process is \[s_M^2=\frac{\lambda}{2-\lambda}\left(1-(1-\lambda)^{2t}\right)\hat{\sigma}^2,\] which yields the control limits
\[ \begin{aligned} \text{UCL:} \quad &M_0+k s_M\\ \text{CL :} \quad &M_0\\ \text{LCL:} \quad &M_0-k s_M \end{aligned} \]
The conventional choice of \(k\) is 3.
In qcc, \(\lambda\) and \(k\) are specified by the arguments:
lambda
is \(\lambda\), which has a default value of 0.2.
nsigmas
is \(k\), which has a default value of 3.
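The EWMA recursion and its control limits can be computed in a few lines. A sketch in Python (function name and the constant-data example are ours; the variance expression is the standard one \(\mathrm{Var}(M_t)=\sigma^2\,\frac{\lambda}{2-\lambda}\,(1-(1-\lambda)^{2t})\), assuming a known process standard deviation):

```python
from math import sqrt

def ewma(x, lam=0.2, M0=0.0, sigma=1.0, k=3):
    """EWMA recursion M_t = lam*x_t + (1-lam)*M_{t-1} with
    time-varying control limits M0 +/- k*s_M, where
    s_M^2 = sigma^2 * lam/(2-lam) * (1 - (1-lam)^(2t))."""
    M, out = M0, []
    for t, xt in enumerate(x, start=1):
        M = lam * xt + (1 - lam) * M
        sM = sigma * sqrt(lam / (2 - lam) * (1 - (1 - lam) ** (2 * t)))
        out.append((M, M0 - k * sM, M0 + k * sM))  # (M_t, LCL, UCL)
    return out

# With constant input 1.0 the statistic climbs geometrically
# toward 1.0 while the limits widen toward their asymptote.
path = ewma([1.0] * 50)
```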
An outlier would deviate from the line and/or the \(x\)-mean, so we calculate
We expect both \(Zy_x\) and \(Zx\) to lie within the limits \(\pm2\). In order to get an overall test statistic, we calculate \[T^2=\frac{m-1}{m-2}Zy_x^2+Zx^2\] where \(m\) is the sample size.
Alternatively, one might interchange the roles of \(x\) and \(y\), but that does not matter. In fact, \[T^2=\frac{m-1}{m-2}Zy_x^2+Zx^2=\frac{m-1}{m-2}Zx_y^2+Zy^2\]
## [,1] [,2] [,3] [,4]
## [1,] 72 84 79 49
## [2,] 56 87 33 42
## [3,] 55 73 22 60
## [,1] [,2] [,3] [,4]
## [1,] 23 30 28 10
## [2,] 14 31 8 9
## [3,] 13 22 6 16
X
is a list.
X$X1
is a matrix, where the rows are samples of variable X1.
Similarly, X$X2
has the samples for variable X2.
When the parameters are estimated from phase I samples, then the reference distribution is not chi-square, but rather a scaled F-distribution.
Observation 10 deviates a lot from the regression line.
The acceptance area is an ellipse.
Set-up:
Production process where we produce lots of size \(N\).
From the lot we take a sample of size \(n\).
If the number of defective items exceeds the number \(c\), then the lot is rejected.
We term this a \((N,n,c)\) sampling plan. Possible reasons for acceptance sampling:
Testing is destructive
The cost of 100% inspection is very high
100% inspection takes too long
Let \(X\) denote the number of defective items in the sample.
Let \(p\) denote the fraction of defective items in the lot.
The correct sampling distribution of \(X\) is the so-called hypergeometric distribution with parameters \((N,n,p)\).
If \(N\gg n\), then the sampling distribution of \(X\) is well approximated by the simpler binomial distribution with parameters \((n,p)\).
If \(N\gg n\) and \(p\) is small, then the sampling distribution of \(X\) is well approximated by the much simpler Poisson distribution with parameter \(np\).
For a given sampling plan \((N,n,c)\) the probability of accepting the lot depends on
the fraction \(p\) of defective items in the lot
the assumed sampling distribution
We can use the function OC2c
in the package AcceptanceSampling
to determine these probabilities.
Sampling plan: \((1000,100,2)\)
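The acceptance probability \(P(X\leq c)\) under all three sampling distributions can be computed directly for this plan. An illustrative Python sketch (scipy assumed; the value \(p=0.02\) is our example choice, and scipy's hypergeometric is parameterized by the number of defectives \(Np\) rather than \(p\)):

```python
from scipy.stats import hypergeom, binom, poisson

N, n, c = 1000, 100, 2  # the (N, n, c) sampling plan above
p = 0.02                # fraction defective (illustrative choice)
D = int(N * p)          # number of defective items in the lot

p_hyper = hypergeom(N, D, n).cdf(c)  # exact distribution
p_binom = binom(n, p).cdf(c)         # N >> n approximation
p_pois  = poisson(n * p).cdf(c)      # additionally p small

# All three give roughly the same acceptance probability
# here (about 0.68), illustrating the approximations above.
```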
Suppose that \(N\) is fixed. If we specify two points on the OC curve, then this determines \((n,c)\).
PRP: Producer Risk Point with coordinates \((p_1,q_1)\): \(p_1\) is a fraction of defectives that the producer finds acceptable, e.g. \(p_1=0.01\). The corresponding acceptance probability \(q_1\) should then be high, e.g. \(q_1=0.95\).
CRP: Consumer Risk Point with coordinates \((p_2,q_2)\): \(p_2>p_1\) is a fraction of defectives that the consumer finds unacceptable, e.g. \(p_2=0.05\). The corresponding acceptance probability \(q_2\) should then be low, e.g. \(q_2=0.01\).
## $n
## [1] 259
##
## $c
## [1] 5
Default assumption is binomial sampling.
## $n
## [1] 121
##
## $c
## [1] 2
## [1] 1.000000000 0.009609746
We cannot have an exact match of the required values \((0.95,0.01)\) since \(n\) and \(c\) must be integers.
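The search that find.plan performs can be mirrored by brute force: for each \(c\), take the smallest \(n\) meeting the consumer risk point and check the producer risk point. A Python sketch under binomial sampling (function name is ours; boundary tie-breaking may shift \(n\) by a unit relative to the R output):

```python
from scipy.stats import binom

def find_plan(p1, q1, p2, q2, max_c=50):
    """Smallest binomial single-sampling plan (n, c) with
    P(accept | p1) >= q1 and P(accept | p2) <= q2."""
    for c in range(max_c + 1):
        # P(X <= c) decreases in n, so grow n until the CRP holds ...
        n = c + 1
        while binom.cdf(c, n, p2) > q2:
            n += 1
        # ... then check whether the PRP still holds at that n.
        if binom.cdf(c, n, p1) >= q1:
            return n, c
    return None

n, c = find_plan(p1=0.01, q1=0.95, p2=0.05, q2=0.01)
```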
Let \(0\leq c_1 < r_1\) be integers.
Furthermore, \(c_2\) is an integer such that \(c_1<c_2\).
\(x_1\): number of defectives in an initial sample of size \(n_1\).
If \(x_1\leq c_1\) accept the lot.
If \(r_1\leq x_1\) reject the lot.
If \(c_1< x_1<r_1\): Take a second sample of size \(n_2\) and let \(x_2\) be the number of defectives.
If \(x_1+x_2\leq c_2\) accept the lot. Otherwise reject.
This is known as a double sampling plan.
Determining the OC curve of a double sampling plan requires input of \(n=c(n_1,n_2)\), \(c=c(c_1,c_2)\) and \(r=c(r_1,r_2)\), where \(r_2=c_2+1\).
## Acceptance Sampling Plan (binomial)
##
## Sample 1 Sample 2
## Sample size(s) 125 125
## Acc. Number(s) 1 4
## Rej. Number(s) 4 5
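The acceptance probability of a double sampling plan follows by conditioning on \(x_1\): accept outright if \(x_1\leq c_1\), and for each intermediate value weight the chance that \(x_1+x_2\leq c_2\). A Python sketch for the plan shown above, under binomial sampling (function name is ours; the defaults encode \(n_1=n_2=125\), \(c_1=1\), \(r_1=4\), \(c_2=4\)):

```python
from scipy.stats import binom

def pa_double(p, n1=125, n2=125, c1=1, r1=4, c2=4):
    """Acceptance probability of a double sampling plan:
    accept if x1 <= c1, reject if x1 >= r1, otherwise take a
    second sample and accept when x1 + x2 <= c2."""
    pa = binom.cdf(c1, n1, p)  # accepted on the first sample
    for x1 in range(c1 + 1, r1):
        pa += binom.pmf(x1, n1, p) * binom.cdf(c2 - x1, n2, p)
    return pa

# The OC curve decreases in p, as it should.
pa_low, pa_high = pa_double(0.01), pa_double(0.03)
```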