The ASTA team
We shall study 2 types of variation
Peter has made 100 independent measurements of the capacity of each of 5 capacitors: 4 of the displayed capacitors and one additional. The nominal values are 47, 47, 100, 150 and 150 nF, all with a stated tolerance of 1%.
## capacity nomval sample
## 1 45.69 47 s_1_nF47
## 2 45.71 47 s_1_nF47
## 3 45.69 47 s_1_nF47
## 4 45.71 47 s_1_nF47
Here we see the first 4 capacity measurements of the first capacitor with nominal value 47.
##
## s_1_nF47 s_2_nF47 s_3_nF100 s_4_nF150 s_5_nF150
## 100 100 100 100 100
Linearisation:
\[\begin{align} f(x) &\approx f(x_0) + f'(x_0)(x - x_0) \\ \\ x_0 &= 1 \\ f(x) &= \log x \\ f'(x) &= 1/x \\ \\ x &= m/n \\ \log \left ( \frac{m}{n} \right ) &\approx \log 1 + \frac{1}{1} \left ( \frac{m}{n} - 1 \right ) \\ &= \frac{m - n}{n} \end{align}\]
\[ \log \left ( \frac{m}{n} \right ) \approx \frac{m - n}{n} \]
Instead of the raw measurement we will consider:
\(\mbox{lnError = ln(measuredValue/nominalValue)}\)
Note that, by the linear approximation (valid when the measured value is close to the nominal value):
\(\mbox{lnError} \approx \mbox{measuredValue/nominalValue} - 1 = \mbox{(measuredValue - nominalValue)/nominalValue}\)
which is the error relative to the nominal value.
That is, lnError can be interpreted as the relative error.
## capacity nomval sample lnError
## 1 45.69 47 s_1_nF47 -0.02826815
## 2 45.71 47 s_1_nF47 -0.02783051
## capacity nomval sample lnError
## 499 145.7 150 s_5_nF150 -0.02908558
## 500 145.6 150 s_5_nF150 -0.02977216
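As a quick check of this interpretation, the following R sketch recomputes lnError for the four rows printed above and compares it with the ordinary relative error. The data frame `capSub` is a hypothetical stand-in holding only those four rows; it is not the full data set.

```r
# Four of the measurements printed above (rows 1, 2, 499 and 500);
# capSub is an illustrative subset, not the full data.
capSub <- data.frame(
  capacity = c(45.69, 45.71, 145.7, 145.6),
  nomval   = c(47, 47, 150, 150)
)

# Log error and ordinary relative error
capSub$lnError  <- log(capSub$capacity / capSub$nomval)
capSub$relError <- (capSub$capacity - capSub$nomval) / capSub$nomval

# Difference between the two error measures
capSub$diff <- capSub$lnError - capSub$relError
print(capSub, digits = 7)
```

For errors of about 3% the two measures differ by roughly 0.0004, so reading lnError as a relative error is accurate to about three decimals here.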
In this case we have, as mentioned earlier, two further sources of error:
\[\mbox{ln(measuredValue / nominalValue) = systematicError + productionError + measurementError}\]
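We formulate the model:
\[ Y_{ij} = \mu + \alpha_i + \varepsilon_{ij}, \qquad i = 1, \ldots, k, \quad j = 1, \ldots, n \]
where \(Y_{ij}\) is the lnError of measurement \(j\) on capacitor \(i\) (here \(k=5\) capacitors with \(n=100\) measurements each), \(\mu\) is the systematic error, \(\alpha_i\) is the production error of capacitor \(i\), and \(\varepsilon_{ij}\) is the measurement error.
This is the model treated in WMM chapter 13.11, where it is assumed that
\[ \alpha_i \sim N(0, \sigma_\alpha^2), \qquad \varepsilon_{ij} \sim N(0, \sigma^2) \]
with all \(\alpha_i\) and \(\varepsilon_{ij}\) independent.
The systematic error \(\mu\) is simply estimated by the overall mean of lnError: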
## [1] -0.0288375
The meter systematically reports a value that is estimated to be 2.88% too low.
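Notation from WMM chapter 13.3, with \(k\) groups (capacitors) and \(n\) observations per group:
\[ SSA = n \sum_{i=1}^{k} (\bar{y}_{i.} - \bar{y}_{..})^2, \qquad SSE = \sum_{i=1}^{k} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^2 \]
Theorem 13.4 states:
\[ E(SSA) = (k-1)\sigma^2 + n(k-1)\sigma_\alpha^2, \qquad E(SSE) = k(n-1)\sigma^2 \]
where \(\sigma_\alpha^2\) is the variance of the production error and \(\sigma^2\) is the variance of the measurement error.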
## Analysis of Variance Table
##
## Response: lnError
## Df Sum Sq Mean Sq F value Pr(>F)
## sample 4 0.0046576 0.00116440 4067.4 < 2.2e-16 ***
## Residuals 495 0.0001417 0.00000029
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
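A table of this form is what R prints when `anova()` is applied to a one-way `lm()` fit. Since Peter's data are not reproduced here, the sketch below simulates data with 5 groups of 100 observations. The data frame `simDat`, the seed and the parameter values are illustrative assumptions, chosen only to resemble the magnitudes in these notes.

```r
set.seed(1)

k <- 5      # number of capacitors (groups)
n <- 100    # measurements per capacitor

# Assumed parameter values, for illustration only
mu       <- -0.0288    # systematic error
sd_alpha <- 0.0034     # sd of the production error
sd_eps   <- 0.00054    # sd of the measurement error

# One production error per capacitor, one measurement error per observation
alpha  <- rnorm(k, mean = 0, sd = sd_alpha)
simDat <- data.frame(
  sample  = factor(rep(paste0("s_", 1:k), each = n)),
  lnError = mu + rep(alpha, each = n) + rnorm(k * n, mean = 0, sd = sd_eps)
)

# One-way analysis of variance with sample as the grouping factor
tab <- anova(lm(lnError ~ sample, data = simDat))
print(tab)
```

The `Df` column is 4 and 495, as in the table above. The residual mean square estimates the measurement error variance.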
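where we read
\[ SSA = 0.0046576, \qquad SSE = 0.0001417 \]
with \(k=5\) and \(n=100\).
Solving the equations SSA = E(SSA) and SSE = E(SSE) yields the estimated measurement error variance \(\hat{\sigma}^2\) and production error variance \(\hat{\sigma}_\alpha^2\):
\[ \hat{\sigma}^2 = \frac{SSE}{k(n-1)} \approx 0.29 \times 10^{-6}, \qquad \hat{\sigma}_\alpha^2 = \frac{1}{n} \left ( \frac{SSA}{k-1} - \hat{\sigma}^2 \right ) \approx 11.64 \times 10^{-6} \]
The estimated variance of the log error is clearly dominated by the production error, whose variance is roughly 40 times that of the measurement error.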
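The two estimating equations can be solved directly from the sums of squares in the ANOVA table. This sketch assumes the expected sums of squares of the one-way random effects model, \(E(SSA) = (k-1)\sigma^2 + n(k-1)\sigma_\alpha^2\) and \(E(SSE) = k(n-1)\sigma^2\), with \(k=5\) capacitors and \(n=100\) measurements each. The variable names are illustrative.

```r
# Sums of squares read from the ANOVA table
k   <- 5
n   <- 100
SSA <- 0.0046576
SSE <- 0.0001417

# Solve SSE = k(n-1) * sigma2 for the measurement error variance
sigma2_hat <- SSE / (k * (n - 1))

# Solve SSA = (k-1) * sigma2 + n(k-1) * sigma2_alpha for the production error variance
sigma2_alpha_hat <- (SSA / (k - 1) - sigma2_hat) / n

print(c(measurement = sigma2_hat, production = sigma2_alpha_hat))
print(sigma2_alpha_hat / sigma2_hat)   # ratio of the two variance components
```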
We can test the hypothesis
\[ H_0: \sigma_\alpha^2 = 0 \]
This is equivalent to
Under \(H_0\) the statistic
\[ F = \frac{SSA/(k-1)}{SSE/(k(n-1))} \]
has an F-distribution with degrees of freedom \((k-1,k(n-1))\).
In the actual case \(f_{obs}=4067.4\), which is highly significant (p-value \(< 2.2 \times 10^{-16}\)).
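The test can be reproduced from the mean squares in the ANOVA table with the base R functions `qf()` and `pf()`. This is a minimal sketch; the variable names are illustrative.

```r
# Values read from the ANOVA table
k   <- 5
n   <- 100
MSA <- 0.00116440
MSE <- 0.0001417 / (k * (n - 1))   # SSE / (k(n-1)); the table prints it rounded

# Observed F statistic, 5% critical value and p-value
f_obs  <- MSA / MSE
f_crit <- qf(0.95, df1 = k - 1, df2 = k * (n - 1))
p_val  <- pf(f_obs, df1 = k - 1, df2 = k * (n - 1), lower.tail = FALSE)

print(c(f_obs = f_obs, f_crit = f_crit, p_val = p_val))
```

The recomputed statistic agrees with the tabulated 4067.4 up to rounding of the sums of squares. The p-value lies far below the threshold at which R prints `< 2.2e-16`.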
In the preceding we assumed normal errors after a log transformation.
Let \(X\) be a random variable and \(Y=\ln(X)\).
We say that \(X\) has a lognormal distribution if \(Y\) has a normal distribution with, say, mean \(\mu\) and standard deviation \(\sigma\).
Density plots:
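If \(Y=\ln(X)\) has a normal distribution with mean \(\mu\) and standard deviation \(\sigma\), then Theorem 6.7 of WMM states:
\[ E(X) = \exp(\mu + \sigma^2/2), \qquad \mathrm{Var}(X) = \exp(2\mu + \sigma^2) \left ( \exp(\sigma^2) - 1 \right ) \]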
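If we are interested in relative variation, it is common to look at the coefficient of variation
\[ CV = \frac{\mbox{standard deviation}}{\mbox{mean}} \]
If, e.g., CV=0.05, then about 95% of our measurements are within
\[ \mbox{mean} \pm 2 \cdot \mbox{standard deviation} = \mbox{mean} \cdot (1 \pm 2 \cdot 0.05) = \mbox{mean} \cdot (1 \pm 0.10) \]
i.e. most observations are within 10% of the mean.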
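If \(Y=\ln(X)\) has a normal distribution with mean \(\mu\) and standard deviation \(\sigma\), we calculate CV for \(X\) as
\[ CV = \frac{\sqrt{\exp(2\mu + \sigma^2) \left ( \exp(\sigma^2) - 1 \right )}}{\exp(\mu + \sigma^2/2)} = \sqrt{\exp(\sigma^2) - 1} \]
In Peter’s data we estimated the variance of the log error to be \(11.64 \times 10^{-6}\), which means that the estimated CV of the capacity measurement is
\[ \sqrt{\exp(11.64 \times 10^{-6}) - 1} \approx 0.0034 \]
i.e., if we correct for the systematic error of the meter, then our measurements are extremely precise.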
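The CV can be computed in R as sketched below, assuming the standard lognormal result \(CV = \sqrt{\exp(\sigma^2) - 1}\). The second half checks that formula by numerical integration for an arbitrary \(\sigma = 0.5\), chosen for illustration only. The variable names are likewise illustrative.

```r
# Estimated variance of the log error, as quoted in the text
s2     <- 11.64e-6
cv_cap <- sqrt(expm1(s2))   # expm1(x) computes exp(x) - 1 accurately for small x
print(cv_cap)

# Numerical check of the CV formula for an arbitrary lognormal distribution
mu    <- 0
sigma <- 0.5
m1 <- integrate(function(x) x   * dlnorm(x, mu, sigma), 0, Inf)$value   # E(X)
m2 <- integrate(function(x) x^2 * dlnorm(x, mu, sigma), 0, Inf)$value   # E(X^2)

cv_num     <- sqrt(m2 - m1^2) / m1       # sd / mean by integration
cv_formula <- sqrt(exp(sigma^2) - 1)     # closed form
print(c(numerical = cv_num, formula = cv_formula))
```

For a variance as small as this one, the CV is practically equal to the standard deviation of the log error, \(\sqrt{11.64 \times 10^{-6}}\).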
In our previous analysis, we assumed that the systematic error of the meter did not depend on the nominal value.
To check this assumption consider the model
\[\mbox{ln(measuredValue) = intercept} + \beta \cdot \mbox{ln(nominalValue) + error}\]
where we have previously assumed the slope \(\beta\) to be equal to 1.
##
## Call:
## lm(formula = log(capacity) ~ log(nomval), data = capDat)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.0064121 -0.0010784 0.0007315 0.0013879 0.0050839
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.0300145 0.0011907 -25.21 <2e-16 ***
## log(nomval) 1.0002636 0.0002648 3776.74 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.003101 on 498 degrees of freedom
## Multiple R-squared: 1, Adjusted R-squared: 1
## F-statistic: 1.426e+07 on 1 and 498 DF, p-value: < 2.2e-16
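The estimated slope is very close to 1. Note that the t value reported for the slope (3776.74) tests the hypothesis slope = 0, not slope = 1. For the hypothesis slope = 1 the statistic is (1.0002636 - 1)/0.0002648, which is approximately 1.0 and thus well below 2, so the slope is not significantly different from 1.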
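The t value printed in the summary refers to the hypothesis slope = 0. A test of slope = 1 can be sketched from the printed estimate and standard error alone. The variable names are illustrative, and the inputs are the rounded values from the output above.

```r
# Estimate and standard error of the slope, copied from the summary
b_hat  <- 1.0002636
se_b   <- 0.0002648
df_res <- 498          # residual degrees of freedom

t_zero <- b_hat / se_b          # tests slope = 0 (the t value R prints)
t_one  <- (b_hat - 1) / se_b    # tests slope = 1

# Two-sided p-value for the hypothesis slope = 1
p_one <- 2 * pt(-abs(t_one), df = df_res)

print(c(t_zero = t_zero, t_one = t_one, p_one = p_one))
```

With these inputs `t_one` is about 1.0 and the two-sided p-value is about 0.32.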
Clearly, it is a bit dubious to assume a linear relationship, as we only have 3 nominal values.
If we stick to the linear calibration model, it is sensible to correct our measured errors according to the calibration of the meter:
## (Intercept) log(nomval)
## -0.03001454 1.00026359
## capacity nomval sample lnError lnError_c
## 1 45.69 47 s_1_nF47 -0.02826815 0.001745930
## 2 45.71 47 s_1_nF47 -0.02783051 0.002183452
## 3 45.69 47 s_1_nF47 -0.02826815 0.001745930
## 4 45.71 47 s_1_nF47 -0.02783051 0.002183452
## 5 45.70 47 s_1_nF47 -0.02804930 0.001964715
## 6 45.69 47 s_1_nF47 -0.02826815 0.001745930
The calibrated data now show that the production error on component s_1_nF47 is in the vicinity of 0.2%, well below the tolerance of 1%.
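The formula used for the correction is not shown. The printed lnError_c values are consistent with subtracting the intercept from lnError and dividing by the slope, which is what the sketch below assumes. Treat it as a reconstruction, not the original code; the names `b0` and `b1` are illustrative.

```r
# Coefficients of the calibration fit, copied from the output above
b0 <- -0.03001454   # intercept
b1 <-  1.00026359   # slope

# First two measurements of s_1_nF47
capacity <- c(45.69, 45.71)
nomval   <- c(47, 47)
lnError  <- log(capacity / nomval)

# Assumed correction: remove the intercept, then rescale by the slope
lnError_c <- (lnError - b0) / b1
print(cbind(capacity, nomval, lnError, lnError_c), digits = 7)
```

Under this assumption the first two rows come out as about 0.00175 and 0.00218, matching the table.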