Probability and Statistics for Engineering and Science, 9th Edition

Authors: Jay L. Devore

ISBN-13: 978-1305251809

See our solution for Question 38E from Chapter 12 from Devore's Probability and Statistics for Engineering and Science.

Problem 38E

Refer to the data on x = liberation rate and..

Step-by-Step Solution

Step 1:

The data give the values of two variables: the ${\rm{N}}{{\rm{O}}_x}$ emission rate (y), measured in ppm, and the burner area liberation rate (x), measured in $\left( {\frac{{{\rm{MBtu}}}}{{{\rm{hr - f}}{{\rm{t}}^{\rm{2}}}}}} \right)$.


 
Step 2:

a.

Linear regression model:

The fitted linear regression model is $\hat y = {\hat \beta _0} + {\hat \beta _1}x$, where $\hat y$ is the predicted value of the response variable and x is the predictor variable. Here ${\hat \beta _0}$ denotes the estimate of the y-intercept of the line and ${\hat \beta _1}$ denotes the estimate of the slope.

The y-intercept is computed as follows,

\[\begin{array}{c} {{\hat \beta }_0} = \bar y - {{\hat \beta }_1}\bar x\\ = \frac{{\sum\limits_i {{y_i}} - {{\hat \beta }_1}\sum\limits_i {{x_i}} }}{n} \end{array}\]

The slope coefficient of linear regression is computed as follows,

\[\begin{array}{c} {{\hat \beta }_1} = \frac{{{S_{xy}}}}{{{S_{xx}}}}\\ = \frac{{\left[ {\sum\limits_i {{x_i}{y_i}} - \frac{{\left( {\sum\limits_i {{x_i}} } \right) \times \left( {\sum\limits_i {{y_i}} } \right)}}{n}} \right]}}{{\sum\limits_i {x_i^2} - \frac{{{{\left( {\sum\limits_i {{x_i}} } \right)}^2}}}{n}}} \end{array}\]

The sample size is $n = 14$.

The table below is used to compute the slope and intercept:

x y x^2 y^2 xy
100 150 10000 22500 15000
125 140 15625 19600 17500
125 180 15625 32400 22500
150 210 22500 44100 31500
150 190 22500 36100 28500
200 320 40000 102400 64000
200 280 40000 78400 56000
250 400 62500 160000 100000
250 430 62500 184900 107500
300 440 90000 193600 132000
300 390 90000 152100 117000
350 600 122500 360000 210000
400 610 160000 372100 244000
400 670 160000 448900 268000
Sum: Σx = 3300, Σy = 5010, Σx^2 = 913750, Σy^2 = 2207100, Σxy = 1413500
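The column totals can be re-derived directly from the raw (x, y) pairs. The following is a minimal Python sketch, with the data transcribed from the table; the variable names are illustrative assumptions and not part of the original solution.

```python
# Raw data transcribed from the table (x = liberation rate, y = NOx emission rate).
x = [100, 125, 125, 150, 150, 200, 200, 250, 250, 300, 300, 350, 400, 400]
y = [150, 140, 180, 210, 190, 320, 280, 400, 430, 440, 390, 600, 610, 670]

n = len(x)                                      # sample size
sum_x = sum(x)                                  # sum of x
sum_y = sum(y)                                  # sum of y
sum_x2 = sum(xi * xi for xi in x)               # sum of x squared
sum_y2 = sum(yi * yi for yi in y)               # sum of y squared
sum_xy = sum(xi * yi for xi, yi in zip(x, y))   # sum of cross products

print(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy)
# prints: 14 3300 5010 913750 2207100 1413500
```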

Thus, the slope coefficient of linear regression is:

\[\begin{array}{c} {{\hat \beta }_1} = \frac{{\left[ {\sum\limits_i {{x_i}{y_i}} - \frac{{\left( {\sum\limits_i {{x_i}} } \right) \times \left( {\sum\limits_i {{y_i}} } \right)}}{n}} \right]}}{{\sum\limits_i {x_i^2} - \frac{{{{\left( {\sum\limits_i {{x_i}} } \right)}^2}}}{n}}}\\ = \frac{{\left[ {1413500 - \frac{{3300 \times 5010}}{{14}}} \right]}}{{913750 - \frac{{{{\left( {3300} \right)}^2}}}{{14}}}}\\ = \frac{{232571.4286}}{{135892.8571}}\\ = 1.7114 \end{array}\]

Therefore, the point estimate of ${\hat \beta _1}$ is 1.7114.

Thus, the y-intercept is:

\[\begin{array}{c} {{\hat \beta }_0} = \frac{{\sum\limits_i {{y_i}} - {{\hat \beta }_1}\sum\limits_i {{x_i}} }}{n}\\ = \frac{{5010 - \left( {1.7114 \times 3300} \right)}}{{14}}\\ = \frac{{ - 637.62}}{{14}}\\ = - 45.5443 \end{array}\]

Therefore, the y-intercept is $ - 45.5443$ .

Therefore, the regression line relating the emission rate to the burner area liberation rate is:

\[\begin{array}{c} \hat y = {{\hat \beta }_0} + {{\hat \beta }_1}x\\ = - 45.5443 + 1.7114x \end{array}\]
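As a check on the arithmetic, the slope and intercept can be recomputed from the column sums. This is a small Python sketch under the assumption that the sums in the table are correct; the names `s_xy`, `s_xx`, `b1`, and `b0` are illustrative. Because it carries the unrounded slope through, the intercept comes out near -45.55, slightly different from the -45.5443 obtained when the slope is first rounded to 1.7114.

```python
# Column sums taken from the computation table.
n = 14
sum_x, sum_y = 3300, 5010
sum_x2, sum_xy = 913750, 1413500

# Corrected sums of squares and cross products.
s_xy = sum_xy - sum_x * sum_y / n
s_xx = sum_x2 - sum_x ** 2 / n

b1 = s_xy / s_xx                 # least-squares slope
b0 = (sum_y - b1 * sum_x) / n    # least-squares intercept, using the unrounded slope

print(round(b1, 4), round(b0, 2))
```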

Next, test whether there is a useful linear relationship between the two rates.

The null and alternative hypotheses are defined as follows.

The null hypothesis is that there is no linear relationship between the liberation rate and the ${\rm{N}}{{\rm{O}}_x}$ emission rate.

The alternative hypothesis is that there is a useful linear relationship between the liberation rate and the ${\rm{N}}{{\rm{O}}_x}$ emission rate.

That is,

\[\begin{array}{l} {H_0}:{\beta _1} = 0\\ {H_a}:{\beta _1} \ne 0 \end{array}\]

Here, the test is two-tailed.


The test statistic, which follows a t distribution with $n - 2$ degrees of freedom when the null hypothesis is true, is

\[t = \frac{{{{\hat \beta }_1} - {\beta _1}}}{{{s_{{{\hat \beta }_1}}}}} \sim {t_{n - 2}}\]

Now, the estimated standard error of the slope coefficient is computed as follows, where $s$ is the residual standard deviation and the error sum of squares is divided by $n - 2$ degrees of freedom,

\[\begin{array}{c} {s_{{{\hat \beta }_1}}} = \frac{s}{{\sqrt {{S_{xx}}} }}\\ = \frac{{\sqrt {\frac{{\sum\limits_i {y_i^2} - {{\hat \beta }_0}\sum\limits_i {{y_i}} - {{\hat \beta }_1}\sum\limits_i {{x_i}{y_i}} }}{{n - 2}}} }}{{\sqrt {\sum\limits_i {x_i^2} - \frac{{{{\left( {\sum\limits_i {{x_i}} } \right)}^2}}}{n}} }}\\ = \frac{{\sqrt {\frac{{2207100 - \left( { - 45.5443 \times 5010} \right) - \left( {1.7114 \times 1413500} \right)}}{{12}}} }}{{\sqrt {913750 - \frac{{{{\left( {3300} \right)}^2}}}{{14}}} }}\\ = \frac{{\sqrt {1351.0869} }}{{\sqrt {135892.8571} }}\\ = 0.0997 \end{array}\]

Therefore, the estimated standard error of the slope coefficient is ${s_{{{\hat \beta }_1}}} = 0.0997$.


Thus, the test statistic is:

\[\begin{array}{c} t = \frac{{{{\hat \beta }_1} - {\beta _1}}}{{{s_{{{\hat \beta }_1}}}}}\\ = \frac{{1.7114 - 0}}{{0.0997}}\\ \approx 17.17 \end{array}\]

Therefore, the value of the test statistic is approximately 17.17.
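The standard error and t statistic can be reproduced from the column sums. The sketch below is illustrative rather than the textbook's own procedure: it assumes the usual unbiased residual variance, SSE divided by n - 2 = 12, and computes SSE as S_yy minus slope times S_xy, which is algebraically equivalent to the sum-based formula. All variable names are assumptions.

```python
import math

# Column sums taken from the computation table.
n = 14
sum_x, sum_y = 3300, 5010
sum_x2, sum_y2, sum_xy = 913750, 2207100, 1413500

s_xx = sum_x2 - sum_x ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n
s_yy = sum_y2 - sum_y ** 2 / n

b1 = s_xy / s_xx                   # least-squares slope
sse = s_yy - b1 * s_xy             # error (residual) sum of squares
s = math.sqrt(sse / (n - 2))       # residual standard deviation, n - 2 df
se_b1 = s / math.sqrt(s_xx)        # standard error of the slope
t_stat = (b1 - 0) / se_b1          # test statistic for H0: beta_1 = 0

print(round(se_b1, 4), round(t_stat, 2))
```

With unrounded intermediate values the statistic is about 17.17, in line with the hand calculation that uses n - 2 in the denominator.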


Now, the critical value of t is computed as follows:

The degrees of freedom are,

\[\begin{array}{c} df = n - 2\\ = 14 - 2\\ = 12 \end{array}\]

The level of significance is,

\[\begin{array}{c} \alpha = 1 - 0.99\\ = 0.01\\ \frac{\alpha }{2} = \frac{{0.01}}{2}\\ = 0.005 \end{array}\]

Using the t-table, the two-tailed critical value of t with an upper-tail area of $\frac{\alpha }{2} = 0.005$ and 12 degrees of freedom is $ \pm 3.055$.

Therefore, the critical value of t is $ \pm 3.055$.
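A table lookup can be cross-checked numerically. The sketch below uses only the Python standard library: it integrates the Student t density with Simpson's rule and inverts the result by bisection. The helper names `t_pdf`, `t_cdf`, and `t_quantile` are hypothetical, written for this illustration, and the quantile search assumes a probability of at least 0.5.

```python
import math

def t_pdf(t, df):
    """Density of the Student t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """P(T <= t): 0.5 plus a Simpson's-rule integral of the density over [0, t]."""
    if t < 0:
        return 1 - t_cdf(-t, df, steps)
    h = t / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_quantile(p, df):
    """Value q >= 0 with P(T <= q) = p, found by bisection (requires p >= 0.5)."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

alpha = 0.01
crit = t_quantile(1 - alpha / 2, 12)   # upper-tail area 0.005, 12 df
print(round(crit, 3))
```

For upper-tail area 0.005 and 12 degrees of freedom this gives roughly 3.055, the value listed in standard t-tables.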


Decision rule:

Reject the null hypothesis if $\left| t \right| > {t_{\frac{\alpha }{2},n - 2}}$.

Fail to reject the null hypothesis if $\left| t \right| \le {t_{\frac{\alpha }{2},n - 2}}$.


Conclusion:

Here, $\left| t \right| \approx 17.17 > 3.055$.

Thus, the decision is to reject the null hypothesis at the 1% level of significance.


Therefore, there is sufficient evidence to conclude that there is a useful linear relationship between the liberation rate and the ${\rm{N}}{{\rm{O}}_x}$ emission rate at the 1% level of significance.


 
Step 3:

b.

The confidence interval for the slope of the regression line is,

\[CI = {\hat \beta _1} \pm {t_{\frac{\alpha }{2},n - 2}} \times {s_{{{\hat \beta }_1}}}\]

Here, the confidence interval for the expected change in ${\rm{N}}{{\rm{O}}_x}$ emission rate associated with a 10 $\left( {\frac{{{\rm{MBtu}}}}{{{\rm{hr - f}}{{\rm{t}}^{\rm{2}}}}}} \right)$ increase in liberation rate is computed as follows, where ${s_{{{\hat \beta }_1}}} = 0.0997$ is the standard error of the slope based on $n - 2$ degrees of freedom:

\[\begin{array}{c} CI = 10 \times \left( {{{\hat \beta }_1} \pm {t_{\frac{\alpha }{2},n - 2}} \times {s_{{{\hat \beta }_1}}}} \right)\\ = 10 \times \left( {1.7114 \pm {t_{\frac{\alpha }{2},n - 2}} \times 0.0997} \right) \end{array}\]

Now, the critical value of t is computed as follows:

The sample size is $n = 14$.

The level of significance is,

\[\begin{array}{c} \alpha = 1 - 0.95\\ = 0.05\\ \frac{\alpha }{2} = \frac{{0.05}}{2}\\ = 0.025 \end{array}\]

The degrees of freedom are,

\[\begin{array}{c} df = n - 2\\ = 14 - 2\\ = 12 \end{array}\]

Using the t-table, the critical value of t with an upper-tail area of 0.025 and 12 degrees of freedom is 2.179.

Thus,

\[\begin{array}{c} CI = 10 \times \left( {1.7114 \pm {t_{\frac{\alpha }{2},n - 2}} \times 0.0997} \right)\\ = 10 \times \left( {1.7114 \pm 2.179 \times 0.0997} \right)\\ = 10 \times \left( {1.7114 \pm 0.2172} \right)\\ = \left( {10 \times \left( {1.7114 - 0.2172} \right),10 \times \left( {1.7114 + 0.2172} \right)} \right)\\ = \left( {10 \times 1.4942,10 \times 1.9286} \right)\\ = \left( {14.942,19.286} \right) \end{array}\]

Therefore, the 95% confidence interval for the expected change in ${\rm{N}}{{\rm{O}}_x}$ emission rate associated with a 10 $\left( {\frac{{{\rm{MBtu}}}}{{{\rm{hr - f}}{{\rm{t}}^{\rm{2}}}}}} \right)$ increase in liberation rate is from 14.942 ppm to 19.286 ppm.
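The interval arithmetic can be verified with a few lines of Python. This sketch assumes the slope estimate 1.7114, a slope standard error of 0.0997 (residual variance with n - 2 = 12 degrees of freedom), and the table value 2.179 for upper-tail area 0.025 with 12 degrees of freedom; the variable names are illustrative.

```python
b1 = 1.7114        # estimated slope (ppm per MBtu/hr-ft^2)
se_b1 = 0.0997     # standard error of the slope, assuming SSE / (n - 2)
t_crit = 2.179     # t-table value: upper-tail area 0.025, 12 df
delta_x = 10       # increase in liberation rate of interest

half_width = t_crit * se_b1
lower = delta_x * (b1 - half_width)
upper = delta_x * (b1 + half_width)

print(round(lower, 2), round(upper, 2))
# prints: 14.94 19.29
```

Scaling both endpoints by 10 works because the expected change in y for a 10-unit increase in x is 10 times the slope, so the interval for the slope scales the same way.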