class: center, middle, inverse, title-slide

# Simple Linear Regression
## Inference
### Prof. Maria Tackett

---

class: middle, center

## [Click for PDF of slides](05-slr-coef-inf.pdf)

---

## Topics

--

- Conduct a hypothesis test for `\(\beta_1\)`

--

<br>

- Calculate a confidence interval for `\(\beta_1\)`

---

## Movie ratings data

The data set contains the "Tomatometer" score (.term[critics]) and audience score (.term[audience]) for 146 movies rated on rottentomatoes.com.

<img src="05-slr-coef-inf_files/figure-html/unnamed-chunk-2-1.png" style="display: block; margin: auto;" />

---

## The model

```r
# Regress the audience score on the critics score
model <- lm(audience ~ critics, data = movie_scores)
```

```r
# tidy() is from broom and kable() is from knitr; %>% comes with the tidyverse
model %>% tidy() %>% kable(format = "html", digits = 3)
```

<table>
<thead>
<tr> <th style="text-align:left;"> term </th> <th style="text-align:right;"> estimate </th> <th style="text-align:right;"> std.error </th> <th style="text-align:right;"> statistic </th> <th style="text-align:right;"> p.value </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:right;"> 32.316 </td> <td style="text-align:right;"> 2.343 </td> <td style="text-align:right;"> 13.795 </td> <td style="text-align:right;"> 0 </td> </tr>
<tr> <td style="text-align:left;"> critics </td> <td style="text-align:right;"> 0.519 </td> <td style="text-align:right;"> 0.035 </td> <td style="text-align:right;"> 15.028 </td> <td style="text-align:right;"> 0 </td> </tr>
</tbody>
</table>

---

## The model

`$$\color{blue}{\hat{\text{audience}} = 32.316 + 0.519 \times \text{critics}}$$`

<table>
<thead>
<tr> <th style="text-align:left;"> term </th> <th style="text-align:right;"> estimate </th> <th style="text-align:right;"> std.error </th> <th style="text-align:right;"> statistic </th> <th style="text-align:right;"> p.value </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:right;"> 32.316 </td> <td style="text-align:right;"> 2.343 </td> <td
style="text-align:right;"> 13.795 </td> <td style="text-align:right;"> 0 </td> </tr>
<tr> <td style="text-align:left;"> critics </td> <td style="text-align:right;"> 0.519 </td> <td style="text-align:right;"> 0.035 </td> <td style="text-align:right;"> 15.028 </td> <td style="text-align:right;"> 0 </td> </tr>
</tbody>
</table>

<img src="05-slr-coef-inf_files/figure-html/unnamed-chunk-7-1.png" width="80%" style="display: block; margin: auto;" />

---

class: middle, center

### Do the data provide sufficient evidence that `\(\beta_1\)` is different from 0?

---

## Outline of a hypothesis test

--

1️⃣ State the hypotheses.

--

2️⃣ Calculate the test statistic.

--

3️⃣ Calculate the p-value.

--

4️⃣ State the conclusion.

---

## 1️⃣ State the hypotheses

<table>
<thead>
<tr> <th style="text-align:left;"> term </th> <th style="text-align:right;"> estimate </th> <th style="text-align:right;"> std.error </th> <th style="text-align:right;"> statistic </th> <th style="text-align:right;"> p.value </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:right;"> 32.316 </td> <td style="text-align:right;"> 2.343 </td> <td style="text-align:right;"> 13.795 </td> <td style="text-align:right;"> 0 </td> </tr>
<tr> <td style="text-align:left;"> critics </td> <td style="text-align:right;"> 0.519 </td> <td style="text-align:right;"> 0.035 </td> <td style="text-align:right;"> 15.028 </td> <td style="text-align:right;"> 0 </td> </tr>
</tbody>
</table>

<br>

.pull-left[
.small-box[
`$$\large{\begin{aligned}& H_0: \beta_1 = 0\\& H_a: \beta_1 \neq 0\end{aligned}}$$`
]
]

---

## 1️⃣ State the hypotheses

<table>
<thead>
<tr> <th style="text-align:left;"> term </th> <th style="text-align:right;"> estimate </th> <th style="text-align:right;"> std.error </th> <th style="text-align:right;"> statistic </th> <th style="text-align:right;"> p.value </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> (Intercept) </td> <td
style="text-align:right;"> 32.316 </td> <td style="text-align:right;"> 2.343 </td> <td style="text-align:right;"> 13.795 </td> <td style="text-align:right;"> 0 </td> </tr>
<tr> <td style="text-align:left;"> critics </td> <td style="text-align:right;"> 0.519 </td> <td style="text-align:right;"> 0.035 </td> <td style="text-align:right;"> 15.028 </td> <td style="text-align:right;"> 0 </td> </tr>
</tbody>
</table>

<br>

.pull-left[
.small-box[
`$$\large{\begin{aligned}& H_0: \beta_1 = 0\\& H_a: \beta_1 \neq 0\end{aligned}}$$`
]
]

.pull-right[
<font color = "white">place-holder</font>

.big[.vocab[Null hypothesis]]
]

---

## 1️⃣ State the hypotheses

<table>
<thead>
<tr> <th style="text-align:left;"> term </th> <th style="text-align:right;"> estimate </th> <th style="text-align:right;"> std.error </th> <th style="text-align:right;"> statistic </th> <th style="text-align:right;"> p.value </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:right;"> 32.316 </td> <td style="text-align:right;"> 2.343 </td> <td style="text-align:right;"> 13.795 </td> <td style="text-align:right;"> 0 </td> </tr>
<tr> <td style="text-align:left;"> critics </td> <td style="text-align:right;"> 0.519 </td> <td style="text-align:right;"> 0.035 </td> <td style="text-align:right;"> 15.028 </td> <td style="text-align:right;"> 0 </td> </tr>
</tbody>
</table>

<br>

.pull-left[
.small-box[
`$$\large{\begin{aligned}& H_0: \beta_1 = 0\\& H_a: \beta_1 \neq 0\end{aligned}}$$`
]
]

.pull-right[
<font color = "white">place-holder</font>

.big[.vocab[Null hypothesis]]

.big[.vocab[Alternative hypothesis]]
]

---

## 2️⃣ Calculate the test statistic

<table>
<thead>
<tr> <th style="text-align:left;"> term </th> <th style="text-align:right;"> estimate </th> <th style="text-align:right;"> std.error </th> <th style="text-align:right;"> statistic </th> <th style="text-align:right;"> p.value </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> (Intercept) </td> <td
style="text-align:right;"> 32.316 </td> <td style="text-align:right;"> 2.343 </td> <td style="text-align:right;"> 13.795 </td> <td style="text-align:right;"> 0 </td> </tr>
<tr> <td style="text-align:left;"> critics </td> <td style="text-align:right;"> 0.519 </td> <td style="text-align:right;"> 0.035 </td> <td style="text-align:right;"> 15.028 </td> <td style="text-align:right;"> 0 </td> </tr>
</tbody>
</table>

<br>

.eq[
`$$\text{test statistic} = \frac{\text{Estimate} - \text{Hypothesized}}{\text{Standard error}}$$`
]

---

## 2️⃣ Calculate the test statistic

<table>
<thead>
<tr> <th style="text-align:left;"> term </th> <th style="text-align:right;"> estimate </th> <th style="text-align:right;"> std.error </th> <th style="text-align:right;"> statistic </th> <th style="text-align:right;"> p.value </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:right;"> 32.316 </td> <td style="text-align:right;"> 2.343 </td> <td style="text-align:right;"> 13.795 </td> <td style="text-align:right;"> 0 </td> </tr>
<tr> <td style="text-align:left;background-color: #dce5b2 !important;"> critics </td> <td style="text-align:right;background-color: #dce5b2 !important;"> 0.519 </td> <td style="text-align:right;background-color: #dce5b2 !important;"> 0.035 </td> <td style="text-align:right;background-color: #dce5b2 !important;"> 15.028 </td> <td style="text-align:right;background-color: #dce5b2 !important;"> 0 </td> </tr>
</tbody>
</table>

<br>

.pull-left[
.eq[
`$$t = \frac{\hat{\beta}_1 - 0}{SE_{\hat{\beta}_1}}$$`
]
]

--

.pull-right[
.small-box-work[
`$$\begin{aligned}t &= \frac{0.5187 - 0}{0.0345}\\ &= \mathbf{15.03}\end{aligned}$$`
]
]

---

## 3️⃣ Calculate the p-value

<table>
<thead>
<tr> <th style="text-align:left;"> term </th> <th style="text-align:right;"> estimate </th> <th style="text-align:right;"> std.error </th> <th style="text-align:right;"> statistic </th> <th style="text-align:right;"> p.value </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:right;"> 32.316 </td> <td style="text-align:right;"> 2.343 </td> <td style="text-align:right;"> 13.795 </td> <td style="text-align:right;"> 0 </td> </tr>
<tr> <td style="text-align:left;background-color: #dce5b2 !important;"> critics </td> <td style="text-align:right;background-color: #dce5b2 !important;"> 0.519 </td> <td style="text-align:right;background-color: #dce5b2 !important;"> 0.035 </td> <td style="text-align:right;background-color: #dce5b2 !important;"> 15.028 </td> <td style="text-align:right;background-color: #dce5b2 !important;"> 0 </td> </tr>
</tbody>
</table>

<br>

.eq[
`$$\text{p-value} = P(|t| \geq |\text{test statistic}|)$$`
]

Calculated from a `\(t\)` distribution with `\(n-2\)` degrees of freedom

---

## 3️⃣ Calculate the p-value

<img src="05-slr-coef-inf_files/figure-html/unnamed-chunk-14-1.png" style="display: block; margin: auto;" />

---

## Understanding the p-value

| Magnitude of p-value  | Interpretation                            |
|:---------------------:|:-----------------------------------------:|
| p-value < 0.01        | strong evidence against `\(H_0\)`         |
| 0.01 < p-value < 0.05 | moderate evidence against `\(H_0\)`       |
| 0.05 < p-value < 0.1  | weak evidence against `\(H_0\)`           |
| p-value > 0.1         | effectively no evidence against `\(H_0\)` |

<br>
<br>

*These are general guidelines.
The strength of evidence depends on the context of the problem.*

---

## 4️⃣ State the conclusion

<table>
<thead>
<tr> <th style="text-align:left;"> term </th> <th style="text-align:right;"> estimate </th> <th style="text-align:right;"> std.error </th> <th style="text-align:right;"> statistic </th> <th style="text-align:right;"> p.value </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:right;"> 32.316 </td> <td style="text-align:right;"> 2.343 </td> <td style="text-align:right;"> 13.795 </td> <td style="text-align:right;"> 0 </td> </tr>
<tr> <td style="text-align:left;background-color: #dce5b2 !important;"> critics </td> <td style="text-align:right;background-color: #dce5b2 !important;"> 0.519 </td> <td style="text-align:right;background-color: #dce5b2 !important;"> 0.035 </td> <td style="text-align:right;background-color: #dce5b2 !important;"> 15.028 </td> <td style="text-align:right;background-color: #dce5b2 !important;"> 0 </td> </tr>
</tbody>
</table>

<br>

--

The data provide sufficient evidence that the population slope `\(\beta_1\)` is different from 0.

.vocab[There is a linear relationship between the critics score and audience score for movies on rottentomatoes.com.]

---

class: middle, center

### What is a plausible range of values for the population slope `\(\beta_1\)`?
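---

## Checking the test statistic and p-value in R

Before turning to the interval, steps 2 and 3 of the test can be reproduced by hand. This is a minimal sketch that assumes only the numbers reported on the earlier slides (slope 0.5187, standard error 0.0345, 146 movies); the object names are illustrative and are not part of the original analysis.

```r
# Values copied from the slides; the names below are hypothetical
beta1_hat <- 0.5187   # estimated slope for critics
se_beta1  <- 0.0345   # standard error of the slope
n         <- 146      # number of movies in the data set

# Step 2: test statistic for H0: beta_1 = 0
t_stat <- (beta1_hat - 0) / se_beta1   # approximately 15.03

# Step 3: two-sided p-value from a t distribution with n - 2 degrees of freedom
df      <- n - 2
p_value <- 2 * pt(abs(t_stat), df = df, lower.tail = FALSE)

t_stat
p_value
```

The p-value is far below 0.001, which is why the table, rounded to three digits, displays it as 0.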
---

## Confidence interval for `\(\beta_1\)`

.eq[
`$$\text{ Estimate} \pm \text{ (critical value) } \times \text{SE}$$`
]

--

.middle[
.eq[
`$$\hat{\beta}_1 \pm t^* \times SE_{\hat{\beta}_1}$$`
]
]

<br>

`\(t^*\)` is calculated from a `\(t\)` distribution with `\(n-2\)` degrees of freedom

---

## Calculating the 95% CI for `\(\beta_1\)`

<table>
<thead>
<tr> <th style="text-align:left;"> term </th> <th style="text-align:right;"> estimate </th> <th style="text-align:right;"> std.error </th> <th style="text-align:right;"> statistic </th> <th style="text-align:right;"> p.value </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:right;"> 32.316 </td> <td style="text-align:right;"> 2.343 </td> <td style="text-align:right;"> 13.795 </td> <td style="text-align:right;"> 0 </td> </tr>
<tr> <td style="text-align:left;background-color: #dce5b2 !important;"> critics </td> <td style="text-align:right;background-color: #dce5b2 !important;"> 0.519 </td> <td style="text-align:right;background-color: #dce5b2 !important;"> 0.035 </td> <td style="text-align:right;background-color: #dce5b2 !important;"> 15.028 </td> <td style="text-align:right;background-color: #dce5b2 !important;"> 0 </td> </tr>
</tbody>
</table>

`$$\hat{\beta}_1 = 0.519 \hspace{15mm} t^* = 1.977 \hspace{15mm} SE_{\hat{\beta}_1} = 0.035$$`

--

.eq[
`$$0.519 \pm 1.977 \times 0.035 \\[8pt] \mathbf{[0.450, 0.588]}$$`
]

---

## Interpretation

.eq[
`$$\mathbf{[0.450, 0.588]}$$`
]

--

<br>

.vocab[We are 95% confident that for every one-point increase in the critics score, the audience score is predicted to increase, on average, by between 0.450 and 0.588 points.]

---

## Recap

--

- Conducted a hypothesis test for `\(\beta_1\)`

--

<br>

- Calculated a confidence interval for `\(\beta_1\)`
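---

## Checking the confidence interval in R

The 95% interval from the earlier slide can be recomputed from the rounded table values. This is a minimal sketch that assumes the slope 0.519, the standard error 0.035, and the 146 movies stated in the slides; the object names are illustrative only.

```r
# Values copied from the slides; the names below are hypothetical
beta1_hat <- 0.519   # estimated slope for critics
se_beta1  <- 0.035   # standard error of the slope
n         <- 146     # number of movies in the data set

# Critical value: 97.5th percentile of a t distribution with n - 2 df
t_star <- qt(0.975, df = n - 2)   # rounds to 1.977

# Estimate +/- critical value * SE
ci <- beta1_hat + c(-1, 1) * t_star * se_beta1
round(ci, 3)   # agrees with [0.450, 0.588] to three decimals
```

With the fitted `model` object from the earlier slide, `confint(model, level = 0.95)` or `tidy(model, conf.int = TRUE)` should give essentially the same interval without rounding the inputs first.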