Log transformation on the response
Log transformation on the predictor
A high respiratory rate can potentially indicate a respiratory infection in children. In order to determine what indicates a "high" rate, we first want to understand the relationship between a child's age and their respiratory rate.
The data contain the respiratory rate for 618 children ages 15 days to 3 years.
Variables:
Age: age in months
Rate: respiratory rate (breaths per minute)

term | estimate | std.error | statistic | p.value | conf.low | conf.high |
---|---|---|---|---|---|---|
(Intercept) | 47.052 | 0.504 | 93.317 | 0 | 46.062 | 48.042 |
Age | -0.696 | 0.029 | -23.684 | 0 | -0.753 | -0.638 |
If we apply a log transformation to the response variable, we want to estimate the parameters for the model
$$\widehat{\log(Y)} = \hat{\beta}_0 + \hat{\beta}_1 X$$
We want to interpret the model in terms of $Y$, not $\log(Y)$, so we write all interpretations in terms of
$$Y = \exp\{\hat{\beta}_0 + \hat{\beta}_1 X\} = \exp\{\hat{\beta}_0\}\exp\{\hat{\beta}_1 X\}$$
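To see why the slope acts multiplicatively on the original scale, it can help to compare the back-transformed predictions at $X$ and $X + 1$. The short derivation below is a sketch that uses only the model already stated; no new quantities are introduced.

```latex
\frac{\exp\{\hat{\beta}_0 + \hat{\beta}_1 (X + 1)\}}{\exp\{\hat{\beta}_0 + \hat{\beta}_1 X\}}
  = \frac{\exp\{\hat{\beta}_0\}\exp\{\hat{\beta}_1 X\}\exp\{\hat{\beta}_1\}}{\exp\{\hat{\beta}_0\}\exp\{\hat{\beta}_1 X\}}
  = \exp\{\hat{\beta}_1\}
```

So a one-unit increase in $X$ multiplies the back-transformed prediction by $\exp\{\hat{\beta}_1\}$, whatever the starting value of $X$.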
Suppose we have a set of values
x <- c(3, 5, 6, 8, 10, 14, 19)
Let's calculate $\overline{\log(x)}$
log_x <- log(x)
mean(log_x)
## [1] 2.066476
Let's calculate $\log(\bar{x})$
xbar <- mean(x)
log(xbar)
## [1] 2.228477
x <- c(3, 5, 6, 8, 10, 14, 19)
Let's calculate $\text{Median}(\log(x))$
log_x <- log(x)
median(log_x)
## [1] 2.079442
Let's calculate $\log(\text{Median}(x))$
median_x <- median(x)
log(median_x)
## [1] 2.079442
$\overline{\log(x)} \neq \log(\bar{x})$
mean(log_x) == log(xbar)
## [1] FALSE
$\text{Median}(\log(x)) = \log(\text{Median}(x))$
median(log_x) == log(median_x)
## [1] TRUE
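Recall that $\beta_0 + \beta_1 x_i$ is the mean value of $y$ at the given value $x_i$. This doesn't hold when we log-transform $y$.
The mean of the logged values is not equal to the log of the mean value. Therefore, at a given value of $x$,
$$\exp\{\text{Mean}(\log(y))\} \neq \text{Mean}(y) \Rightarrow \exp\{\beta_0 + \beta_1 x\} \neq \text{Mean}(y)$$
The median, however, is preserved by the transformation:
$$\exp\{\text{Median}(\log(y))\} = \text{Median}(y)$$
If the distribution of $\log(y)$ is symmetric about the regression line at a given value of $x$, then
$$\text{Median}(\log(y)) = \text{Mean}(\log(y))$$
so the back-transformed prediction is an estimate of the median of $y$:
$$\text{Median}(\hat{Y}) = \exp\{\hat{\beta}_0\}\exp\{\hat{\beta}_1 x\}$$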
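A small simulation can make the mean-versus-median point concrete. The sketch below uses made-up, right-skewed data whose logged values are symmetric (normal); the seed, sample size, mean, and standard deviation are arbitrary assumptions and have nothing to do with the respiratory data.

```r
set.seed(210)  # arbitrary seed, only for reproducibility

# Hypothetical response: log(y) is symmetric (normal), so y is right-skewed
log_y <- rnorm(100000, mean = 3.5, sd = 0.6)
y <- exp(log_y)

mean_log <- mean(log_y)

exp(mean_log)  # back-transformed mean of the logged values
median(y)      # close to exp(mean_log)
mean(y)        # noticeably larger than exp(mean_log)
```

Because the logged values are symmetric, their mean and median nearly coincide, which is why exponentiating the mean of the logs lands near the median of `y` rather than its mean.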
term | estimate | std.error | statistic | p.value | conf.low | conf.high |
---|---|---|---|---|---|---|
(Intercept) | 3.845 | 0.013 | 304.500 | 0 | 3.82 | 3.870 |
Age | -0.019 | 0.001 | -25.839 | 0 | -0.02 | -0.018 |
Intercept: The median respiratory rate for a newborn child (Age = 0) is expected to be 46.759 (exp{3.845}) breaths per minute.
Slope: For each additional month in a child's age, the median respiratory rate is expected to multiply by a factor of 0.981 (exp{-0.019}).
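The back-transformations in these interpretations can be reproduced directly from the rounded estimates in the table. The snippet below is a minimal sketch; `predict_median_rate()` is a hypothetical helper introduced only for illustration, and results from rounded coefficients may differ slightly from those of the fitted model.

```r
# Rounded coefficient estimates copied from the table above
b0 <- 3.845   # intercept, on the log scale
b1 <- -0.019  # slope for Age (months), on the log scale

# Back-transform to the original scale
exp(b0)  # predicted median rate at Age = 0
exp(b1)  # multiplicative change in the median per additional month

# Hypothetical helper: predicted median rate at a given age in months
predict_median_rate <- function(age) exp(b0 + b1 * age)
predict_median_rate(c(0, 12, 24))
```

Each additional month multiplies the prediction by the same factor `exp(b1)`, so the absolute drop in breaths per minute shrinks as age increases.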
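The confidence interval for the coefficient $\beta_j$, which describes the relationship between $x_j$ and $\log(Y)$, is
$$\hat{\beta}_j \pm t^* SE(\hat{\beta}_j)$$
To describe the relationship between $x_j$ and $Y$ itself, exponentiate the endpoints:
$$\exp\{\hat{\beta}_j \pm t^* SE(\hat{\beta}_j)\}$$
Confidence interval for the coefficient of Age: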
We are 95% confident that for each additional month in age, the median respiratory rate is multiplied by a factor of 0.980 to 0.982 (exp{-0.02} to exp{-0.018}).
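In R, one way to obtain such an interval is to compute it on the log scale and then exponentiate both endpoints. The sketch below uses simulated stand-in data, since the respiratory data are not included here; the data frame `sim`, its coefficients, and the noise level are all assumptions for illustration.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Simulated stand-in data (NOT the respiratory data from these slides)
sim <- data.frame(Age = runif(200, min = 0.5, max = 36))
sim$Rate <- exp(3.8 - 0.02 * sim$Age + rnorm(200, sd = 0.2))

# Fit the model with a log-transformed response
fit <- lm(log(Rate) ~ Age, data = sim)

# 95% confidence intervals on the log scale, then back-transformed
ci_log <- confint(fit, level = 0.95)
ci_exp <- exp(ci_log)
ci_exp["Age", ]

# The same interval built by hand: estimate +/- t* x SE, then exponentiate
est    <- coef(summary(fit))["Age", "Estimate"]
se     <- coef(summary(fit))["Age", "Std. Error"]
t_star <- qt(0.975, df = df.residual(fit))
manual <- exp(est + c(-1, 1) * t_star * se)
manual
```

Exponentiating the endpoints is valid because `exp()` is increasing, so the order of the lower and upper bounds is preserved.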
Try a transformation on $X$ if the scatterplot shows some curvature but the variance is constant for all values of $X$.
$$\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 \log(X)$$
Intercept: When $\log(X) = 0$ (i.e., $X = 1$), $Y$ is expected to be $\hat{\beta}_0$ (i.e., the mean of $Y$ is $\hat{\beta}_0$).
Slope: When $X$ is multiplied by a factor of $C$, the mean of $Y$ is expected to change by $\hat{\beta}_1 \log(C)$ units.
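The slope interpretation follows from the property $\log(CX) = \log(C) + \log(X)$. The derivation below is a short sketch comparing the predictions at $CX$ and at $X$, using only the model stated above.

```latex
\hat{Y}(CX) - \hat{Y}(X)
  = \left[\hat{\beta}_0 + \hat{\beta}_1 \log(CX)\right] - \left[\hat{\beta}_0 + \hat{\beta}_1 \log(X)\right]
  = \hat{\beta}_1 \left[\log(C) + \log(X) - \log(X)\right]
  = \hat{\beta}_1 \log(C)
```

The change depends only on the factor $C$, not on the starting value of $X$.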
term | estimate | std.error | statistic | p.value | conf.low | conf.high |
---|---|---|---|---|---|---|
(Intercept) | 50.135 | 0.632 | 79.330 | 0 | 48.893 | 51.376 |
log_age | -5.982 | 0.263 | -22.781 | 0 | -6.498 | -5.467 |
Intercept: The expected (mean) respiratory rate for children who are 1 month old (log(1) = 0) is 50.135 breaths per minute.
Slope: If a child's age doubles, we expect their mean respiratory rate to decrease by 4.146 breaths per minute (-5.982 * log(2) = -4.146).
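The same calculation works for any multiplicative change in age. The sketch below uses the rounded slope from the table; `change_in_mean()` is a hypothetical helper introduced only for illustration, and `log()` in R is the natural logarithm.

```r
b1 <- -5.982  # rounded slope for log(Age) from the table above

# Hypothetical helper: change in mean rate when age is multiplied by C
change_in_mean <- function(C) b1 * log(C)

change_in_mean(2)    # age doubles
change_in_mean(1.1)  # age increases by 10%
change_in_mean(0.5)  # age halves
```

Doubling and halving give changes of equal size and opposite sign, since log(0.5) = -log(2).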
See Log Transformations in Linear Regression for more details about interpreting regression models with log-transformed variables.