Statistics Assignment Help : Question 1: (R/Rcmdr)
Question 1: (R/Rcmdr)
Old Faithful Geyser in Yellowstone National Park is renowned, among other things, for the regularity of its eruptions. The eruption durations (X, in minutes) and the subsequent intervals before the next eruption (Y, in minutes) are provided in a separate file.
(i) Make a scatterplot of the interval variable versus the duration variable. Describe the relationship. Is there an overall pattern? Do you see any deviation from that pattern?
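A minimal sketch of the scatterplot in R. The assignment's data file is not reproduced here, so this assumes R's built-in `faithful` data set (272 eruptions) as a stand-in, with its columns renamed to the `DURATION` and `INTERVAL` names used in the R Commander output. With the real file loaded as `Dataset`, only the `plot()` call would be needed.

```r
# Stand-in data: R's built-in `faithful` set, NOT the assignment's file.
Dataset <- data.frame(DURATION = faithful$eruptions,
                      INTERVAL = faithful$waiting)

# Response (INTERVAL) on the vertical axis, predictor (DURATION) on the horizontal.
plot(INTERVAL ~ DURATION, data = Dataset,
     xlab = "Eruption duration (minutes)",
     ylab = "Interval before next eruption (minutes)",
     main = "Old Faithful: interval versus duration")
```

When describing the plot, look at direction, form (linear or curved), strength, and any clusters or unusual points.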
(ii) Find the correlation coefficient R between interval and duration. What would happen to the value of R if both the interval and duration variables were converted from minutes to hours?
         DURATION  INTERVAL
DURATION 1.0000000 0.8584273
INTERVAL 0.8584273 1.0000000
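The effect of a change of units can be checked directly. The sketch below again assumes the built-in `faithful` data as a stand-in for the assignment's file, so the value of the correlation will differ from the one shown above; what matters is the comparison between the two scales.

```r
# Stand-in data: R's built-in `faithful` set, NOT the assignment's file.
Dataset <- data.frame(DURATION = faithful$eruptions,
                      INTERVAL = faithful$waiting)

# Correlation with both variables in minutes.
r_minutes <- cor(Dataset$DURATION, Dataset$INTERVAL)

# Correlation after converting both variables to hours.
r_hours <- cor(Dataset$DURATION / 60, Dataset$INTERVAL / 60)

# TRUE: rescaling by a positive constant leaves the correlation unchanged.
print(all.equal(r_minutes, r_hours))

# The full correlation matrix, in the same layout as the output above.
print(cor(Dataset))
```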
(iii) Identify the slope and the intercept of the regression line from the R Commander output and write the equation of the line clearly. I suggest the form:
[Name of response] = [intercept value] + [value of the slope] * [Name of predictor]
Interpret the slope in this context (if I increase blank1 by …, I expect blank2 …).
Call:
lm(formula = INTERVAL ~ DURATION, data = Dataset)

Residuals:
    Min      1Q  Median      3Q     Max
-14.644  -4.440  -1.088   4.467  15.652

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  33.8282     2.2618   14.96   <2e-16 ***
DURATION     10.7410     0.6263   17.15   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 6.683 on 105 degrees of freedom
Multiple R-squared:  0.7369, Adjusted R-squared:  0.7344
F-statistic: 294.1 on 1 and 105 DF,  p-value: < 2.2e-16
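Output of this kind can be produced outside the R Commander menus with `lm()` and `summary()`. The sketch below assumes the built-in `faithful` data as a stand-in for the assignment's file, so its estimates will not match the ones printed above. It shows where the intercept and slope sit in the fitted object and how to paste them into the suggested equation form.

```r
# Stand-in data: R's built-in `faithful` set, NOT the assignment's file.
Dataset <- data.frame(DURATION = faithful$eruptions,
                      INTERVAL = faithful$waiting)

# Same model formula as in the output above.
model <- lm(INTERVAL ~ DURATION, data = Dataset)
print(summary(model))

# The "Estimate" column: first the intercept, then the slope for DURATION.
b <- coef(model)

# Write the line as [response] = [intercept] + [slope] * [predictor].
cat(sprintf("INTERVAL = %.4f + %.4f * DURATION\n",
            b["(Intercept)"], b["DURATION"]))
```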
(iv) In simple language, what is the slope of the line telling us?
(v) Add the regression line to the scatterplot.
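One way to do this in base R is to draw the scatterplot and then call `abline()` on the fitted model. This is a sketch that assumes the built-in `faithful` data as a stand-in for the assignment's file; the colour and line width are arbitrary choices.

```r
# Stand-in data: R's built-in `faithful` set, NOT the assignment's file.
Dataset <- data.frame(DURATION = faithful$eruptions,
                      INTERVAL = faithful$waiting)

model <- lm(INTERVAL ~ DURATION, data = Dataset)

# Scatterplot first, then the fitted least-squares line on top of it.
plot(INTERVAL ~ DURATION, data = Dataset,
     xlab = "Eruption duration (minutes)",
     ylab = "Interval before next eruption (minutes)")
abline(model, col = "red", lwd = 2)
```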
(vi) Find the percent of variation in the interval variable that is explained by the model. Does the regression model provide a good fit?
To find the percent of variation explained:
o It is the square of the correlation coefficient R found in (ii), which also appears as "Multiple R-squared" in the regression output from (iii). Read it as a percentage (consult the examples in the notes).
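The arithmetic can be checked with the values already reported above: the correlation 0.8584273 and the Multiple R-squared 0.7369. The variable names below are only illustrative.

```r
# Correlation between INTERVAL and DURATION, as reported in the output above.
r <- 0.8584273

# Squaring it gives the proportion of variation explained by the model.
r_squared <- r^2
print(round(r_squared, 4))       # 0.7369, matching Multiple R-squared

# Expressed as a percentage.
percent_explained <- 100 * r_squared
print(round(percent_explained, 2))   # 73.69
```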
(vii) Make a residual plot from the linear regression model you constructed above. Discuss the appropriateness of the model.
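A residual plot can be sketched by plotting the residuals against the fitted values (plotting them against `DURATION` gives the same picture for a one-predictor model). The code assumes the built-in `faithful` data as a stand-in for the assignment's file.

```r
# Stand-in data: R's built-in `faithful` set, NOT the assignment's file.
Dataset <- data.frame(DURATION = faithful$eruptions,
                      INTERVAL = faithful$waiting)

model <- lm(INTERVAL ~ DURATION, data = Dataset)

# Residuals versus fitted values, with a reference line at zero.
plot(fitted(model), resid(model),
     xlab = "Fitted interval (minutes)",
     ylab = "Residual (minutes)",
     main = "Residual plot")
abline(h = 0, lty = 2)
```

A linear model is reasonable if the points scatter randomly around the zero line with roughly constant spread; a curve, a funnel shape, or distinct clusters suggests otherwise.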
(viii) Use the equation of the regression line to predict the subsequent interval before the next eruption for an eruption that lasted 5 minutes. How confident are you in the accuracy of your prediction?
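The prediction itself is a substitution into the fitted line, using the intercept and slope from the output above. The rough range at the end is only an approximation that I am assuming for illustration: it takes plus or minus two residual standard errors and ignores the uncertainty in the estimated coefficients. It is also worth checking whether 5 minutes lies inside the range of durations in the data.

```r
# Estimates taken from the regression output above.
intercept <- 33.8282
slope     <- 10.7410
rse       <- 6.683    # residual standard error

# Predicted interval after an eruption lasting 5 minutes.
duration  <- 5
predicted <- intercept + slope * duration
print(predicted)      # 87.5332 minutes

# Rough range: prediction plus or minus 2 residual standard errors (approximation).
rough_range <- c(lower = predicted - 2 * rse, upper = predicted + 2 * rse)
print(rough_range)
```

With the fitted model object available, `predict(model, newdata = data.frame(DURATION = 5), interval = "prediction")` gives a proper prediction interval.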