Lab 04: Back to alumni jobs!

due Sun, Feb 21 at 11:59p EST

In this lab, you’ll build upon last week’s analysis of the relationship between the median early career salary and the percent of alumni who perceive their job as making the world a better place for colleges and universities in the United States. You will also do a short exercise about resolving merge conflicts and fill out a team agreement.

Learning goals

By the end of the lab you will be able to…

Getting started

Clone assignment repo + start new project

A repository has already been created for you and your teammates. Everyone in your team has access to the same repo.

Workflow: Using git and GitHub as a team

Assign each person on your team a number 1 through 4. For teams of three, Member 1 can take on the role of Member 4.

The following exercises must be done in order. Only one person should type in the .Rmd file and push updates at a time. When it is not your turn to type, you should still share ideas and contribute to the team’s discussion.

Merge Conflicts (uh oh)

You may have seen this already through the course of your collaboration in Lab 03. When two collaborators make changes to a file and push the file to their repository, git merges these two files.

If these two files have conflicting content on the same line, git will produce a merge conflict. Merge conflicts need to be resolved manually because they require human intervention.

To resolve the merge conflict, decide if you want to keep only your text, the text on GitHub, or incorporate changes from both texts. Delete the conflict markers <<<<<<<, =======, >>>>>>> and make the changes you want in the final merge.
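To make the conflict markers concrete, here is a minimal sketch in base R. The file contents, team names, and the commit id after `>>>>>>>` are hypothetical; your own conflicted file will look different. The sketch writes a small conflicted file, finds the marker lines, and then checks that a resolved version has none left.

```r
# Hypothetical contents of a conflicted file (not from the lab repo)
conflicted <- c(
  "<<<<<<< HEAD",
  "team: Team A",      # your local version
  "=======",
  "team: Team B",      # the version pulled from GitHub
  ">>>>>>> 1a2b3c4",   # hypothetical commit id
  "Some narrative."
)
path <- tempfile(fileext = ".Rmd")
writeLines(conflicted, path)

# Flag lines that start with one of the three conflict markers
marker_pattern <- "^(<<<<<<<|=======|>>>>>>>)"
lines <- readLines(path)
marker_lines <- which(grepl(marker_pattern, lines))
marker_lines

# Resolving means keeping the text you want and deleting the marker lines
resolved <- c("team: Team A", "Some narrative.")
writeLines(resolved, path)
markers_left <- any(grepl(marker_pattern, readLines(path)))
markers_left
```

A quick check like this before committing can help confirm that no marker lines were left in the document by accident.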

Assign numbers 1, 2, 3, and 4 to each of your team members (if only 3 team members, just number 1 through 3). Go through the following steps in detail, which simulate a merge conflict. Completing this exercise will be part of the lab grade.

Resolving a merge conflict

Step 1: Everyone clone the lab-04- assignment repo in RStudio and open file merge-conflict.Rmd. Configure git if you haven’t already done so:

library(usethis)
use_git_config(user.name="your github username", 
               user.email="your email")

Member 4 should look at the group’s repo on GitHub.com to ensure that the other members’ files are pushed to GitHub after every step.

Step 2: Member 1 Change the team name to your team name. Knit, commit, and push.

Step 3: Member 2 Change the team name to something different (i.e., not your team name). Knit, commit, and push.

You should get an error.

Pull and review the document with the merge conflict. Read the error to your teammates. You can also show them the error by sharing your screen. A merge conflict occurred because you edited the same part of the document as Member 1. Resolve the conflict with whichever name you want to keep, then knit, commit and push again.

Step 4: Member 3 Write some narrative in the space provided. Knit, commit, and push. You should get an error.

This time, no merge conflicts should occur, since you edited a different part of the document from Members 1 and 2. Read the error to your teammates. You can also show them the error by sharing your screen.

Click to pull. Then, knit, commit, and push.

Please ask your TA if you have any questions about merge conflicts and collaborating in GitHub.

Team agreement

As you may have noticed from the merge conflict exercise, it is important to have good group communication. The purpose of the team agreement is to help you plan how you will work on assignments and communicate as a group outside of the lab sessions.

Packages + data

Packages

We’ll use the following packages in this lab.

library(tidyverse)
library(knitr)
library(broom)
library(patchwork)
# add more packages as needed

Data: Alumni jobs

Today’s data set is part of the TidyTuesday College tuition, diversity, and pay data.

The information in this data set was collected from the PayScale College Salary Report.

| variable | class | description |
|---|---|---|
| rank | double | Potential salary rank within state |
| name | character | Name of school |
| state_name | character | State name |
| early_career_pay | double | Median salary for alumni with 0 - 5 years experience (in US dollars) |
| mid_career_pay | double | Median salary for alumni with 10+ years experience (in US dollars) |
| make_world_better_percent | double | Percent of alumni who think they are making the world a better place |
| stem_percent | double | Percent of degrees awarded in science, technology, engineering, or math subjects |

alumni <- read_csv("data/alumni-salaries.csv")

Exercises

Is there a relationship between the typical early career pay for alumni and the percent of alumni who perceive their job as making the world a better place? To answer this question, we will build upon the analysis from last week’s lab as we use regression to predict the early career pay using the percent of alumni who perceive their job as making the world a better place.

Some observations have missing values for make_world_better_percent. Before you get started, filter the alumni data frame so you only have observations that have data for both make_world_better_percent and early_career_pay.
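As a reminder of how to drop rows with missing values, here is a minimal sketch using `filter()` and `is.na()` from dplyr. The `toy` data frame and its values are made up for illustration and are not the alumni data; only the two column names are borrowed from the data dictionary.

```r
library(dplyr)

# Hypothetical toy data; not the real alumni data
toy <- tibble(
  name = c("School A", "School B", "School C", "School D"),
  early_career_pay = c(50000, 62000, NA, 48000),
  make_world_better_percent = c(55, NA, 60, 48)
)

# Keep only rows where both variables are observed
toy_complete <- toy %>%
  filter(
    !is.na(make_world_better_percent),
    !is.na(early_career_pay)
  )

nrow(toy_complete)  # 2 rows remain in this toy example
```

It is worth comparing the number of rows before and after filtering so you know how many observations were removed.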

Team Member 1: Type the team’s responses to exercises 1 - 2.

  1. Fit a linear model that can be used to predict the median early career pay based on the percent who perceive their job as making the world a better place. Display the model output using 3 digits for numerical values.

  2. Calculate the predicted values and residuals from your model and save these results in a data frame. Print the first five rows of the new data frame.
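The general pattern for these two steps can be sketched as follows, assuming the broom and knitr packages loaded earlier. The `toy` data and the object names `toy_model` and `toy_aug` are hypothetical; this is one possible approach on made-up numbers, not the expected answer for the alumni data.

```r
library(dplyr)
library(broom)
library(knitr)

# Hypothetical toy data; not the real alumni data
toy <- tibble(
  make_world_better_percent = c(40, 45, 50, 55, 60, 65),
  early_career_pay = c(60000, 57000, 55000, 52000, 51000, 47000)
)

# Fit a simple linear regression model
toy_model <- lm(early_career_pay ~ make_world_better_percent, data = toy)

# Display the coefficient table with 3 digits
tidy(toy_model) %>%
  kable(digits = 3)

# augment() returns the data with predicted values (.fitted)
# and residuals (.resid) added as columns
toy_aug <- augment(toy_model)

# Print the first five rows
toy_aug %>%
  slice(1:5)
```

Saving the augmented data frame under its own name makes it easy to reuse the residuals later when checking model conditions.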

✅ ⬆️ Team Member 1: Knit, commit and push your changes to GitHub with an appropriate commit message again. Make sure to commit and push all changed files so that your Git pane is cleared up afterwards.

All other team members: Pull to get the updated documents from GitHub. Click on the .Rmd file, and you should see the responses to exercises 1 - 2.

Team Member 2: It’s your turn! Type the team’s response to exercise 3.

  3. Before using the model for prediction, let’s check the model conditions. For each condition (linearity, constant variance, normality, independence), indicate whether it is satisfied and briefly explain your reasoning. Show any plots, tables, and/or calculations used to support your reasoning.
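Two plots commonly used for this kind of check are a residuals versus predicted values plot and a normal quantile plot of the residuals. The sketch below builds both with ggplot2 and places them side by side with patchwork. The `toy` data and object names are hypothetical, and these plots are only one way to support your reasoning; independence usually has to be argued from how the data were collected.

```r
library(ggplot2)
library(broom)
library(patchwork)

# Hypothetical toy data; not the real alumni data
toy <- data.frame(
  make_world_better_percent = c(40, 45, 50, 55, 60, 65),
  early_career_pay = c(60000, 57000, 55000, 52000, 51000, 47000)
)
toy_model <- lm(early_career_pay ~ make_world_better_percent, data = toy)
toy_aug <- augment(toy_model)

# Residuals vs. predicted: look for curvature (linearity)
# and changing vertical spread (constant variance)
resid_fitted <- ggplot(toy_aug, aes(x = .fitted, y = .resid)) +
  geom_point() +
  geom_hline(yintercept = 0, color = "red", linetype = "dashed") +
  labs(x = "Predicted", y = "Residual", title = "Residuals vs. predicted")

# Normal quantile plot: points near the line suggest
# approximately normal residuals
resid_qq <- ggplot(toy_aug, aes(sample = .resid)) +
  stat_qq() +
  stat_qq_line() +
  labs(x = "Theoretical quantile", y = "Residual", title = "Normal QQ plot")

# patchwork combines the two plots side by side
resid_fitted + resid_qq
```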

✅ ⬆️ Team Member 2: Knit, commit and push your changes to GitHub with an appropriate commit message again. Make sure to commit and push all changed files so that your Git pane is cleared up afterwards.

All other team members: Pull to get the updated documents from GitHub. Click on the .Rmd file, and you should see the response to exercise 3.

Team Member 3: It’s your turn! Type the team’s response to exercises 4 - 5.

  4. Sixty percent of the alumni at the Colorado School of Mines said their job makes the world a better place.
  5. Next, let’s consider how well the model fits the relationship between the early career pay and percent who perceive their job as making the world a better place. Calculate \(R^2\) and interpret it in the context of the data.
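As a rough sketch of the mechanics, the code below obtains a predicted value at a given percent with `predict()` and extracts \(R^2\) with `glance()` from broom. The `toy` data and object names are hypothetical, and the value 60 is used only because it matches the percent mentioned above; the resulting numbers say nothing about any real school.

```r
library(broom)

# Hypothetical toy data; not the real alumni data
toy <- data.frame(
  make_world_better_percent = c(40, 45, 50, 55, 60, 65),
  early_career_pay = c(60000, 57000, 55000, 52000, 51000, 47000)
)
toy_model <- lm(early_career_pay ~ make_world_better_percent, data = toy)

# Predicted early career pay when 60 percent of alumni say
# their job makes the world a better place
new_school <- data.frame(make_world_better_percent = 60)
predicted_pay <- predict(toy_model, newdata = new_school)
predicted_pay

# R-squared: proportion of variability in early career pay
# explained by the model
r_sq <- glance(toy_model)$r.squared
r_sq
```

In simple linear regression, \(R^2\) equals the squared correlation between the predictor and the response, which gives a quick way to sanity-check the value.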

✅ ⬆️ Team Member 3: Knit, commit and push your changes to GitHub with an appropriate commit message again. Make sure to commit and push all changed files so that your Git pane is cleared up afterwards.

All other team members: Pull to get the updated documents from GitHub. Click on the .Rmd file, and you should see the responses to exercises 4 - 5.

Team Member 4: It’s your turn! Type the team’s response to exercise 6.

  6. Do you think the model is useful for understanding and predicting the typical early career pay for alumni at a university? Briefly explain your reasoning.

✅ ⬆️ Team Member 4: Knit, commit and push your changes to GitHub with an appropriate commit message again. Make sure to commit and push all changed files so that your Git pane is cleared up afterwards.

All other team members: Pull to get the updated documents from GitHub. Click on the .Rmd file, and you should see the team’s completed lab!

Wrapping up

Go back through your write-up to make sure you followed the coding style guidelines we discussed in class (e.g., no long lines of code).

Team Member 3: Make any edits as needed. Then knit, commit, and push the updated documents to GitHub if you made any changes.

All other team members can click to pull the finalized document.

Submission

Team Member 4: Upload the team’s PDF to Gradescope. Be sure to include every team member’s name in the Gradescope submission. Associate the “Overall” graded section with the first page of your PDF, and mark where each answer is to the exercises. If any answer spans multiple pages, then mark all pages.

There should only be one submission per team on Gradescope.




Merge conflict notes and exercise from Data Science in a Box.