In this lab, you’ll build upon last week’s analysis of the relationship between median early career salary and the percent of alumni who perceive their job as making the world a better place, for colleges and universities in the United States. You will also do a short exercise on resolving merge conflicts and fill out a team agreement.
By the end of the lab you will be able to…
A repository has already been created for you and your teammates. Everyone in your team has access to the same repo.
Go to the course organization on GitHub.
In addition to your private individual repositories, you should now see a repo named lab-04-. Go to that repository.
Each person on the team should clone the repository and open a new project in RStudio. Do not make any changes to the .Rmd file until the instructions tell you to do so.
Assign each person on your team a number 1 through 4. For teams of three, Member 1 can take on the role of Member 4.
The following exercises must be done in order. Only one person should type in the .Rmd file and push updates at a time. When it is not your turn to type, you should still share ideas and contribute to the team’s discussion.
You may have already seen merge conflicts in the course of your collaboration in Lab 03. When two collaborators make changes to a file and push the file to their repository, git merges these two files.
If these two files have conflicting content on the same line, git will produce a merge conflict. Merge conflicts need to be resolved manually, because git cannot decide which version to keep without human intervention.
To resolve the merge conflict, decide whether you want to keep only your text, only the text on GitHub, or a combination of both. Delete the conflict markers <<<<<<<, =======, and >>>>>>>, then make the changes you want in the final merge.
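The resolution steps above can be sketched in R. This is only an illustration under stated assumptions: the file contents, the team names, and the commit hash `1a2b3c4` are made up, and `has_conflict_markers()` is a hypothetical helper, not part of git or RStudio. In a conflict produced by pulling, the text between `<<<<<<< HEAD` and `=======` is your local version, and the text between `=======` and `>>>>>>>` is the version from GitHub.

```r
# Hypothetical contents of a conflicted file, one element per line
conflicted <- c(
  "title: \"Merge conflict exercise\"",
  "<<<<<<< HEAD",
  "team: \"Team A\"",      # your local version
  "=======",
  "team: \"Team B\"",      # the version pulled from GitHub
  ">>>>>>> 1a2b3c4",
  "Some narrative."
)

# Hypothetical helper: TRUE if any line looks like a git conflict marker
has_conflict_markers <- function(lines) {
  any(grepl("^(<{7} |={7}$|>{7} )", lines))
}

# Resolve by keeping one version and deleting all three marker lines
resolved <- c(conflicted[1], "team: \"Team A\"", conflicted[7])

has_conflict_markers(conflicted)
has_conflict_markers(resolved)
```

A check like this can be handy before knitting, since leftover markers in an .Rmd file usually cause the knit to fail or end up in the rendered output.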
Assign numbers 1, 2, 3, and 4 to each of your team members (if only 3 team members, just number 1 through 3). Go through the following steps in detail, which simulate a merge conflict. Completing this exercise will be part of the lab grade.
Step 1: Everyone clone the lab-04- assignment repo in RStudio and open file merge-conflict.Rmd. Configure git if you haven’t already done so:
library(usethis)
use_git_config(
  user.name = "your github username",
  user.email = "your email"
)
Member 4 should look at the group’s repo on GitHub.com to ensure that the other members’ files are pushed to GitHub after every step.
Step 2 (Member 1): Change the team name to your team name. Knit, commit, and push.
Step 3 (Member 2): Change the team name to something different (i.e., not your team name). Knit, commit, and push.
You should get an error.
Pull and review the document with the merge conflict. Read the error to your teammates. You can also show them the error by sharing your screen. A merge conflict occurred because you edited the same part of the document as Member 1. Resolve the conflict with whichever name you want to keep, then knit, commit and push again.
Step 4 (Member 3): Write some narrative in the space provided. Knit, commit, and push. You should get an error.
This time, no merge conflicts should occur, since you edited a different part of the document from Members 1 and 2. Read the error to your teammates. You can also show them the error by sharing your screen.
Click to pull. Then, knit, commit, and push.
Please ask your TA if you have any questions about merge conflicts and collaborating in GitHub.
As you may have noticed from the merge conflict exercise, it is important to have good group communication. The purpose of the team agreement is to help you plan how you will work on assignments and communicate as a group outside of the lab sessions.
You can find the team agreement in the team-agreement file in your repo. Take a few minutes to discuss the items in the agreement.
Select one person from the team to type the group’s responses to the items in the team agreement.
Push the completed agreement to your GitHub repo. Each team member can refer to the document in this repo or download the PDF of the agreement for future reference.
We’ll use the following packages in this lab.
library(tidyverse)
library(knitr)
library(broom)
library(patchwork)
# add more packages as needed
Today’s data set is part of the TidyTuesday College tuition, diversity, and pay data.
The information in this data set was collected from the PayScale College Salary Report.
variable | class | description |
---|---|---|
rank | double | Potential salary rank within state |
name | character | Name of school |
state_name | character | State name |
early_career_pay | double | Median salary for alumni with 0 - 5 years experience (in US dollars) |
mid_career_pay | double | Median salary for alumni with 10+ years experience (in US dollars) |
make_world_better_percent | double | Percent of alumni who think they are making the world a better place |
stem_percent | double | Percent of degrees awarded in science, technology, engineering, or math subjects |
alumni <- read_csv("data/alumni-salaries.csv")
Is there a relationship between the typical early career pay for alumni and the percent of alumni who perceive their job as making the world a better place? To answer this question, we will build upon the analysis from last week’s lab and use regression to predict the early career pay from the percent of alumni who perceive their job as making the world a better place.
Some observations have missing values for make_world_better_percent. Before you get started, filter the alumni data frame so you only have observations that have data for both make_world_better_percent and early_career_pay.
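One way to do this kind of filtering is sketched below on a small made-up data frame, not the lab data. The column names match the data dictionary above, but `toy_alumni`, `toy_complete`, and all values are assumptions for illustration only. The sketch uses `filter()` from dplyr (part of the tidyverse) together with `is.na()`.

```r
library(dplyr)

# Toy data with the same column names as the lab data; values are made up
toy_alumni <- data.frame(
  name = c("School A", "School B", "School C", "School D"),
  early_career_pay = c(50000, 62000, NA, 48000),
  make_world_better_percent = c(55, NA, 47, 60)
)

# Keep only rows where both variables are observed
toy_complete <- filter(
  toy_alumni,
  !is.na(make_world_better_percent),
  !is.na(early_career_pay)
)

nrow(toy_complete)  # 2 rows remain: School A and School D
```

Listing both conditions explicitly makes it clear which variables drive the row removal, which is easier to justify in a write-up than dropping rows with any missing value.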
Team Member 1: Type the team’s responses to exercises 1 - 2.
Fit a linear model that can be used to predict the median early career pay based on the percent who perceive their job as making the world a better place. Display the model output using 3 digits for numerical values.
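As a rough sketch of the workflow this exercise describes, the block below fits a simple linear regression on made-up data and displays the coefficient table rounded to 3 digits. The data frame `toy` and the object `toy_fit` are hypothetical names, and the values are invented, so the estimates say nothing about the real alumni data. It assumes the broom and knitr packages loaded earlier in the lab.

```r
library(broom)
library(knitr)

# Made-up data using the lab's column names
toy <- data.frame(
  make_world_better_percent = c(40, 45, 50, 55, 60, 65),
  early_career_pay = c(58000, 56500, 54000, 52500, 51000, 48500)
)

# Response on the left of ~, predictor on the right
toy_fit <- lm(early_career_pay ~ make_world_better_percent, data = toy)

# tidy() returns one row per term; kable() formats it for the knitted document
coefs <- tidy(toy_fit)
kable(coefs, digits = 3)
```

The `digits` argument only affects how the table is displayed; the values stored in `coefs` keep their full precision.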
Calculate the predicted values and residuals from your model and save these results in a data frame. Print the first five rows of the new data frame.
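One possible approach, sketched here on made-up data, is `augment()` from broom, which returns the original observations alongside the fitted values (`.fitted`) and residuals (`.resid`). The names `toy`, `toy_fit`, and `toy_aug` are hypothetical and the values are invented for illustration.

```r
library(broom)

# Made-up data using the lab's column names
toy <- data.frame(
  make_world_better_percent = c(40, 45, 50, 55, 60, 65),
  early_career_pay = c(58000, 56500, 54000, 52500, 51000, 48500)
)

toy_fit <- lm(early_career_pay ~ make_world_better_percent, data = toy)

# One row per observation, with .fitted and .resid columns added
toy_aug <- augment(toy_fit)

# Print the first five rows
head(toy_aug, 5)
```

Each residual is the observed value minus the predicted value, so a positive residual means the model underpredicted the pay for that school.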
✅ ⬆️ Team Member 1: Knit, commit and push your changes to GitHub with an appropriate commit message again. Make sure to commit and push all changed files so that your Git pane is cleared up afterwards.
All other team members: Pull to get the updated documents from GitHub. Click on the .Rmd file, and you should see the responses to exercises 1 - 2.
Team Member 2: It’s your turn! Type the team’s response to exercise 3.
✅ ⬆️ Team Member 2: Knit, commit and push your changes to GitHub with an appropriate commit message again. Make sure to commit and push all changed files so that your Git pane is cleared up afterwards.
All other team members: Pull to get the updated documents from GitHub. Click on the .Rmd file, and you should see the response to exercise 3.
Team Member 3: It’s your turn! Type the team’s response to exercises 4 - 5.
✅ ⬆️ Team Member 3: Knit, commit and push your changes to GitHub with an appropriate commit message again. Make sure to commit and push all changed files so that your Git pane is cleared up afterwards.
All other team members: Pull to get the updated documents from GitHub. Click on the .Rmd file, and you should see the responses to exercises 4 - 5.
Team Member 4: It’s your turn! Type the team’s response to exercise 6.
✅ ⬆️ Team Member 4: Knit, commit and push your changes to GitHub with an appropriate commit message again. Make sure to commit and push all changed files so that your Git pane is cleared up afterwards.
All other team members: Pull to get the updated documents from GitHub. Click on the .Rmd file, and you should see the team’s completed lab!
Go back through your write-up to make sure you followed the coding style guidelines we discussed in class (e.g., no long lines of code).
Team Member 3: Make any edits as needed. Then knit, commit, and push the updated documents to GitHub if you made any changes.
All other team members can click to pull the finalized document.
Team Member 4: Upload the team’s PDF to Gradescope. Be sure to include every team member’s name in the Gradescope submission. Associate the “Overall” graded section with the first page of your PDF, and mark where the answer to each exercise is. If any answer spans multiple pages, then mark all pages.
There should only be one submission per team on Gradescope.
Merge conflict notes and exercise from Data Science in a Box.