Overview

You have learned a lot about data wrangling with the tidyverse!

The goal of this homework is to solidify your data wrangling skills by working some realistic problems in the grey area between data aggregation and data reshaping.

If you internalize that there are multiple solutions to most problems, you will spend less time banging your head against the wall in data analysis. If something’s really hard, sneak up on it from a different angle.

Due anytime on Tuesday, October 11th, 2016.

Choose your own adventure

Pick one of the data reshaping prompts and do it.

Pick a join prompt and do it.

It is fine to work with a new dataset and/or create variations on these problem themes.

General data reshaping and relationship to aggregation

Problem: You have data in one “shape” but you wish it were in another. Usually this is because the alternative shape is superior for presenting a table, making a figure, or doing aggregation and statistical analysis.

Solution: Reshape your data. For simple reshaping, gather() and spread() from tidyr will suffice. Then do the thing that is possible or easier now that your data has a new shape.
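As a minimal sketch of that round trip, assume a made-up wide table with one row per country and one column per year. The data frame, its column names (y2002, y2007, score) and its values are hypothetical and exist only for illustration. gather() makes the table long, aggregation becomes a plain group_by() plus summarize(), and spread() restores the wide shape.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data (invented values): one row per country,
# one column per year
wide <- data.frame(
  country = c("A", "B"),
  y2002 = c(70, 60),
  y2007 = c(72, 63),
  stringsAsFactors = FALSE
)

# gather(): wide -> long, giving one row per country-year pair
long <- gather(wide, key = "year", value = "score", y2002, y2007)

# In the long shape, aggregating by year is a simple grouped summary
means <- long %>%
  group_by(year) %>%
  summarize(mean_score = mean(score))

# spread(): long -> wide again, one column per year
wide_again <- spread(long, key = year, value = score)
```

The long shape is usually the one to aim for before plotting or summarizing; the wide shape tends to be better for a table meant for human eyes.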

Prompts:

Activity #1

Activity #2

Activity #3

Activity #4

Activity #5

Join, merge, look up

Problem: You have two data sources and you need info from both in one new data object.

Solution: Perform a join, which borrows terminology from the database world, specifically SQL.
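A minimal sketch of a join used as a lookup, assuming dplyr and two invented data frames. The names measurements and lookup, their columns, and their values are all hypothetical. left_join() keeps every row of the first table and pulls in matching columns from the second, leaving NA where no match exists.

```r
library(dplyr)

# Hypothetical primary table (invented values)
measurements <- data.frame(
  country = c("A", "B", "C"),
  score = c(70, 60, 55),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table; note that it has no row for country "C"
lookup <- data.frame(
  country = c("A", "B"),
  capital = c("Alphaville", "Betatown"),
  stringsAsFactors = FALSE
)

# Keep all rows of measurements; add capital where country matches
combined <- left_join(measurements, lookup, by = "country")
```

Here combined has the same three rows as measurements, and the capital for country "C" is NA because the lookup table has no entry for it.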

Prompts:

Activity #1

Activity #2

You will likely need to iterate between your data prep and your joining to make your explorations comprehensive and interesting. For example, you will want a specific amount (or lack) of overlap between the two data.frames, in order to demonstrate all the different joins. You will want both the data frames to be as small as possible, while still retaining the expository value.
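One way to get that deliberate partial overlap is sketched below, assuming dplyr and two tiny invented data frames, x and y, that share the keys 2 and 3 but each hold one key the other lacks. All names and values are hypothetical. Comparing row counts across the join functions shows how each one treats the unmatched rows.

```r
library(dplyr)

# Two hypothetical three-row tables with partial overlap on id:
# ids 2 and 3 appear in both, id 1 only in x, id 4 only in y
x <- data.frame(id = c(1, 2, 3), x_val = c("a", "b", "c"),
                stringsAsFactors = FALSE)
y <- data.frame(id = c(2, 3, 4), y_val = c("B", "C", "D"),
                stringsAsFactors = FALSE)

# Run each dplyr join on the same pair of tables
joins <- list(
  inner = inner_join(x, y, by = "id"),
  left  = left_join(x, y, by = "id"),
  right = right_join(x, y, by = "id"),
  full  = full_join(x, y, by = "id"),
  semi  = semi_join(x, y, by = "id"),
  anti  = anti_join(x, y, by = "id")
)

# Number of rows each join returns
row_counts <- vapply(joins, nrow, integer(1))
print(row_counts)
```

With this overlap the six joins return 2, 3, 3, 4, 2 and 1 rows respectively, so every join gives a visibly different result even though each input has only three rows.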

Activity #3