This is a lesson on tidying data, specifically on what to do when a conceptual variable is spread out over two or more variables in a data frame.
Data used: words spoken by characters of different races and genders in the Lord of the Rings movie trilogy.
tidyr package. Includes references, resources, and exercises.
tidy-data sub-directory of the Data Carpentry
tidyr package (only true dependency)
ggplot2 is used for illustration but is not mission critical
reshape2 are used in the bonus content
curl if you execute the code to grab the Lord of the Rings data used in examples from GitHub. Note that the files are also included in the
datacarpentry/data/tidy-data directory, so data download is avoidable.
xtable if you want to compile the