Due Friday 2016-November-18.
If you don’t feel like dreaming up your own thing, here’s a Gapminder blueprint that is a minimal but respectable way to complete the assignment. You are welcome to remix R code already written by someone, student or JB, in this class, but credit/link appropriately, e.g. in comments.
JB has provided a template, using a different dataset, 01_justR, that should help make this concrete.
Download the raw data for our example, gapminder.tsv.
Option 1: via an R script using download.file
Option 2: in a shell script using curl or wget
curl -o gapminder.tsv https://raw.githubusercontent.com/jennybc/gapminder/master/inst/gapminder.tsv
wget https://raw.githubusercontent.com/jennybc/gapminder/master/inst/gapminder.tsv
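For Option 1, a minimal sketch of what the download script might contain. The helper name `fetch_gapminder` and the skip-if-already-present behaviour are assumptions made for illustration; only `download.file` and the URL come from the instructions above.

```r
# URL as given in the assignment text.
gapminder_url <- "https://raw.githubusercontent.com/jennybc/gapminder/master/inst/gapminder.tsv"

# Hypothetical helper: download gapminder.tsv unless it is already on disk.
# Returns FALSE (invisibly) if the download was skipped, TRUE if it succeeded.
fetch_gapminder <- function(destfile = "gapminder.tsv", url = gapminder_url) {
  if (file.exists(destfile)) {
    message("Found ", destfile, "; skipping download.")
    return(invisible(FALSE))
  }
  # download.file() returns 0 on success.
  status <- download.file(url, destfile = destfile, quiet = TRUE)
  invisible(status == 0)
}
```

Calling `fetch_gapminder()` as the only line of work in your first script keeps re-runs cheap; your clean-up script can delete `gapminder.tsv` when you want to test from scratch.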
broom may be useful here.
Create a figure for each continent, and write one file per continent, with an informative name. The figure should give scatterplots of life expectancy vs. year, faceting on country, fitted line overlaid.
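One way this figure step could be sketched with ggplot2. The helper name `plot_continent`, its arguments, the output file naming scheme, and the figure dimensions are all assumptions for illustration, not a required design.

```r
library(ggplot2)

# Hypothetical helper: write one figure for a single continent.
# Scatterplot of lifeExp vs. year, one facet per country, linear fit overlaid.
# `ext` picks the file type; ggsave() infers the graphics device from it.
plot_continent <- function(gap, cont, out_dir = ".", ext = "png") {
  dat <- gap[gap$continent == cont, ]
  p <- ggplot(dat, aes(x = year, y = lifeExp)) +
    geom_point() +
    geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
    facet_wrap(~ country) +
    ggtitle(paste("Life expectancy vs. year:", cont))
  out_file <- file.path(out_dir, paste0("lifeExp_vs_year_", cont, ".", ext))
  ggsave(out_file, plot = p, width = 10, height = 8)
  out_file
}
```

Typical use would be `gap <- read.delim("gapminder.tsv")` followed by `for (cont in unique(as.character(gap$continent))) plot_continent(gap, cont)`, which leaves one informatively named file per continent.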
Write a master R script that simply source()s the three scripts, one after the other. Tip: you may want a second “clean up / reset” script that deletes all the output your scripts leave behind, so you can easily test and refine your strategy, i.e. without repeatedly deleting stuff “by hand”. You can run the master script or the cleaning script from a shell with Rscript.
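A minimal sketch of the master and clean-up logic, written as two small functions. The script file names, the function names, and the output file pattern are hypothetical; substitute whatever your own scripts are called and whatever they write.

```r
# Hypothetical file names; replace with the names of your own three scripts.
pipeline_scripts <- c("00_download-data.R", "01_analyze.R", "02_make-figures.R")

# Master step: source() each script in order, stopping at the first error.
# Each script runs in a fresh environment, so scripts talk via files only.
run_pipeline <- function(scripts = pipeline_scripts) {
  for (s in scripts) {
    message("Running ", s)
    source(s, local = new.env())
  }
  invisible(scripts)
}

# Clean-up step: delete files in `dir` whose names match `pattern`.
# The default (assumed) pattern covers .tsv and .png outputs only; make sure
# your pattern cannot match files you want to keep, such as README.md.
clean_outputs <- function(dir = ".", pattern = "\\.(tsv|png)$") {
  doomed <- list.files(dir, pattern = pattern, full.names = TRUE)
  file.remove(doomed)
  invisible(doomed)
}
```

The master script then reduces to a call to `run_pipeline()` and the reset script to a call to `clean_outputs()`, so a full "wipe and rebuild" test is two commands.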
Render your R Markdown report, generating Markdown and HTML, using
Rscript -e "rmarkdown::render('myAwesomeReport.rmd')"
The same approach renders an R script directly:
Rscript -e "rmarkdown::render('myAwesomeScript.R')"
Provide a link to a README.md page that explains how your pipeline works and links to the remaining files. Your peers and the TAs should be able to go to this landing page and re-run your analysis quickly and easily.
Consider including an image showing a graphical view (the dependency diagram) of your pipeline using makefile2graph. On Mac or Linux you can install makefile2graph using Homebrew or Linuxbrew with the command brew install makefile2graph.
Follow the basic Gapminder blueprint above, but find a different data aggregation task, different panelling/faceting emphasis, focus on different variables, etc.
Use non-Gapminder data – like maybe the candy survey?
This means you’ll need to spend more time on data cleaning and sanity checking. You will probably have an entire script (or more!) devoted to data prep. Examples:
Experiment with running R code saved in a script from within R Markdown. Here’s some official documentation on code externalization.
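A small sketch of code externalization with `knitr::read_chunk()`. The script contents and the chunk label `summarize-years` are made up for illustration; in your pipeline the script would be a real file in your repo rather than a temporary one.

```r
library(knitr)

# A hypothetical analysis script containing one labelled chunk.
# The "## ---- label ----" comment line marks where the chunk starts.
script <- tempfile(fileext = ".R")
writeLines(c(
  "## ---- summarize-years ----",
  "years <- c(1952, 1957, 1962)",
  "range(years)"
), script)

# In the R Markdown document, call read_chunk() in an early chunk ...
read_chunk(script)

# ... after which an empty chunk labelled `summarize-years` would run the
# code from the script. The registered code can also be inspected directly:
chunk_code <- knit_code$get("summarize-years")
print(chunk_code)
```

The payoff is that the same script can be run on its own (for example via `source()`) and also be reported on chunk by chunk in the R Markdown document, without copy-pasting code between the two.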
Embed pre-existing figures in an R Markdown document, i.e. an R script creates the figures, then the report incorporates them. General advice on writing figures to file is here. See an example of this in an R Markdown file in one of the examples.
Import pre-existing data in an R Markdown document, then format nicely as a table.
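For example, a report chunk might read a results file written by an earlier script and pass it to `knitr::kable()`. In this sketch the file name, column names, and numbers are all made up; the `write.table()` call only stands in for output your own pipeline would already have produced.

```r
library(knitr)

# Stand-in for a file written by an earlier script (toy, made-up values).
results_file <- file.path(tempdir(), "toy_fits.tsv")
write.table(
  data.frame(country = c("A", "B"), slope = c(0.123, 0.456)),
  results_file, sep = "\t", quote = FALSE, row.names = FALSE
)

# What the chunk in the R Markdown report would do: import, then format.
fits <- read.delim(results_file, stringsAsFactors = FALSE)
tbl <- kable(fits, format = "markdown", digits = 2)
print(tbl)
```

Inside an R Markdown chunk you would normally just end with `kable(fits, digits = 2)` and let knitr pick the output format.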
Use Pandoc and/or LaTeX to explore new territory in document compilation. You could use Pandoc as an alternative to knitr for Markdown to HTML conversion; you’d still use rmarkdown for conversion of R Markdown to Markdown. You would use LaTeX to get PDF output from Markdown.