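Announcements:

  • Homework has been released.
    • Public repo.
    • Need to have submitted the survey.
  • New guidelines for participation assessment; the link is available on the course syllabus.
  • On my radar: asking for help from TAs in class!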

Today’s topics are:

  1. Getting familiar with R
  2. Intro to R Markdown

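To participate in today’s lecture, you should have:

  • R and RStudio installed
  • Optionally, LaTeX installed for outputting to PDF
  • A participation repo to put in-class work in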

1 Getting Familiar with R

1.1 Learning Objectives

By the end of today’s class, students are expected to be able to:

  • Describe the capabilities, and the pros and cons, of R
  • Write an R script to perform simple calculations
  • Describe the main idea of vectorization
  • Access the R documentation on an as-needed basis
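
For the last objective, here is a minimal sketch of a few standard ways to reach R's documentation. The interactive calls are left commented out because they open a help page; the function names looked up (`mean`, `sd`) are only examples chosen for illustration.

```r
# Interactive use (commented out because each call opens a help page):
# ?mean                                # help page for a function you know by name
# help("mean")                         # the same lookup, written as a function call
# help.search("standard deviation")    # keyword search across installed help pages

# Non-interactive helpers that return values you can inspect in a script:
mean_like <- apropos("^mean")   # names on the search path that start with "mean"
sd_args <- names(formals(sd))   # argument names of stats::sd

print(mean_like)
print(sd_args)
```

In RStudio, the commented lines display their results in the Help pane.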

1.2 Resources

  • We’ll be roughly following the stat545.com: hello r page for exploring R.
  • adv-r: data structures gives a more comprehensive tour of R objects.
    • For those already experienced with R: I recommend you look this chapter over. You’ll probably learn something new.
    • For those new to R who want the depth: read until the end of the “Vectors” section.

Want to practice programming in R? Check out R Swirl for interactive lessons.

1.3 About R

Why R? Some points taken from adv-r: intro:

  • Free, and available on every major platform
  • Open source
  • Comprehensive set of “add on” packages for analysis
  • Huge community

Strengths that tend to be specific to R:

  • Naturally handles data (and therefore its analysis)
  • Vectorization
  • ..
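
As a rough illustration of vectorization, the sketch below applies one expression to a whole vector and compares it with an explicit loop. The `heights_cm` values are made up for the example.

```r
# Heights in centimetres (made-up values for illustration)
heights_cm <- c(150, 162.5, 171, 180)

# Vectorized: one expression operates on every element at once
heights_m <- heights_cm / 100

# Equivalent explicit loop, shown only for comparison
heights_m_loop <- numeric(length(heights_cm))
for (i in seq_along(heights_cm)) {
  heights_m_loop[i] <- heights_cm[i] / 100
}

# Comparisons and many built-in functions are vectorized in the same way
is_tall <- heights_cm > 170
root_heights <- sqrt(heights_cm)

print(heights_m)
print(is_tall)
```

Both approaches give the same numbers; the vectorized form is shorter and is usually the idiomatic choice in R.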

Its downsides?

  • Slow compared with compiled languages, especially in loop-heavy code
  • To learners, seems “quirky” (due to reliance on “metaprogramming”)
  • S3 OO objects & methods not always transparent
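
To see why S3 dispatch can feel opaque, consider this small sketch. The class name `temperature` and its fields are hypothetical, invented for the illustration; the point is that defining a function with the right name silently changes what `print()` does.

```r
# A hypothetical S3 object: a plain list with a class attribute attached
temp <- structure(list(value = 21.5, unit = "C"), class = "temperature")

# Naming a function print.<class> is all it takes to register an S3 method
print.temperature <- function(x, ...) {
  cat(sprintf("Temperature: %s %s\n", format(x$value), x$unit))
  invisible(x)
}

print(temp)            # dispatches to print.temperature
print(unclass(temp))   # without the class attribute, the default list printing is used

shown <- capture.output(print(temp))
```

Nothing in the call `print(temp)` indicates which method runs, which is part of what makes S3 harder to follow for newcomers.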

Alternatives for data analysis?

  • Python is becoming more prominent in the data science community.
    • often faster, and more “all-purpose”
    • tends to handle text better
    • Jupyter notebooks instead of RStudio and R Markdown

Methods for interacting with R (I’ll demo basic calculations)

  • RStudio. NOTE: R is not RStudio! RStudio is an IDE (integrated development environment).
    • The STAT 545 (and largely the world’s) choice!
  • “R console”
  • Terminal/ESS

1.4 Demonstration

Set up shop to follow along:

  1. Open RStudio.
  2. Either:
    • Ideally, download and open the template R script (if you’re online); OR
    • Start a new R Script. (File -> New File -> R Script)
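
If you start from a blank script, the following sketch shows the kind of simple calculations it might contain. It is not the course's template script; the variable names and numbers are made up.

```r
# Simple calculations in an R script (all values are illustrative)

3 + 4 * 2                  # arithmetic follows the usual order of operations

radius <- 2                # assign a value to a name with <-
area <- pi * radius^2      # pi is a built-in constant

number_seq <- 1:10         # the integers 1 through 10
total <- sum(number_seq)
average <- mean(number_seq)

print(area)
print(total)
print(average)
```

Run a line in RStudio with Ctrl+Enter (Cmd+Enter on macOS), or run the whole file with the Source button.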

To get participation marks for this activity:

  1. Work with me as we fill in the worksheet.
    • Remember, you’re not graded on correctness, just on the attempt!
  2. Upload your worksheet to your STAT 545 participation repo.

The completed worksheet can be found here: https://github.com/STAT545-UBC/Classroom/blob/master/notes/cm003-exercise-complete.R

2 R Markdown

2.1 Learning Objectives

  • Create an R Markdown (Rmd) document and render it to md/html/pdf
  • Write basic equations in Rmd using LaTeX
  • Style an Rmd document by editing the YAML header
  • Demonstrate at least two Rmd code chunk options
  • Make an informed choice between writing an R script, an Rmd document, and an R Notebook.

2.2 Resources

Here are some resources that are well-aligned with today’s goals:

  • The stat545: Rmd test drive. Read this if you missed something from today’s class.
  • The Rmd website has a fantastic walk-through tutorial that gives a great overview of R Markdown. The following lessons parallel today’s content:
    • Lessons 1-4 (Intro - Inline Code)
    • Lessons 9-10 (Output Formats - Notebooks)

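Here are resources that are great to reference:

  • The official Rmd cheatsheet has seemingly everything bundled together in a concise way.
  • An updated list of options for the YAML header can be found in the rmd book: html-document.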

To dig deeper into any (?) aspect of R Markdown, check out the free R Markdown book (also listed on the course syllabus).

2.3 Getting Started

You’ll need to install R Markdown. To do so, in any R console, run the following:

install.packages('rmarkdown')
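
If you are not sure whether the package is already installed, a check along these lines may help. The helper name `needs_install` is hypothetical, introduced only for this sketch.

```r
# Hypothetical helper: TRUE when a package is not yet installed
needs_install <- function(pkg) {
  !requireNamespace(pkg, quietly = TRUE)
}

if (needs_install("rmarkdown")) {
  message("rmarkdown is missing; run install.packages('rmarkdown')")
} else {
  message("rmarkdown is already installed")
}
```

The sketch only reports the status, so it is safe to run offline; installing still requires the `install.packages()` call shown earlier.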

2.4 Demonstration

Let’s play with Rmd. No need to submit anything for participation marks! We’ll do that next time when we explore data frames.

  1. Create a new Rmd document: File -> New File -> R Markdown…
  2. Title: “Data Frame Exploration”.
    • Notice the differences from regular markdown: YAML header, code chunks!
  3. Save it – I’ll call it cm003-exercise-df.
  4. Click Knit.
  5. Try other outputs.
  6. Try LaTeX equations.
  7. Try inserting new chunks of code.
  8. Try changing the theme with YAML.
  9. Try changing chunk options.
  10. Try making an R Notebook.
    • R script/Rmd/R Notebook – which to use?
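
The steps above can be sketched from the R console as well. The code below writes a small Rmd file containing a YAML header with a theme, a LaTeX equation, and two code chunks with options, then knits it if `rmarkdown` and pandoc are available. The chunk labels, the choice of the built-in `cars` data set, and the specific options are assumptions made for illustration, not the content of the in-class demo.

```r
# Build the three-backtick chunk delimiter programmatically
fence <- strrep("`", 3)

rmd_lines <- c(
  "---",
  'title: "Data Frame Exploration"',
  "output:",
  "  html_document:",
  "    theme: cerulean",   # change the theme by editing the YAML header
  "    toc: true",
  "---",
  "",
  "The sample mean is $\\bar{x} = \\frac{1}{n} \\sum_{i=1}^{n} x_i$.",
  "",
  paste0(fence, "{r cars-summary, echo = FALSE}"),   # hide the code, keep the output
  "summary(cars)",
  fence,
  "",
  paste0(fence, "{r cars-plot, fig.width = 4, fig.height = 3}"),   # set the figure size
  "plot(cars)",
  fence
)

# Save under the file name used in the demonstration, inside a temporary directory
rmd_path <- file.path(tempdir(), "cm003-exercise-df.Rmd")
writeLines(rmd_lines, rmd_path)

# Knit only when the required tooling is present
if (requireNamespace("rmarkdown", quietly = TRUE) && rmarkdown::pandoc_available()) {
  out_file <- rmarkdown::render(rmd_path, output_format = "html_document", quiet = TRUE)
  message("Rendered: ", out_file)
} else {
  message("rmarkdown or pandoc not found; wrote ", rmd_path, " without rendering")
}
```

Calling `rmarkdown::render()` does the same job as clicking Knit. Passing a different `output_format`, such as `"md_document"`, is one way to try other outputs.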

3 To do before next class

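To participate in tomorrow’s lecture, you should:

  • Ensure you follow the steps to set up your “participation” repository.
  • Complete the course survey, if you haven’t already.
    • You MUST do this in order to do your first assignment!
  • You might want to try installing LaTeX, if you want to output to PDF and currently cannot.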
