Welcome to STA 371H! On the first day of class, we’ll give an overview of the course (the course syllabus is here). As a gentle introduction to the initial material, we will talk about some basic principles that separate good statistical graphics from bad ones (slides here).

Video lectures

These follow Chapter 1 of the course packet exactly:

Reading

All readings are accessible through the Resources tab, above. For this week, please read the introduction and Chapter 1 of the course packet. The key topics are:

  • continuous and categorical/grouping variables
  • contingency tables
  • simple summaries and graphics: histogram, boxplot, dotplot, scatter plot, lattice plot
  • variation between and within groups
  • variation among numerical variables
  • multivariate plots
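
To make these topics concrete, here is a minimal sketch in R that touches each one. It assumes the built-in `mtcars` data set purely for illustration (the course packet uses its own examples, which may differ), and it uses the `lattice` package that ships with standard R installations.

```r
# Illustration only: mtcars is a built-in R data set, not course data.
data(mtcars)

# Treat cylinder count and transmission type as categorical/grouping variables.
mtcars$cyl <- factor(mtcars$cyl)
mtcars$am  <- factor(mtcars$am, labels = c("automatic", "manual"))

# Contingency table: counts for each combination of two categorical variables.
cyl_by_am <- xtabs(~ cyl + am, data = mtcars)
print(cyl_by_am)

# Simple numerical summary of a continuous variable.
print(summary(mtcars$mpg))

# Variation between groups (compare the means) and within groups (the SDs).
mpg_mean_by_cyl <- tapply(mtcars$mpg, mtcars$cyl, mean)
mpg_sd_by_cyl   <- tapply(mtcars$mpg, mtcars$cyl, sd)
print(round(rbind(mean = mpg_mean_by_cyl, sd = mpg_sd_by_cyl), 2))

# Histogram of one continuous variable.
hist(mtcars$mpg, main = "Fuel economy", xlab = "Miles per gallon")

# Boxplot and dotplot of a continuous variable split by a grouping variable.
boxplot(mpg ~ cyl, data = mtcars, xlab = "Cylinders", ylab = "Miles per gallon")
stripchart(mpg ~ cyl, data = mtcars, method = "jitter", vertical = TRUE,
           pch = 19, xlab = "Cylinders", ylab = "Miles per gallon")

# Scatter plot: variation among two numerical variables.
plot(mpg ~ wt, data = mtcars, xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")

# Lattice plot: one scatter-plot panel per group, a simple multivariate display.
library(lattice)
print(xyplot(mpg ~ wt | cyl, data = mtcars, layout = c(3, 1)))
```

Each plotting command produces one figure, so in RStudio you can page back through them in the Plots pane and compare how the same variable looks under different displays.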

Optionally, you can also consult Chapter 1 of Kaplan. This is mainly useful as an introduction to R. Feel free to move rapidly if you’re feeling comfortable with the software.

Software

The first thing to do is to install R and then RStudio on your own computer. Detailed instructions for installing these two programs can be found here. Both are free.

R is the underlying data-analysis program we’ll use in this course, while RStudio provides a nice front-end interface to R that makes certain repetitive steps (e.g. loading data, saving plots) very simple. I will use RStudio in class most days this semester, and you will use it most weeks for your homework. RStudio depends upon having R available behind the scenes, so make sure you install both, even though you won’t need to interact directly with R.
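
One quick way to confirm that both programs are working is to open RStudio and type a few commands into its console. The lines below are only a suggested sanity check, not part of the official installation instructions; the variable names are arbitrary.

```r
# Ask R which version is running behind RStudio.
r_version <- R.version.string
print(r_version)

# A trivial calculation, to confirm that the console evaluates commands.
two <- 1 + 1
print(two)

# Draw a simple plot; in RStudio it should appear in the Plots pane.
plot(1:10, (1:10)^2, xlab = "x", ylab = "x squared")
```

If a version string, the number 2, and a plot all appear, R and RStudio are talking to each other correctly.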

Once you’ve installed R and RStudio, complete the following R walkthroughs. The first two are designed to get you off the ground, so if you’re familiar with R, you can safely skip these.

Supplemental reading

The material this week and next also coincides roughly with Chapter 1 of OpenIntro: Statistics. As with the Kaplan book, you should not feel obligated to read this, but it is a good supplement if you are looking for additional study materials or a review of your first statistics class.

Additional practice with R

If you are feeling a little uncomfortable with the idea of R, do not worry. We will practice a lot in class. But if you’d like, you can get a jump start on things by following along with all the R commands in Chapters 1-3 of Kaplan. Just replicate exactly what he does in your own R session. Don’t just copy and paste; actually type the commands yourself! It’s the best way to learn them.

Exercises

There are no exercises assigned this week. The first set of exercises is for next week (due Monday, Jan 30) and is posted here in case you want to get ahead of things.