Extended Comments: OpenIntro Stats

Chapter 1: Introduction to data .

Section 1.1: Using stents

The data sets referenced are {openintro::stent30} and {openintro:stent365}. The data in Figure 1.1 are stent30 and stent365 put together row-wise. There is no patient number in the data. 1. Activity: Center & Spread to show the individual cases. Use: Data and Point Plots activity

Section 1.2: Data basics

The data set in 1.2.1 is loan50. Remember to set n = All.

Use the Points and Density Little App to draw graphs of the variables pairwise. Emphasize the overall shapes of the four different possibilities:
1. quantitative vs quantitative: “a cloud”
2. quantitative vs categorical: “vertical bands”
3. categorical vs quantities: “horizontal bands”
4. categorical vs categorical : “boxes”
Add in a second explanatory variable, using color. Best to use variables with only a few levels, such as homeownership.

You can do much the same with the county data set. Again, set n = All.

The data in Exercise 1.12 (UN Votes) are from the {un_votes} package. Although not mentioned in the text, this is a good example of how data are stored as different files, each with it’s own unit of observation. - un_votes: a country voting on a resolution, a.k.a. rcid - un_roll_call_issues: a characterization of the issue relating to the vote. Unit of observation is an rcid - un_roll_calls: dates, sessions, amendments and other “details” of the vote. Unit of observation is an rcid.

Section 1.3: Sampling

We’re not yet at the point in the book where sampling variability becomes an issue. But this can be a good place to illustrate the difference between a simple random sample and a stratified sample.

Note the “stratified” check box on the data tab of the Little Apps. This stratifies the sample according to the explanatory variables selected. The effect is most evident when there are dramatically unequal numbers in the various levels of the explanatory variable.

Section 1.4: Experiments

There are no Little Apps relevant to designing an experiment.

Chapter 2: Summarizing data .

Activity: - Activity: Comparing Two Groups

Section 2.1: Examining numerical data

Center and spread Little App for the various statistics

Most any Little App can be used to show point plots (which they call scatter plots).

What’s normal? Little App to show shapes of distributions.

Section 2.2: Considering categorical data

We don’t do bar charts because we try to use a consistent display with x and y both variables, while the bar charts are intrinsically using one axis to display counts.

Chapter 3: Probability .

Nothing

Chapter 4: Distributions of random variables .

Activity: Shapes of distributions

Activity: Parameters and the Normal Distribution

What’s normal? Little App is useful for examining tail probabilities.

Chapter 5: Foundations for inference .

Activity: What is a Confidence Interval

5.1.3 They say, “The Central Limit Theorem is incredibly important, and it provides a foundation for much of statistics.” For myself, I think you can do statistics perfectly well without reference to this theorem, and certainly without confusing things with words like “central” and “limit.” No introductory statistics student would understand the proof of the theorem or even the conditions (e.g. finite variance) under which it applies. It would suffice to say that, in many situations, the normal distribution is a suitable approximation for a calculated value like a mean or proportion which is why you see it being used so much in those settings. Keep in mind that the normal approximation is often not an adequate approximation for data values themselves, just for means and proportions of large sets of data values.

5.1.6 The authors write, “The principles and general ideas are the same even if the details change a little.” Maybe it’s time to stop organizing chapters around “little” differences.

5.2: Confidence intervals for a proportion.

Note that the Confidence Interval & T little app, provides a straightforward way to demonstrate the reasoning behind accepting the result of confidence-interval calculations as meaningful. Here are the steps.

Open the Confidence Interval & T little app and choose a dataset of interest. It helps if the number of rows is large, say \(n > 1000\). You can generally find this out from the codebook. (Traditionally, the topic is introduced in a setting where there is no explanatory variable, the so-called “one-sample” statistic. But this is not required.)
Sett the sample size to “All” and display the mean of the response variable.
Freeze the display in (2).
Now set the sample size to something small, say less than 10% of the size of “All.” Display both the mean and the CI on the mean calculated from the small sample.
Note whether the mean of the “All” data falls within the CI of the sample. Press “New Sample” (the dice) many times and see how often the mean of “All” lies outside the CI. If the CI is correct, this should happen about 5% of the time, for a 95% confidence level.
Change the sample size to something smaller and note that the length of the CI is longer than in (5), yet the proportion of times the CI includes the “All” mean remains about 5%.

Your students will have an easier time counting if you set the confidence level to something like 80%. But don’t let them fall into the trap of thinking that the confidence level can be set as you like: Using 95% is the convention.

5.3 Hypothesis testing for a proportion

Chapter 6: Inference for categorical data .

Sections 6.1 and 6.2 introduce the formulas for calculating standard errors of proportions and differences in proportions. We suggest, instead, that you introduce the concept of an indicator variable, which is a quantitative presentation using 0-1 coding of the “success-failure” nature of categorical variables on which you would calculate a proportion. Using the 0-1 coding, calculating a proportion is the same as calculating a mean, so the methods of Chapter 5 directly apply.

Section 6.3 is “Testing for goodness of fit using chi-square.” We suggest skipping this material entirely. First, the standard description of chi-square as indicating “goodness of fit” is an awkward way of describing a situation where several proportions are being calculated and then compared to a theory of what those proportions should be. Second, the chi-square statistic is abstract and only an intermediate in the calculation of a p-value. Third, a p-value is a poor way to quantify “goodness,” and in general there is way too much emphasis on p-values in intro stats.

Proposal: Use likelihood for this purpose instead.

Section 6.4 is “Testing for independence in two-way tables.” Again, this use of chi-squared is traditional but, we think, not very helpful. Instead, pick one of the variables as the response, the other as the explanatory variable, and fit a model connecting the two. The situation anticipates ANOVA.

Note that an assumption behind our suggestion is that the response variable has just two levels, “success-failure” to use the language of the chapter. The large majority of the examples in the chapter satisfy this constraint.

Our suggestion is consistent with practice in contemporary research, where chi-square tests are hardly ever seen. Fitted models where the response variable has a success-failure configuration – typically based on logistic regression and incorporating covariates – is the standard. (See section 9.5.)

Chapter 7: Inference for numerical data .

Activity: Comparing Two Confidence Intervals

Activity: Comparing Two Groups

t-test app

Maybe compare ANOVA with the 2-way setting for chi-square?

Concept of power using t-test app and asking how often we’d reject the null.

Chapter 8: Introduction to linear regression .

Activity: Introducing Linear Regression

Activity: Describing Relationship Patterns in Words and Numbers

Activity: How much is explained?

Find data sets illustrating the types of outliers in section 8.3.

Inference: Use F.

Chapter 9: Multiple and logistic regression .

Seems odd that there’s not a section on covariates.