Data Snooping, Part 2

What pitfalls lurk outside your database?

Credit: Tom Joy

In “Data Snooping, Part 1” (Quality Digest, Aug. 6, 2018), we discovered the basis for the first caveat of data snooping. Here we discover three additional caveats.

Last month we discovered:

We will use the data set from Part 1 to illustrate these caveats. The response variable Y represents the weekly steam usage for a chemical plant. X1 represents the amount of fatty acid in storage. X2 represents the amount of glycerin produced. X3 is the weekly number of hours of operation for the plant. (Last month an additional variable was included in the data set, but here we leave it out to illustrate what its absence does to our analysis.) As before, we use the first eight weeks of production as our baseline.
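For readers who want to follow along in software, here is a minimal sketch of this setup: a response Y, three predictors X1, X2, and X3, and a baseline made up of the first eight weeks. It assumes an ordinary least-squares fit on the baseline, which is an illustration and not necessarily the analysis used in this article. The numbers are random placeholders, not the steam data, and the names `n_weeks` and `fit_baseline` are hypothetical.

```python
import numpy as np

# Placeholder values only: these are random numbers, NOT the steam data.
rng = np.random.default_rng(0)
n_weeks = 25                       # hypothetical number of weeks on record
X = rng.normal(size=(n_weeks, 3))  # columns stand in for X1, X2, X3
y = rng.normal(size=n_weeks)       # stands in for Y, weekly steam usage

baseline = slice(0, 8)             # the first eight weeks of production


def fit_baseline(X, y, rows):
    """Least-squares fit of y on X (with intercept) using only the given rows."""
    A = np.column_stack([np.ones(len(y[rows])), X[rows]])
    coef, *_ = np.linalg.lstsq(A, y[rows], rcond=None)
    return coef                    # [intercept, b1, b2, b3]


coef = fit_baseline(X, y, baseline)

# Apply the baseline model to every week and look at what it fails to explain.
predicted = np.column_stack([np.ones(n_weeks), X]) @ coef
residuals = y - predicted
```

Keeping the baseline rows separate from the later weeks makes it explicit which data shaped the model and which data the model is merely being applied to.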

…
