Outliers are values that don’t “fit in” with the rest of the data. These extreme values are commonly considered a nuisance when we seek to summarize the data with our descriptive statistics. This article will show how to turn these nuisances into useful information.
The earliest statistical tests were tests for detecting outliers. The idea was that by deleting the outliers, we could compute “better” descriptive statistics for our data. As a result, generations of statisticians have been taught to remove outliers before beginning their analysis. After all, the theoretical underpinnings of our statistical computations don’t tell us how to deal with outliers. When outliers contaminate our statistics, they change the model used to describe the data. Therefore, we commonly remove the outliers to polish up the data so we can obtain useful and appropriate models.
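To make the idea of contamination concrete, here is a minimal sketch in Python. The measurements are hypothetical values invented for illustration, not data from this article, and the helper name `summarize` is likewise an assumption. It simply compares the usual descriptive statistics with and without a single extreme value.

```python
import statistics

# Hypothetical measurements, invented purely for illustration.
clean = [10, 11, 9, 10, 12, 10, 11, 9]

# The same data with one extreme value appended.
contaminated = clean + [30]


def summarize(values):
    """Return (mean, sample standard deviation, median) for a list of numbers."""
    return (
        statistics.mean(values),
        statistics.stdev(values),
        statistics.median(values),
    )


clean_mean, clean_sd, clean_median = summarize(clean)
cont_mean, cont_sd, cont_median = summarize(contaminated)

print(f"clean:        mean={clean_mean:.2f}  sd={clean_sd:.2f}  median={clean_median}")
print(f"contaminated: mean={cont_mean:.2f}  sd={cont_sd:.2f}  median={cont_median}")
```

In this made-up example, one extreme value pulls the average upward and inflates the standard deviation several times over, while the median does not move. That inflation is the sense in which an outlier changes the model we would fit to the data.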
…