A basic overview of best practices in inspecting, cleaning, formatting, and analyzing survey data.

| ID | Q1 | Q2 | Q3 | Q4 | Q5 | Duration (min:sec) | Open-End |
|---|---|---|---|---|---|---|---|
| 001 | 4 | 3 | 5 | 4 | 3 | 8:42 | The staff was friendly but the wait was too long |
| 002 | 3 | 3 | 3 | 3 | 3 | 2:14 | good |
| 003 | 5 | 4 | 4 | 5 | 4 | 9:17 | I love coming here on weekends with my family |
| 004 | 2 | 5 | 1 | 4 | 2 | 0:47 | asdfasdf |
| 005 | 4 | 4 | 5 | 3 | 4 | 7:53 | Coffee is great, parking is terrible |

| Strategy | How It Works | When to Use |
|---|---|---|
| Flag with dummy variable | Create a column (e.g., suspect = 1) to mark questionable respondents. Keep them in the dataset but exclude from primary analysis. | When you’re uncertain about data quality and want to run sensitivity checks. |
| Quarantine and compare | Run your analysis twice — once with all data, once excluding flagged cases. See if conclusions change. | When sample size is tight and you can’t afford to lose cases without knowing the cost. |
| Weight down rather than delete | Assign lower weights to suspect respondents rather than removing them entirely. | When you have a weighting scheme and want to reduce influence without elimination. |
| Segment and report separately | Treat suspect respondents as their own group. Report their patterns alongside the clean sample. | When “bad” responses might actually reflect a real population (e.g., disengaged customers). |
| Set thresholds in advance | Define exclusion rules before looking at the data (e.g., “anyone completing in under 2 minutes”). Document in your analysis plan. | Always. Prevents post-hoc fishing for rules that conveniently support your hypothesis. |
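
The first two strategies can be sketched in a few lines of Python. This is a minimal illustration rather than a recommended rule set. The rows are the five sample respondents from the table above, with Duration read as minutes:seconds, and the 2-minute cutoff is the example threshold from the last row. The straightlining check (identical answers on Q1 to Q5) and the names `MIN_SECONDS`, `is_suspect`, and `mean_q1` are assumptions added for the example.

```python
# Sample respondents copied from the table above (ratings are Q1..Q5).
responses = [
    {"id": "001", "ratings": [4, 3, 5, 4, 3], "duration": "8:42"},
    {"id": "002", "ratings": [3, 3, 3, 3, 3], "duration": "2:14"},
    {"id": "003", "ratings": [5, 4, 4, 5, 4], "duration": "9:17"},
    {"id": "004", "ratings": [2, 5, 1, 4, 2], "duration": "0:47"},
    {"id": "005", "ratings": [4, 4, 5, 3, 4], "duration": "7:53"},
]

# Threshold fixed before looking at the data ("under 2 minutes").
MIN_SECONDS = 120


def to_seconds(duration):
    """Convert a 'minutes:seconds' string to seconds."""
    minutes, seconds = duration.split(":")
    return int(minutes) * 60 + int(seconds)


def is_suspect(row):
    """Flag speeders and straightliners (the second rule is an assumption)."""
    too_fast = to_seconds(row["duration"]) < MIN_SECONDS
    straightlined = len(set(row["ratings"])) == 1
    return too_fast or straightlined


# Flag with a dummy variable: keep every row, mark the questionable ones.
for row in responses:
    row["suspect"] = 1 if is_suspect(row) else 0


def mean_q1(rows):
    """Average Q1 rating for the given rows."""
    return sum(r["ratings"][0] for r in rows) / len(rows)


# Quarantine and compare: run the same statistic with and without flagged cases.
q1_all = mean_q1(responses)
q1_clean = mean_q1([r for r in responses if r["suspect"] == 0])

print("flagged:", [r["id"] for r in responses if r["suspect"] == 1])
print(f"Q1 mean, all cases:   {q1_all:.2f}")
print(f"Q1 mean, clean cases: {q1_clean:.2f}")
```

With these rules, respondents 002 (straightlined) and 004 (47 seconds) are flagged, and the Q1 mean moves from 3.60 to 4.33 once they are excluded. On a sample this small the shift says little, but the same comparison on a real dataset shows whether the conclusions depend on the flagged cases.
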
| Measure | What It Tells You | When It Misleads |
|---|---|---|
| Counts | How many people said X | When you forget that n=6 isn’t a trend |
| Percentages | What share said X | When you ignore the base (35% of 20 ≠ 35% of 200) |
| Mean | The mathematical average | When the distribution is skewed or bimodal |
| Median | The midpoint response | When you care about what’s happening at the extremes—the median ignores them entirely |
| Standard Deviation | How much responses vary | When you report it without context (SD of what?) |
| Top-Two-Box (T2B) | % who chose one of the two highest response options | When you have a polarized distribution: a high T2B can hide a large group of detractors |

On a five-point Likert scale.
| Distribution | Mean | SD | T2B |
|---|---|---|---|
| Bimodal | 3.55 | 1.50 | 67% |
| Left Skew | 3.55 | 1.36 | 59% |
| Right Skew | 3.55 | 1.64 | 57% |
| Normal | 3.55 | 0.97 | 54% |
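
To see how the same mean can sit on top of very different response patterns, the three summary statistics can be computed directly from response counts. The sketch below uses two hypothetical sets of counts (100 respondents each, invented for illustration, not the data behind the table) and assumes the population form of the standard deviation. The helper name `summarize` is also an assumption.

```python
from math import sqrt


def summarize(counts):
    """Return mean, SD, and top-two-box % for a five-point scale.

    `counts` maps each scale point (1-5) to its number of respondents.
    """
    n = sum(counts.values())
    mean = sum(score * c for score, c in counts.items()) / n
    # Population variance; use (n - 1) in the denominator for a sample SD.
    variance = sum(c * (score - mean) ** 2 for score, c in counts.items()) / n
    # Top-two-box: share of respondents choosing 4 or 5.
    t2b = 100 * (counts.get(4, 0) + counts.get(5, 0)) / n
    return {"mean": mean, "sd": sqrt(variance), "t2b": t2b}


# Hypothetical counts, chosen so that both distributions share one mean.
polarized = {1: 20, 2: 5, 3: 5, 4: 20, 5: 50}
centered = {1: 0, 2: 5, 3: 30, 4: 50, 5: 15}

for name, counts in [("polarized", polarized), ("centered", centered)]:
    s = summarize(counts)
    print(f"{name}: mean={s['mean']:.2f} sd={s['sd']:.2f} t2b={s['t2b']:.0f}%")
```

Both hypothetical distributions have a mean of 3.75 and similar T2B scores (70% and 65%), yet the polarized one has roughly twice the spread and a fifth of its respondents at the very bottom of the scale. Reporting the mean or T2B alone would hide that group.
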

| Metric | Product: Cold Brew | Product: Other |
|---|---|---|
| Mean | 4.21 | 3.12 |
| SD | 1.02 | 1.31 |
| T2B | 78% | 44% |
| N | 80 | 120 |
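
Reporting N next to each percentage matters because the same percentage is far less certain on a small base. One common way to gauge this, used here as an illustration and not as the article's own method, is the normal-approximation margin of error for a proportion at roughly 95% confidence. The function name `margin_of_error` and the 1.96 multiplier are assumptions of this sketch, and the approximation is rough for small bases.

```python
from math import sqrt


def margin_of_error(share, n, z=1.96):
    """Approximate 95% margin of error for a proportion.

    `share` is the proportion (0-1) and `n` is the base size.
    Uses the normal approximation, which is rough when n is small.
    """
    return z * sqrt(share * (1 - share) / n)


# The same 35% on bases of 20 and 200 respondents.
for n in (20, 200):
    moe = margin_of_error(0.35, n)
    print(f"35% of n={n}: +/- {100 * moe:.1f} points")

# T2B scores and bases taken from the product table above.
for label, share, n in [("Cold Brew", 0.78, 80), ("Other", 0.44, 120)]:
    moe = margin_of_error(share, n)
    print(f"{label} T2B {share:.0%} (n={n}): +/- {100 * moe:.1f} points")
```

Under this approximation, 35% on a base of 20 carries a margin of about 21 points, against about 7 points on a base of 200. For the two products the margins come out near 9 points each, which is small relative to the 34-point gap in T2B.
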
