Concept Tests

Teaching Note

Author

Larry Vincent

Published

April 5, 2026

Modified

April 6, 2026

While the Hecuba leadership team is still working through which LA neighborhoods make sense for the next café location, Reina Daniels has been busy on a different front. She’s brokered a deal to place Hecuba whole-bean coffee on the shelves at Erewhon. This could be a meaningful step for a brand that has, until now, only existed as a café experience.

The Erewhon merchandise buyer said she loved the simplicity of the packaging used at Hecuba stores, but she worries it might need something more for her shoppers, especially those who might not be familiar with the brand. Erewhon shoppers are sophisticated, values-driven, and willing to pay a premium, but they’re also choosing from a wall of beautifully packaged specialty products from brands with their own compelling stories. She suggested two potential changes to the package design. The first leans into the Colombian sourcing, which she thinks will align with Erewhon shoppers’ focus on holistic ingredients; she also thinks the authenticity of the brand’s small-lot farming relationships will resonate. The second leans into the founders’ story and signals that this is a female-founded business.

Rather than crowd the label with too many messages competing for attention, Denise, Adria, and Reina suggest running a concept test with Erewhon shoppers to see which idea produces the strongest lift in purchase intent.

What a Concept Test Is—And Isn’t

A concept test is a survey-based experiment that measures whether exposure to a marketing concept changes how people respond to your brand or product. A “concept” can be many things—an ad campaign direction, a product description, a positioning statement, a package design, or a brand name. What makes it a concept test is the experimental structure underneath it. Some respondents see a specific concept, some don’t.

Concept tests are usually implemented in a monadic design: respondents are randomly assigned to a control or a test condition, and each respondent sees only one version. They don’t know that the stimulus has been manipulated. In this way, a concept test can reveal whether a direction actually lifts the desired outcome above a baseline.

The Survey Architecture

Most concept tests adhere to a simple three-part structure, though the details always depend on the nuances of your study.

[Flowchart: Respondent enters → Pre-Exposure Block → Random assignment → Control / Concept A / Concept B → Post-Exposure Block → Survey complete]

The pre-exposure block captures important data before the stimulus is presented to the respondent. This includes screener information used to confirm eligibility (do they buy whole-bean coffee? would they consider a new brand?), as well as any baseline measures you might want to include in your analysis, such as category involvement, prior brand awareness, or purchase frequency. This block is also the place for questions that could be contaminated by order bias or priming effects once the concept has been shown. Everyone sees exactly the same content here.

The stimulus is where the groups split. Respondents see one concept, and only one. The control group sees something neutral, perhaps a brief product description with no positioning argument, or the existing creative used in packaging or advertising. These are your counterfactual respondents. They tell you what people say when nothing has changed. It’s the baseline that makes it possible to measure “lift” from proposed changes.

The post-exposure block brings everyone back to a single path. Your outcome measures like purchase intent live here. These are the same questions and in the same order for every respondent regardless of what they just saw.

In Qualtrics, the mechanics are a Survey Flow randomizer block placed after screeners but before any stimulus content. Each branch holds the materials for one condition. The outcome block lives downstream of all branches.
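Outside of Qualtrics, the same randomizer logic is easy to sketch. Here is a minimal Python simulation of monadic random assignment, assuming three conditions and a hypothetical sample of 900 screened respondents (the labels and sample size are illustrative, not Qualtrics identifiers):

```python
import random

# Illustrative condition labels for a three-cell monadic design.
CONDITIONS = ["Control", "Concept A", "Concept B"]

def assign_condition(rng):
    """Assign one respondent to exactly one condition; they never see the others."""
    return rng.choice(CONDITIONS)

rng = random.Random(42)  # fixed seed so the simulation is reproducible
assignments = [assign_condition(rng) for _ in range(900)]
counts = {c: assignments.count(c) for c in CONDITIONS}
print(counts)  # each cell lands near 300 respondents
```

Simple random assignment like this leaves small imbalances in cell sizes; survey platforms often offer an "evenly present" option that fills cells in balanced blocks instead.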

What the Stimulus Actually Looks Like

Concept tests vary enormously in practice. Sometimes the stimulus is a finished package design. Sometimes it’s only a headline or changes in an illustration. Sometimes it’s a paragraph of copy with no image at all. There is no universal template. The right stimulus is whatever faithfully represents the idea you’re trying to evaluate.

What does matter is stimulus isolation. The differences between your control and treatment conditions should be deliberate and limited enough that you can reasonably attribute any difference in outcomes to the manipulation. If too many things change at once, you might not know what actually moved the needle.

This gets genuinely tricky with visual concepts. When you’re testing two package designs, multiple elements can shift simultaneously: color palette, typography, iconography, layout. You can test each element incrementally, but that gets cumbersome and expensive, so bundled changes are sometimes unavoidable; experienced researchers accept this as a known limitation rather than a problem to solve. What you want to avoid is accidental variation, where differences between conditions creep in through sloppy execution rather than intentional design. That produces uninterpretable data.

For the Hecuba test, the manipulation is deliberately minimal. The bag shape, the label proportions, the typographic system, and the Hecuba figure all remain consistent across conditions. What changes is a coffee bean icon and origin language in Concept A, and a partnership icon and founder language in Concept B.

Here’s what the three conditions look like:

Control Condition

Hecuba Coffee — Colombian Whole Bean

Hecuba Coffee is a specialty roaster based in Venice, California. Their whole-bean coffee is available in a range of varieties and roast levels, sourced from small farms and roasted at their Abbot Kinney facility.

Concept A — Origin/Provenance

Sourced from one of the rarest and most beloved coffee beans.

100% Colombian Coffee

Hecuba sources directly from small-lot growers in Colombia. Our relationships have been built over years from family ties. Limited harvests mean limited supply. Our roasters work each lot until the bean tells them it’s ready. It’s why 100% Colombian coffee has been cherished by coffee lovers the world over.

Concept B — Women Founded

Built from scratch by two women with a bold vision for coffee.

Female Founded and Operated

Hecuba was founded by Adria Marquez and Denise Shaw in 2018 with a single store on Abbot Kinney and a conviction that great coffee starts small. Eight years later, they still source every bean, approve every roast, and run every location. This bag is the product of their diligence, creativity, and love.

Notice what’s consistent across all three conditions: the bag, the label, the typography, the figure. The production value is deliberately equivalent. If one condition looked more finished or more premium than the others, you’d be measuring design quality rather than the positioning argument (the affect bias we mentioned in previous class discussions) and you’d never know which one drove the result.

What’s distinct is the argument each treatment makes. Concept A is about provenance (rare beans, direct relationships, limited supply). Concept B is about the people behind the brand (two female founders who still make every decision). Those are two genuinely different reasons to believe.

Measuring the Right Things

The choice of primary outcome depends on what decision the concept test is meant to inform. Is it consideration? Appeal? Intention? For a packaging test like this one, the question we probably want to answer is how much the label design might move a shopper at the shelf. Consideration or purchase intent are natural dependent variable options.

How likely would you be to consider Hecuba whole-bean coffee the next time you’re shopping for coffee?
(1 = Definitely would not consider, 5 = Definitely would consider)

Supplement with two or three attribute items tied to what each concept is trying to accomplish. For this test, you might measure perceived quality, brand distinctiveness, and whether the brand feels like it’s “for someone like me.” These secondary measures can explain why one concept outperformed. That explanation is often what the team needs to evaluate the findings from the concept test and determine next steps.

Reading the Result

Compare mean consideration scores across conditions. The difference between a treatment mean and the control mean is your lift. But knowing the direction of the result is not enough. You also need to know whether the difference is statistically significant.

Several analytic approaches work here depending on your design. If you’re running a simple two-cell test (one treatment against a control), an independent-samples t-test is perfectly sufficient. With multiple treatment conditions, as in the Hecuba test, ANOVA is the standard choice, followed by post-hoc pairwise comparisons; Tukey’s HSD is the most common. If your pre-exposure block collected covariates worth controlling for (e.g., category involvement, purchase frequency, prior brand awareness), regression gives you a cleaner estimate of the treatment effect with those influences held constant.
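That workflow can be sketched on simulated data. The scores below are invented for illustration (the true means and spread are assumptions, not Hecuba results), and `scipy.stats.tukey_hsd` requires SciPy 1.8 or later:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
# Invented 1-5 consideration scores, 300 respondents per cell.
control   = np.clip(rng.normal(3.0, 0.9, 300), 1, 5)
concept_a = np.clip(rng.normal(3.4, 0.9, 300), 1, 5)
concept_b = np.clip(rng.normal(3.1, 0.9, 300), 1, 5)

# Omnibus test across the three cells.
f_stat, p_anova = stats.f_oneway(control, concept_a, concept_b)

# Post-hoc pairwise comparisons with Tukey's HSD.
tukey = stats.tukey_hsd(control, concept_a, concept_b)

# Lift = treatment mean minus control mean.
lift_a = concept_a.mean() - control.mean()
lift_b = concept_b.mean() - control.mean()
print(f"ANOVA p = {p_anova:.4g}; lift A = {lift_a:.2f}; lift B = {lift_b:.2f}")
```

A covariate-adjusted estimate would swap the ANOVA for a regression of the outcome on condition dummies plus the baseline measures.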

But before you run any comparison, check that randomization worked. Pull baseline characteristics by condition and look for systematic differences. For example, if treatment respondents skew younger or more category-involved than control respondents, your random assignment broke down somewhere. You’ll want to find this before you interpret anything downstream.
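A minimal balance check might look like the sketch below. The covariate and its distribution are invented; the point is only that, with working randomization, baseline means should sit close together across conditions:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
# Invented baseline covariate (coffee purchases per month) by condition.
baseline = {
    "Control":   rng.poisson(4, 300),
    "Concept A": rng.poisson(4, 300),
    "Concept B": rng.poisson(4, 300),
}

# Omnibus test on the baseline measure; a small p-value flags imbalance.
f_stat, p_balance = stats.f_oneway(*baseline.values())
for cond, vals in baseline.items():
    print(f"{cond}: baseline mean = {vals.mean():.2f}")
print(f"balance-check p = {p_balance:.3f}")
```

In practice you would repeat this for each baseline characteristic (age, category involvement, prior awareness) rather than a single covariate.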

As you analyze the lift, remember that it is relative, not absolute. A mean of 3.8 in a treatment group looks good, but against a control mean of 3.6, it may be a modest and insignificant finding. Against a control mean of 2.9, it’s probably a meaningful one. Always report the comparison, not just the score. Also, concept tests reduce uncertainty but they don’t eliminate it. A concept that performs well in survey conditions still has to survive pricing, shelf placement, and the gap between stated consideration intent and actual behavior. The test is the start of a go/no-go process, not the final word.
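The point about relative lift can be made concrete with simulated scores. The means here are the hypothetical numbers above (3.8 versus baselines of 3.6 and 2.9); the cell size and spread are assumptions:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n, sd = 150, 1.0  # assumed cell size and score spread
treatment = rng.normal(3.8, sd, n)

# The same treatment mean read against two different control baselines.
control_high = rng.normal(3.6, sd, n)  # lift around 0.2
control_low  = rng.normal(2.9, sd, n)  # lift around 0.9

_, p_high = stats.ttest_ind(treatment, control_high)
_, p_low  = stats.ttest_ind(treatment, control_low)
print(f"vs 3.6 baseline: p = {p_high:.3f}; vs 2.9 baseline: p = {p_low:.2g}")
```

The identical treatment score yields a clearly significant lift against the low baseline and a much shakier one against the high baseline, which is why the comparison, not the raw mean, is the finding.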


For Your Charrette and Final Assignment

A concept test is a viable primary research method for your final project if your research question involves evaluating how a proposed marketing concept lands with a defined audience. The four design decisions that matter most: Who is your target respondent? What is the concept you’re testing and what specific argument does it make? What is the appropriate control condition? What outcome measures connect to the actual decision? Get those right, and the rest is execution. And remember, complexity can be the enemy. Simple is smart.

AI Exploration Prompts

  • “I want to design a concept test to evaluate [describe your concept and the decision it informs]. My target respondent is [describe]. Help me think through the study design: how many conditions I need, what the control condition should look like, and what I should measure.”
  • “I’m designing a stimulus for a concept test. The concept I’m testing is [describe]. Help me develop a concept board that isolates the key argument without introducing accidental variation. Here’s what the control condition looks like: [describe].”
  • “I’m building the survey for a concept test. The concept involves [describe] and my target respondent is [describe]. Suggest questions for the pre-exposure block — screeners to confirm eligibility and baseline measures that might be useful covariates in the analysis.”
  • “Here are the results of my concept test: [paste summary statistics or output]. My primary outcome was [describe]. Walk me through how to interpret the lift, assess statistical significance, and frame the finding as a recommendation.”