
Marketing research isn’t cargo-cult science

Image: Collab Media on Unsplash

If you work in media and spent any time on LinkedIn last week, you probably saw that both Byron Sharp and Brainlabs CEO Daniel Gilbert published critiques of Andrew Tindall’s The Creative Dividend.

Both spread fast, Gilbert’s mostly through the performance and digital marketing community, where the comments ran heavily in his favour. Brainlabs is one of the better performance agencies in the world, so when it comes to digital, Daniel knows his stuff. Several of his specific criticisms of the paper’s methodology and use of statistics are legitimate, and echo Sharp’s objections.

But Daniel has gone much further than Sharp. He’s done something that’s clever and infuriating in equal measure: he has taken real methodological problems with one paper and used them to articulate a broader scepticism about the entire field of marketing science, including the work of the Ehrenberg-Bass Institute itself (of which Sharp is the director).

In doing so he’s dismissing decades of converging research and the accumulated judgement of some of the most rigorous thinkers the marketing industry has produced. That position deserves some scrutiny, because it is itself flawed in a number of ways. 

Applying a laboratory standard to a social science

Gilbert’s implicit benchmark throughout is controlled experimental science: randomised trials, prospective designs, independent verification, no self-report. By that standard, virtually all of social science fails. Economics relies on natural experiments and retrospective data. Psychology has a replication crisis and still produces useful knowledge. Epidemiology spent decades building the case against smoking using observational data, self-report, and databases of people who had already got ill. 

None of those fields concluded that the impossibility of perfect methodology meant research was futile. They built systems for triangulating across imperfect studies, assigning weight to stronger designs, and holding conclusions tentatively until they were replicated. In a social science where laboratory standards are impossible, observational meta-analyses are the gold standard.

Daniel also never applies this standard to his proposed alternative. He holds up Booking.com’s 25,000 A/B tests as the gold standard. But A/B tests have their own limitations: they test marginal changes, not structural strategy; they optimise for short-term measurable response, not brand building; they are excellent for conversion rates but have nothing useful to say about long-term price sensitivity or penetration growth. If he subjected Booking.com’s testing programme to the same scrutiny he applies to The Creative Dividend, he would find confounds everywhere. He just doesn’t look.

Ignoring convergent validity entirely

This is the most serious intellectual failure in the article. The Creative Dividend’s findings do not exist in isolation. They broadly converge with Les Binet and Peter Field’s analysis of emotional versus rational advertising effects, Ehrenberg-Bass’s work on mental availability and distinctive assets, and Kantar’s BrandZ index, amongst others. These are not the same study run by the same people in the same way. They are independent lines of inquiry reaching similar conclusions from different directions.

In science, convergence across independent studies with different designs and different datasets is one of the most important signals you have. The fact that ‘emotional, distinctive, consistent advertising tends to outperform rational, constantly changing, short-term activation advertising in generating long-term business growth’ emerges across numerous studies simultaneously means it is not a property of any one paper’s methodology. It is a finding that has survived multiple independent attempts to test it. But Daniel doesn’t engage with this. He treats The Creative Dividend as if it were the only evidence for these ideas, which it obviously is not.

The ‘it’s obvious’ gambit

Daniel begins his article by asking whether anyone really thought bad creative with no media budget was the winning formula. The implication is that The Creative Dividend’s conclusions are so self-evident they needed no study.

It also suggests that Daniel has probably never had to sit in a room and convince someone that spending tens of millions on a TV campaign featuring a talking Russian meerkat is a good idea.

The entire marketing industry spent the better part of the 2010s systematically de-investing in brand advertising and redirecting budgets into short-term digital response, precisely because digital marketing’s measurability made it feel more evidence-based. The industry did not behave as if emotional, long-term, famous brand-building advertising was obviously correct. It behaved as if click-through rates and last-touch attribution models were sufficient proxies for commercial effectiveness. 

I have personally worked with household-name brands that shifted their entire budget away from brand into digital performance, and have tried to help them escape the resulting death spiral of falling profits, shrinking budgets, and job cuts.

Binet and Field’s work, Sharp’s work, and the body of IPA research these papers draw on were actively contested, resisted, and ignored by significant parts of the industry, and performance marketers in particular, for years. 

Daniel goes on to say: ‘“Be distinctive, be emotional, be consistent, give it time, spend enough”… Fine. True enough. Now what?’

But those things are only ‘fine’, ‘obvious’, or ‘true enough’ now because of the sustained effort of the researchers and practitioners who produced and disseminated the evidence.

Dismissing expert endorsement

Daniel treats Mark Ritson’s endorsement of The Creative Dividend, and those of the former Global CMO of Diageo and the CEO of Effie Worldwide, as evidence that intelligent people can be fooled. But that is not a fair characterisation of who these people are or how they engage with this kind of research.

Ritson is not a passive consumer of marketing research. He has spent twenty years actively critiquing bad research and is never slow to call it out publicly. Diageo is one of the most consistently effective marketing organisations in the world. I find it hard to believe that the CEO of Effie Worldwide doesn’t know a thing or two about advertising.

When it comes to the methodology itself, Binet and Field have been explicit about the limitations of their own IPA work and where it can and cannot be applied. The list of people who have contributed to the accumulated knowledge derived from case-study analysis includes Grace Kite, Tom Roach, Richard Shotton, Paul Feldwick, Craig Mawdsley, Bridget Angear, Sarah Carter, Rory Sutherland, Orlando Wood and many more. For these specific people to find value in the case-study methodology is, in itself, evidence of its validity. They have the pattern recognition to identify junk science.

Daniel is effectively arguing that his 21-point list represents something they all missed. That is possible, but it is not probable. Extraordinary claims require extraordinary evidence, and he hasn’t provided it.

Confusing limitations with invalidity

Every limitation Daniel identifies is real: survivorship bias is a genuine problem with award databases, self-reported data does introduce incentive distortions, and the R-squared point about hidden control variables is legitimate. But there is an enormous difference between ‘this study has limitations that should make us hold its conclusions tentatively’ and ‘this study tells us nothing’. 

He opens by saying the methodology is ‘an abomination’ and the ‘conclusions don’t follow from the data’. He closes by saying the paper’s directional claims are ‘probably right’ and that several of its conjectures are ‘plausible’.

But these two positions are not compatible. If the methodology is so broken that nothing can be concluded, then the conclusions cannot be ‘probably right’. If the conclusions are probably right, then something in the methodology is producing a signal despite the noise.

What he has actually demonstrated is that the study is imperfect, that its confidence intervals are overstated, and that its headline claims should be held more tentatively than its author suggests. That is a much more modest claim than the one he actually makes.

The alternative destroys knowledge he relies on

Daniel’s recommendation is essentially: stop reading case studies, run your own tests, build your own measurement system. While this sounds empirically rigorous, it is actually a position of epistemological solipsism — the idea that the only absolute knowledge available is from one’s own direct experience.

Individual brand experimentation, however well designed, produces local knowledge: what worked for your brand, in your category, in this competitive context, at this moment. To generalise from that local knowledge to anything like a strategic framework, you need cross-brand, cross-category, cross-time data. You need exactly the kind of aggregate research the author dismisses. 

When he advises marketers to ‘be distinctive, be emotional, be consistent, give it time, spend enough’, he is drawing on Ehrenberg-Bass and the IPA Databank, not brand-level A/B tests. He cannot make those recommendations without the research he is calling futile.

The further problem is that individual brand testing has a systematic bias of its own: it optimises for whatever you can measure in your testing window, which tends to be short-term responses. The central finding of Binet and Field, which he implicitly endorses, is that short-term optimisation at the expense of long-term brand building is one of the most widespread and costly errors in marketing, irrespective of category or company-specific contexts. You cannot discover that from your own A/B tests. You can only discover it from long-run aggregate research that pools outcomes across many brands over many years.

The underlying problem

The final thing worth saying is that Gilbert’s critique implies marketing effectiveness research is a closed system that ignores its own limitations. That’s not accurate. The serious practitioners in this field have spent years publishing exactly the kind of observational meta-analysis Gilbert dismisses, while simultaneously being explicit about what it can and can’t tell you.

And the measurement framework that literature actually recommends isn’t ‘trust the awards database’. The IPA advocates for a combination of Marketing Mix Modelling for long-run channel contribution, Multi-Touch Attribution for short-term digital response, brand tracking for the equity indicators that precede commercial outcomes, and exactly the kind of controlled experiments Gilbert favours. These aren’t competing approaches. They’re complementary ones. The literature Gilbert is dismissing explicitly makes the case for his proposed ‘alternative’.

Daniel is a good statistician making a bad argument. He is right that The Creative Dividend has methodological problems. He is wrong that those problems invalidate either this paper specifically or marketing effectiveness research as a field. His critique would be valuable as a careful assessment of where The Creative Dividend’s confidence is warranted and where it is not. Instead, he uses legitimate technical criticisms as a rhetorical battering ram to trash diligent, useful research and dismiss an entire tradition of inquiry. In doing so, he brushes aside the accumulated work of the most rigorous thinkers marketing has produced.

The fact that marketing science is not physics does not make it cargo-cult science. It makes it social science. All the complexities he identifies are known problems that researchers in the field have been wrestling with for years. The response to those problems is not to stop doing research. It is to do it carefully, triangulate across multiple imperfect studies, hold conclusions tentatively, and keep updating. That is precisely what Binet, Field, Sharp, and the rest have been doing. Daniel’s alternative is not more scientific; it is just more local.
