The Plural of Anecdote Is Not Evidence

via Saturday Morning Breakfast Cereal

Recently a conversation about my thoughts on the tongue controlled embouchure spun off into a different topic that I thought deserved a discussion of its own.  It is actually something that I’ve alluded to a couple of times before, but it’s a very common misconception.  So common, in fact, that many of us brass teachers and players frequently make this fallacious argument to support ideas that may or may not actually be correct: we take our personal stories and conclude from them that what worked for us can be applied to other brass players in general.

There is a difference between “anecdotal evidence” and statistical or scientific evidence.  The former is using personal stories, your own or other people’s, as evidence for something you’re already prone to believe.  This is different from collecting data using controls designed to reduce the chance of cognitive bias and compiling it using proven statistical methods.

We are all prone to these cognitive biases; it’s human nature.  In brass playing and teaching this can take many forms.  One of the most obvious examples I often come across is brass teachers who instruct their students to play with the same mouthpiece placement as they do.  The assumption is that if this placement works best for the teacher, then it must be the correct one, so all students should do the same.  There are so many examples of this in the literature that if you read enough of them you’ll see authors recommending 2/3 upper lip, others recommending 2/3 lower lip, and still others 50/50.  Clearly this isn’t a very accurate way to develop good brass pedagogy.

As another example, one commenter responded to my discussion on the Balanced Embouchure method’s intentional use of a bunched chin.

I did my own research via youtube, old photos etc. From what I saw, overwhelmingly, the worlds best players are not using flat chins.

Essentially, this commenter took a group of anecdotes (collecting whatever photos and videos you happen to personally come across) and used it to confirm his already-held belief that a bunched chin is helpful to good brass technique.  Now I should state that there could be a chance that many brass players play best with a bunched chin (I highly doubt this, for the record), but merely looking around online for things that already fit your beliefs does not make for a compelling argument.  To truly find out whether your hypothesis that a bunched chin is best for brass technique is true, you must subject it to research specifically designed to falsify your idea.  Otherwise it’s much too easy for your personal bias to unintentionally cherry-pick your data and skew the results.

Who are the “world’s best players?”  What percentage of them use flat chins compared to a bunched chin?  How can you tell with a grainy image or video?  Are these great players actively working to eventually eliminate their bunched chin or are they intentionally playing this way?  These questions, and others, are important to control for before you make any definitive recommendations.

Another element of the anecdotal evidence fallacy is the belief that a large number of stories makes for more compelling evidence.  Lucinda Lewis wrote:

Since all of the studies cited by wilktone are quite antiquated, it stands to reason that a new paradigm regarding mouthpiece pressure would emerge from the more than 5000 cases of embouchure overuse I have documented since 2002. With such a huge number of injured players, 98% of whom reported having developed protracted embouchure problems following a period of severe embouchure overuse, embouchure overuse syndrome can no longer be classified as a hypothesis.

When pressed about how she collects and analyzes her data, Lewis became evasive, so I can’t tell for certain whether she is using appropriate methods with proper controls.  However, 5000 test subjects is an incredibly high number of players to examine, analyze, and treat.  Simply storing all that documentation would take up a huge amount of physical/hard drive space, and the hard copies of all the subject consent forms (required by any IRB when using human test subjects) would fill a file cabinet.  It’s an enormous amount of work doing this with just a handful of subjects (my dissertation used only 34), so when a researcher goes through this hassle they generally want their community of peers to know that they went through that trouble.  I suspect that most of Lewis’s subjects are really just personal anecdotes collected from emails and phone conversations.

Self-reporting itself is quite biased, as I suggested in my reply (follow the link above).  Not to mention that demanding playing schedules are quite common for music students and professional musicians, so it’s likely that pretty much any brass player you ask can think back to a recent period of heavy playing, whether or not they developed embouchure problems.  Since Lewis’s documentation seems to be confusing a correlation (demanding playing) with causation (embouchure problems), I think it’s fair to take her statistics with a handful of salt.  Particularly when one of her responses would appear to confirm my suspicion that her “evidence” is simply a collection of anecdotes:

I know of no other embouchure research that is based upon such an enormous sample of personal experiences.

The plural of personal experiences (anecdotes) does not equal evidence, no matter how many of them you collect.

Another commenter wrote about Lewis’s approach:

Overuse, is about being pushed beyond your limits, it was with me, without appropriate breaks, no matter the mechanics. Simply put though, Lucinda’s course works.

Again, it would be a fallacy to take your personal experience working through Lewis’s method and leap to the assumption that it works for everyone, or even most players.  Our bodies are quite good at recovering on their own, given proper time and careful rebuilding, and sometimes people just get better.  Without applying appropriate controls there is no way to know whether a particular player recovered because of Lewis’s approach or in spite of it.  Simply put, we really don’t know.

Lastly, there is a similar fallacy that unless you happen to have personal experience with a particular thing, your opinions should be trumped by those of someone who does.  For example, another commenter wrote:

There’s only one way to understand The Balanced Embouchure method (BE for short) and that’s to do it. Those who form conclusions based solely on reading the book and analysing it through the lens of their own beliefs can’t help but miss the bigger picture.

I might argue that having such experiences, particularly positive ones, would actually make one more prone to the cognitive biases that lead us to believe something without good evidence.  Especially if you happen to have a professional and financial connection to the particular method, as the above commenter does.  Popularity doesn’t make something true either, unfortunately.

Now before you comment that I’m being closed-minded or am myself a victim of my own confirmation bias (I am the first to admit that I have been, and will undoubtedly continue to be, biased), please understand that in pointing out the flaws in the above examples I am only trying to show that we, as a profession, should be more careful in how we develop our ideas, how we test them, and in the way we offer our instructions and demonstrations.  Observant readers of this blog may notice that I try to be very careful to qualify my statements when I’m speculating and to note when my ideas are based on things that have been shown to be true.  Just because the above arguments are fallacious in some way doesn’t necessarily mean the authors’ opinions are wrong, just that the evidence they offer doesn’t actually support them.  Nor does criticizing them mean that I’m right.

It’s OK to say, “I don’t know, but this is what I think and why.”  In fact, I think it’s much better to simply say so when this is the case.

/rant

fuel

I agree that people rely on anecdotal evidence far too often, especially when they don’t have to; however, the fact is that although scientific evidence is obviously superior, significant anecdotal evidence does absolutely suggest conclusions.  E.g., informally surveying 15 people to try to determine which toothbrush is most commonly used in your area is obviously going to be prone to bias, but it still adds some likelihood to whatever conclusion it brings.  It’s not everything and shouldn’t be completely trusted when there are small sample sizes or possible biases, but it is better than nothing.

The example you give of “I don’t know, but this is what I think and why,” seems to be the best way to go about all of this.  Until actual studies are done on a topic, if people are pressed to make a decision they ought to go with their best reasoning and whatever evidence they can find.

Dave

I agree that people rely on anecdotal evidence far too often, especially when they don’t have to; however, the fact is that although scientific evidence is obviously superior, significant anecdotal evidence does absolutely suggest conclusions.

Thanks for your comment, Fuel. I think, however, that you missed one of my major points. No matter how many anecdotes you have, they will always be tainted with bias. As the old expression goes, no matter how high you stack cow pies, it won’t turn into a bar of gold.
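The cow-pie point can actually be sketched numerically.  In the toy model below (the numbers are my own and purely illustrative, not anyone’s real data), sampling error shrinks as you collect more responses, but a biased collection method converges to the wrong answer no matter how many anecdotes you pile up:

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """95% margin of error for a sample proportion (normal approximation).
    p=0.5 is the worst case; z=1.96 is the 95% confidence z-score."""
    return z * math.sqrt(p * (1 - p) / n)

def biased_estimate(true_rate, over_report):
    """Expected observed rate when players with a trait are over_report
    times as likely to end up in the anecdote pile (self-selection bias)."""
    return true_rate * over_report / (true_rate * over_report + (1 - true_rate))

# Sampling error shrinks with n: an unbiased 15-person survey can be off
# by ~25 percentage points; 5000 respondents narrows that to ~1.4.
print(round(margin_of_error(15), 3))    # 0.253
print(round(margin_of_error(5000), 3))  # 0.014

# Bias does not shrink with n: if 30% of players truly have a trait but
# affected players are twice as likely to write in about it, you expect
# to observe ~46% -- whether you collect 50 anecdotes or 5000.
print(round(biased_estimate(0.30, 2.0), 3))  # 0.462
```

Collecting more anecdotes only makes the biased estimate more precisely wrong: the stack of cow pies gets taller, not more golden.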
