Three In Ten

I’ll start this post with a disclaimer.  I’m not a mathematician.  My dissertation involved some statistics, and I knew enough to get help from someone more qualified than me.  I learned a lot about how statistics really work from that experience, most notably that I shouldn’t trust my impressions about how the numbers really would end up.  Our intuitions can fool us into thinking we’ve found a pattern when one doesn’t really exist.

Let me offer a hypothetical example related to brass pedagogy.  Let’s say you’re a brass teacher working with beginners and you notice that out of your ten students, three improve with a flat chin embouchure while the others just don’t seem to make it work right.  You find it curious that only 30% can make a flat chin work (after all, that’s what you’ve been told is correct), so you write a method book and point this out.  A graduate student looking for a thesis project reads your book and decides to put together a study to check this.  He tests 90 brass players and finds that 45 of them improve more with instruction that recommends a bunched chin.

A year goes by and someone else decides to replicate this research.  She also tests 90 students and finds that 63 of them, or 70%, improve with a flat chin.  By this time, though, you’ve got a completely new batch of ten students and you’re teaching them to play with a bunched chin.  Of these, 80% play better with a bunched chin.

Let’s take a closer look at these statistics and see what we can conclude.

Year      Flat Chin Instruction    Bunched Chin Instruction
Year 1    3/10 (30%)               45/90 (50%)
Year 2    63/90 (70%)              8/10 (80%)

Looking at these stats we see that in both years, the percentage of players who did better with a bunched chin is higher than the percentage who did better with a flat chin.  In the first year, only 30% of the players who got flat chin instruction improved, compared with 50% who did better with a bunched chin.  In the second year, 70% of the students in one sample improved with flat chin instruction, while 80% of another sample did better with a bunched chin.  So we can conclude that the bunched chin pedagogy is more effective, right?

Well, not quite.  Let’s take a different look at these numbers.  The total ratio of students who improved with flat chin instruction is 66/100, a success rate of 66%.  The bunched chin instruction shows a ratio of 53/100, a success rate of 53%.  In this hypothetical scenario the flat chin instruction is actually better than the bunched chin instruction once the two years are combined.  The reversal comes from the uneven group sizes: 90 of the 100 flat chin students were tested in the second year, when both approaches did well, while 90 of the 100 bunched chin students were tested in the weaker first year.  This is an example of what’s known as Simpson’s paradox.  It’s one example of how our intuition about numbers can fool us into making erroneous assumptions.
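For anyone who wants to check the arithmetic, here is a minimal Python sketch that recomputes the per-year and combined success rates from the table above.  The names `results` and `pooled` are my own, made up for this illustration, and the numbers are the same hypothetical ones used in the example.

```python
from fractions import Fraction

# (improved, total) for each year and instruction type, copied from the table.
# These are the hypothetical numbers from the example, not real study data.
results = {
    "Year 1": {"flat": (3, 10), "bunched": (45, 90)},
    "Year 2": {"flat": (63, 90), "bunched": (8, 10)},
}


def pooled(method):
    """Add up improved and total counts for one instruction type across all years."""
    improved = sum(groups[method][0] for groups in results.values())
    total = sum(groups[method][1] for groups in results.values())
    return improved, total


# Year-by-year comparison: bunched chin comes out ahead both times.
for year, groups in results.items():
    flat_rate = Fraction(*groups["flat"])
    bunched_rate = Fraction(*groups["bunched"])
    print(f"{year}: flat {float(flat_rate):.0%}, bunched {float(bunched_rate):.0%}")

# Combined comparison: pooling the counts reverses the ordering.
flat_improved, flat_total = pooled("flat")
bunched_improved, bunched_total = pooled("bunched")
print(
    f"Pooled: flat {flat_improved}/{flat_total} ({flat_improved / flat_total:.0%}), "
    f"bunched {bunched_improved}/{bunched_total} ({bunched_improved / bunched_total:.0%})"
)
```

Bunched chin wins each year taken separately, yet flat chin wins on the pooled counts (66/100 against 53/100), because each instruction type has most of its students in a different year.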

For the record, most of these numbers are made up, as are the studies (there’s very little actual research like this on brass pedagogy).  However, the scenario (flat chin versus bunched chin) and the initial 3/10 number were not.  I’m not stating here that Smiley’s impressions are wrong, just that without more information his statistics are suspect.

Any time I hear numbers thrown around as evidence for someone’s pedagogical theories I get suspicious.  Often when I dig a little further the numbers don’t make sense.  For example, one of my favorite trombonists writes that there are 300 muscles in the face that affect the embouchure, but that would seem to be a rather inflated number (depending on how you count, there are about 33 to 43 muscles in the face).  Another embouchure expert once emailed me about a sample size of exactly 4736 subjects, but neither provided a control population in her analysis nor addressed the self-selection bias inherent in collecting data via the internet.

We musicians, as a rule, aren’t very well versed in logic and statistics, yet we like to throw around our impressions to lend weight to our theories.  It’s human nature, and it is an important step towards improving our understanding and creativity.  The mistake is making that the final step and never going further to see whether the math actually adds up.
