[Harp-L] SPAH 2010 Comb Test Data Analysis

Vern <jevern@xxxxxxx> · Sun, 29 Aug 2010 15:00:48 -0700

I intended to present an analysis of the data sooner.  I apologize for this being late and for its length.  The discussion on this thread has gotten way ahead of me.  I still have the task of reducing 2+ hours of video to something brief, meaningful and interesting.

DISCLAIMER:
Although I believe Brendan Power will be in general agreement with most of what follows, I don't presume to speak for him.

DEFENSE OF PARTICIPANTS:
First, let me defend all of the participants.  They are all outstanding performers on the harmonica.  Any of them could have refused to participate or walked out in protest.  However, they all stayed and cooperated in what turned out to be a tiresome two-plus hour procedure. I am very grateful to them all for doing so.

It is wrong to accuse any participant of bad faith for scoring all or almost all of the attributes of all the combs 3 (average).   They were asked to record their perceptions.  If all the combs sounded the same to them, then all-3's would be expected.  If there are in fact no perceptible differences in sound attributable to comb material, then all-3's were not only honest but also accurate. 

Having honestly reported their inability to hear differences, they are entitled to their opinions as to WHY they could not do so. They could believe that:
- There are no perceptible differences in sound attributable to comb material.
- There are such differences but the conditions of the test prevented hearing them.  More on this later.

DEFENSE OF THE COMB SEQUENCE:
Second, let me defend myself from the accusation that grouping four plays of the brass combs together was an underhanded attempt to fool/deceive the participants. I had several reasons for doing so:
- I did not have to disassemble and reassemble the harp between those plays.  That eliminated any possibility that I was inconsistent in aligning the parts and tightening the screws.
- It was the only means available for determining to what extent the participants were hearing subjective differences. 

DEFENSE OF PARTICIPANTS HEARING DIFFERENCES IN PLAYS OF THE BRASS COMB:
At this point, let me defend the participants who did report differences among brass combs.  It is human nature (from which I am not exempt) to hear/feel what one wishes, expects, and thinks he ought to hear/feel.  That is why tests of drugs are double-blind so that neither the subjects taking them nor the testers administering them know who is getting the active drug and who is getting the placebo.  This fact of human nature accounts for our not-always-rational choices of cosmetics and beverages.  The tendency to draw conclusions and act on incomplete information may have helped our ancestors survive in the wild.

HISTORY OF THE TEST DESIGN:
Brendan proposed conducting the test, coordinated with SPAH, obtained the harmonica, recruited the participants, provided the spectrogram electronics and proposed the format. The credit is his that the test took place.  I modified the harmonica and combs and made the comb changes, a relatively insignificant contribution.

We made an odd couple to organize the test.  His primary interest was to get the opinions of leading artists about the many materials currently used for combs.  He was not interested in drawing any overall conclusions.  He favored large numbers of comb materials and rating a large number of attributes for each comb.

I was interested in answering the question: "Can the players detect characteristic differences among comb materials?" I preferred using only two materials of radically different properties and asking the players to indicate only whether the current comb was the same as or different from the previous one played.  The final test design was the product of earnest negotiation.

DEFENSE OF THE TEST DESIGN:
Brendan and I exerted every practical effort to create the best possible test conditions. The players' sounds and the overall scene were recorded for future review. The test design and conditions, pictures of the test harp, and the score cards were made public before the test. We made use of my experience on previous tests to anticipate criticisms.  Everything has been transparent.

Maximize the ease with which the players can demonstrate the ability to perceive material differences:
- A quiet room dedicated to the test.
- Plenty of time (over two hours) for a calm and deliberate procedure.
- A high quality harmonica checked by Brendan for playability.
- Recruiting of the world's leading harmonica players...expert artists all.
- Twenty seconds to exercise each comb material in a manner of the player's choice.
- Provision for ten relatively quick (less than 2 minutes) comb changes.

Elimination of spurious variables:
- Use of the same set of reedplates and covers for all plays.
- Same assembler changing combs and making every effort to be consistent with alignment and screw torque.
- Measured distance to the mic for recordings.
- Combs carefully checked for flatness.
- No amplified sound.

Rigorous controls:
- Exposed comb surfaces painted to obscure visible clues.
- Attached weight to mask weight differences between metal and non-metal combs.
- Mint lip balm under the covers to mask characteristic wooden-comb odors.
- Use of ear plugs to minimize the influence of other players.
- Discouragement of conversation among players.
- The test was open to witness by the (SPAH-attending) public.

No test design can be perfect.  However, the conditions and controls were far better than those under which harmonicas are usually played.

VERN'S ANALYSIS OF THE TEST DATA:
There are many ways to analyze the results.  If you would like to check my procedure or take a different approach, I will email my Excel file to you.  It includes the raw data, formulas, and intermediate values for the following procedure:

I divided the data into two sets, Brass Combs and Non-brass combs.  

I calculated the score ranges* for the five attributes for each of the 6 players, for the brass combs. This produced 30 values of range, all for Brass combs.

I did the same for Non-brass combs.

* RANGE = MAX - MIN values for a set of numbers.  In this case RANGE could have values of 0, 1, 2, 3, and 4.

Then I subtracted the Brass ranges from the Non-brass ranges to produce 30 values of range differences.
A large positive value would indicate that the player was hearing more differences among the variety of non-brass combs than among repeated plays of the brass combs. It would indicate the ability of the player to hear differences characteristic of different combs.

A small positive or a negative value would indicate the inability of the player to hear differences, or the tendency to hear subjective differences.  

The average of these 30 difference values is -0.53.  It is consistent with the inability of the players to hear sounds characteristic of comb materials.
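
For anyone who would rather check the arithmetic in code than in Excel, the range-difference procedure described above can be sketched roughly as follows.  This is only an illustration under stated assumptions, not my spreadsheet: the function names, the player and attribute labels, the number of plays, and every score below are made-up placeholders (assuming a 1-to-5 scale), NOT the SPAH test data.

```python
from statistics import mean

def score_range(scores):
    """RANGE = MAX - MIN for one player's scores on one attribute."""
    return max(scores) - min(scores)

def range_differences(brass, non_brass):
    """Non-brass range minus brass range for each (player, attribute) pair."""
    return {
        key: score_range(non_brass[key]) - score_range(brass[key])
        for key in brass
    }

# Hypothetical placeholder scores on an assumed 1-5 scale.
# These are NOT the real test results; labels and play counts are invented.
brass = {
    ("player_A", "attribute_1"): [3, 3, 3, 3],
    ("player_A", "attribute_2"): [2, 4, 3, 3],
    ("player_B", "attribute_1"): [3, 4, 3, 3],
    ("player_B", "attribute_2"): [3, 3, 3, 3],
}
non_brass = {
    ("player_A", "attribute_1"): [3, 3, 3, 3, 3, 3],
    ("player_A", "attribute_2"): [3, 3, 4, 3, 3, 3],
    ("player_B", "attribute_1"): [3, 3, 3, 3, 3, 3],
    ("player_B", "attribute_2"): [2, 3, 4, 3, 3, 3],
}

diffs = range_differences(brass, non_brass)
average_diff = mean(list(diffs.values()))

print(diffs)
print(average_diff)
```

With these placeholder numbers the four differences are 0, -1, -1 and 2, so the average is 0.  The real data set has 6 players and 5 attributes, giving 30 differences instead of 4.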

INTERPRETATION:
Test hypothesis: "Players can perceive differences attributable to comb material."
Null hypothesis: "Players can NOT perceive differences attributable to comb material."

If any player in this test had demonstrated the ability to reliably differentiate among comb materials, the hypothesis would have been proven and the null hypothesis would have been disproven.  This did not occur. 

Theoretically, the POSSIBILITY remains that some yet-undefined combination of people and test conditions could prove the hypothesis.  However, this hypothesis must take its place in an infinity of other unproved hypotheses where it becomes irrelevant.  For example, "There is a teapot orbiting the sun" is possible, unproven, and irrelevant.

The burden of proof remains on the proponents of the hypothesis.  Until that burden is carried, for all practical purposes, the comb-materials effect does not exist.

Thanks for reading this far!

Vern