Re: [Harp-L] comb test

"Doug H" <dough.harpl@xxxxxxxxx> · Sat, 28 Aug 2010 19:53:40 -0600

As a mathematician, and a generally science-oriented kinda guy, I sympathize with Gary's concerns but still think JR's post does a pretty good job of clarifying and summing up the situation, especially considering that the audience (harp-l) is a broad selection of folks from all walks of life.

Doug H

----- Original Message ----- 
  From: Gary Warren 
  To: Jonathan Ross ; harp-l@xxxxxxxxxx 
  Sent: Saturday, August 28, 2010 3:25 PM
  Subject: Re: [Harp-L] comb test

  GW
  I'm going to break down and respectfully respond to parts of your
  post (2nd paragraph), some of which are scientifically incorrect.

  JR:
  >But in order to have valid results you need several cases where comb
  >A is compared with comb A.

  GW
  A better test exists if you have blind samples.  But it isn't absolutely
  necessary.  If the respondents don't know that the test doesn't include
  same/same samples, they cannot blindly respond that they heard
  differences every time.

  JR:
  >The test must be blind, and must also
  >have controls.  If the combs are always different, then there's no
  >way to judge if people really can tell--are they hearing actual
  >differences or just perceiving the differences as part of the test
  >process.

  GW
  Tests always include perception.  That's the nature of it.  Removing as much
  bias as possible is preferable, as is using A/A and B/B testing.
  But you didn't see Pepsi going against Pepsi in the Pepsi Challenge
  that was in the ads a few years ago.  And they have to have significant
  proof before they put out the ads (or suffer consequences legally).

  JR:
  >If the amount of perceived difference is the same for two
  >samples of the exact same comb as it is for two different combs, then
  >the conclusion can be made that any differences in the evaluation are
  >not physical but psychological.

  GW
  This is the part that is incorrect.  All you can say is that the respondents
  failed to identify differences in the tests, or failed to validate the
  hypothesis that there was a difference.  You can neither say there is no
  difference nor extrapolate to say the difference is psychological rather
  than physical.  It could be the nature of the respondents, their hearing
  acuity, the ambient noise level in the room, the sound actually being
  different because it was played by human beings, etc.

  The biggest thing I can say in testing is that all you can do is fail
  to prove a hypothesis.  Doing so does not prove the opposite or
  extrapolate to other proofs.  If this were the case, all of the tests
  which happened before we were able to split an atom would have
  "proved" that there could be no atom bomb.  Meaning, the tests
  didn't prove it couldn't be done, they just failed to prove it could.
  Once it was done, well, that changed everything.  If you think of
  sommeliers, coffee tasters, professional tobacco smellers,
  color experts, tea blenders, audiophiles, or violin builders,
  you find those who can ascertain differences beyond what the "normal"
  human is thought to be able to do.

  JR:
  >If they can perceive that it is
  >indeed the same comb being tested, then that is significant.

  Potentially.  If you have a large enough sample and conduct the
  right statistical test on the results.
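
To make "the right statistical test" concrete, here is a minimal sketch of one possibility: an exact one-sided binomial test.  It assumes each trial is a same/different judgment that a purely guessing listener gets right half the time.  The function name and the 14-of-20 figures are made up for illustration; they are not results from the comb test.

```python
from math import comb  # binomial coefficient, unrelated to harmonica combs


def binomial_p_value(correct, trials, chance=0.5):
    """One-sided exact p-value: P(at least `correct` right answers by guessing)."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )


# Hypothetical numbers: a listener gets 14 of 20 trials right.
p = binomial_p_value(14, 20)
print(f"p-value = {p:.4f}")
```

A small p-value says guessing alone would rarely do this well.  A large one only says this particular test did not show a difference.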

  >The
  >evidence given suggested that the players couldn't tell when the same
  >comb was used.

  Or that they could tell when two discretely different combs were being
  used.

  What has been done by another round of this testing is interesting,
  but scientifically it didn't prove that respondents can discern
  someone else playing different combs and identify
  the difference.  Failure to achieve this result in a test will NEVER prove
  that a difference does not exist.  It only fails to prove that it does, which
  is not the same thing (as in the example above).  Such is the nature
  of scientific inquiry.  While people often do choose to flip a test and
  assume the inverse to be true, it is categorically an incorrect thing
  to do.
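
One way to see why a failed test does not show that no difference exists is to ask how often a short test would catch a listener who really can hear something.  The sketch below is only an illustration under assumed numbers: a hypothetical listener who is right 60% of the time, 20 same/different trials, and a 5% cutoff.  None of these figures come from the comb test, and the function names are invented here.

```python
from math import comb  # binomial coefficient, unrelated to harmonica combs


def tail_prob(k_min, n, p):
    """P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


def critical_count(n, alpha=0.05, chance=0.5):
    """Smallest number of correct answers that guessing reaches with probability <= alpha."""
    for k in range(n + 1):
        if tail_prob(k, n, chance) <= alpha:
            return k
    return n + 1  # no count is extreme enough at this alpha


def power(n, true_rate, alpha=0.05, chance=0.5):
    """Probability that a listener with `true_rate` accuracy passes the test."""
    return tail_prob(critical_count(n, alpha, chance), n, true_rate)


# Hypothetical listener who really is right 60% of the time, tested on 20 trials.
print(critical_count(20), round(power(20, 0.6), 3))
```

Under these assumptions the listener needs 15 of 20 correct to pass and would manage that only about one time in eight, so most runs of the test would "fail to prove" a difference that is actually there.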

  So for my silly example:  I'll say that you cannot extract hydrogen
  from H2O in an economically viable way where you expend less energy
  than you obtain by extracting it.  I can say it has never been proven
  to be something that can be done.  I could even shout from the rooftops
  that it will never be so, and perhaps I would be right.  Or perhaps some
  scientist in the future will find a process and we'll not be dependent on
  oil as our world's primary fuel source.  Now I offer you $1000 to prove
  me wrong and give me the cost effective method.  hehehehehe
  But be sure to post it to me off list.

  Thanks for reading.
  GW

  At 01:50 PM 8/28/2010, Jonathan Ross wrote:
  >>Isn't it possible that these testers became frustrated, especially
  >>considering that the brass comb was used 4 times in a row near the
  >>beginning of the test?
  >
  >
  >You have to reuse the same comb several times in a row--it creates a
  >baseline reference for the other results.
  >
  >There may have been serious issues with the test and how it was run
  >and conducted (not being there I can't say).  These would be good
  >reasons to either back out or take the test as honestly as possible
  >and publish a written description of what the tester regards as
  >flaws--preferably asking that the letter be published with the
  >results.  Pre-meditated scuttling of the data was neither mature,
  >helpful nor respectful.
  >
  >
  >>I would have rather seen a test where comb A was compared to comb
  >>B, with many different pairings of combs.
  >
  >
  >But in order to have valid results you need several cases where comb
  >A is compared with comb A.  The test must be blind, and must also
  >have controls.  If the combs are always different, then there's no
  >way to judge if people really can tell--are they hearing actual
  >differences or just perceiving the differences as part of the test
  >process.  If the amount of perceived difference is the same for two
  >samples of the exact same comb as it is for two different combs, then
  >the conclusion can be made that any differences in the evaluation are
  >not physical but psychological.  If they can perceive that it is
  >indeed the same comb being tested, then that is significant.  The
  >evidence given suggested that the players couldn't tell when the same
  >comb was used.
  >
  >
  >
  >JR Ross

  --
  Gary "Indiana" Warren

  "The important thing is not to stop questioning."
                                   Albert Einstein