Bobdog wrote:
I said I wouldn't reply to yeha anymore, but I cannot resist. This is exactly the sort of stupid-, oops, I mean "pseudo-" science I was talking about before.
http://www.pcavtech.com/abx/abx_wire.htm wrote:
PSACS ABX Test Results
Interconnects and Speaker Wires

Comparison                      Result          Correct           p <   Listeners
$2.50 cable vs. PSACS Best      NO DIFFERENCE   70 / 139 = 50%    -     7
$418 Type "T1" vs. Zip Cord     NO DIFFERENCE    4 / 10  = 40%    -     1
Type "Z" Cable vs. Zip Cord     NO DIFFERENCE   70 / 139 = 50%    -     7
$990 "T2" Cable vs. Zip Cord    NO DIFFERENCE   16 / 32  = 50%    -     2
In the first test, five specialty interconnects from AudioQuest, MIT, Monster Cable, H.E.A.R., plus Belden cable with Vampire connectors were compared to a $2.50 blister pack RCA phono interconnect. Listeners used Etymotic Research ER4 in-ear phones driven by the headphone jack of a Bryston 2B power amplifier.
The next three tests are the data from Tom Nousaine's "Wired Wisdom: The Great Chicago Cable Caper", listed on the ABX periodicals page.
The Type "T1" cables were compared on a system including a Sumo Andromeda power amp and JS Engineering Infinite Slope speakers, by the system's owner. He chose his own program material and had no time limit.
The Type "Z" cables were tested on the system of a high end audio shop employee, including: Snell type B-Minor speakers; a Forte Model 6 power amp; and an outboard DAC. He used his own program material, selected to show the differences he expected.
The Type "T2" cables were compared on a system including a Denon DCD-1290 CD player, an Accuphase P-300 power amplifier, and Snell KII speakers.
Here is a set of data from the link yeha suggested; it looks very scientific and conclusive. It is about cables (the most controversial topic here), and it consistently shows that a group of listeners could not tell the difference between hi-end cables and Brand-X cables. Wow, quite a finding, no?
Oops, clicked submit before I meant to...
... I continue. Let’s look at these data. The p-value is the probability of seeing results at least this extreme purely by random chance (e.g. sampling error) if the null hypothesis (the cables are not audibly different) is true; smaller is better. Although they give a “-“ for the p-value, there is some value there, it is just very large, meaning we cannot reject the null at any level of statistical significance. This is where the problems begin.
First of all, ABX massively and incorrectly inflates their “n” (number of listeners) in the “$2.50 cable vs. PSACS Best” test by aggregating all of their test results: note there are only 7 listeners but 139 “tests”. (At first I thought this was just a typo, until I figured out what they were up to; I was thinking, “how do you get 139 observations from 7 people???”) You can, of course, add up observations and test for significance, but ONLY if the observations are independent. By reusing the same 7 testers over and over, they have clearly violated independence. What they really have is several tests with 7 observations each.
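For what it’s worth, even if you did treat the 139 pooled trials as independent coin flips (which, as argued above, they are not), the one-sided exact binomial p-value is easy to compute. A minimal Python sketch (the function name abx_p_value is mine, not from the ABX write-up):

```python
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """One-sided exact binomial p-value: the probability of getting at
    least `correct` right out of `trials` by pure guessing (p = 0.5)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# The pooled "$2.50 cable" result: 70 correct out of 139 trials.
print(abx_p_value(70, 139))   # 0.5 -- exactly what coin-flipping predicts

# The single-listener "T1" test: 4 correct out of 10 trials.
print(abx_p_value(4, 10))     # 0.828125 -- nowhere near significance
```

Either way, the observed scores sit right at chance, so the “-” in the p column hides a p-value of roughly 0.5 or worse.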
Well, you ask, what does this matter? Well, as I am sure Mr. Science, yeha, would tell you, significance depends upon the number of individuals tested. This is easy to see when we note that the standard error of the mean is SE = s / sqrt(n), where s = sqrt( sum(x_i - x_bar)^2 / (n - 1) ) and x_bar is the sample mean; and that the p-value is a function of the SE: for a normal distribution, p = 0.05 corresponds to roughly 2 SE (of course, for n = 7 we must use the slightly more restrictive t-distribution, not the normal distribution). But note that SE decreases in n, so for a very large n statistical significance is easy to find, while for a very small n it is very difficult to obtain. As we generally consider n >= 30 to be “large” (where the large-sample approximations begin to operate), significance at n = 7 is almost incomprehensibly difficult (though not impossible) to achieve.
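You can see the SE shrinking with n numerically. A minimal sketch using the binomial standard error sqrt(p(1-p)/n) at chance (p = 0.5), plus the rough “2 SE above chance” score needed for significance (a crude normal approximation; as noted above, the t-distribution is the right tool at n = 7):

```python
import math

for n in (7, 30, 139):
    # Standard error of a proportion at chance (p = 0.5).
    se = math.sqrt(0.5 * 0.5 / n)
    # Score roughly 2 SE above 50% correct (crude normal approximation).
    threshold = math.ceil(n * (0.5 + 2 * se))
    print(f"n = {n:3d}: SE = {se:.3f}, need about {threshold}/{n} correct")
```

With only 7 trials you need a perfect 7/7 to clear the bar; with 139 independent trials, about 82/139 would do. Small n makes significance brutally hard to reach.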
Given that anything we observe in an n = 7 sample could easily be due to sampling variation, and significance is pretty much unachievable, it is almost laughable that yeha holds up these tests as scientific! Of course we cannot use them to show that cables DO make a difference, but we really cannot tell ANYTHING from these data. (Intuitively, say you tested only seven people and one was deaf. Since nothing like one in seven people in the population is deaf, that one individual would badly skew the results, against hi-quality cable.)
But it gets worse. n = 7 is the LARGEST n used here! One test, “$418 Type "T1" vs. Zip Cord”, has n = 1, that’s right, ONE, listener. Scientific? Hahahahaha. This violates everything statistical: n must be greater than k (the number of variables tested) in order to get ANY results, yet here we have n = 1 and k = 1. Throw this test out.
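One way to see why n = 1 gives you nothing: with a single listener you cannot even estimate between-listener variance, so there are no degrees of freedom left over for any test statistic. Python’s standard library makes the point for us (the scores list is my illustration of the lone "T1" listener):

```python
import statistics

scores = [0.40]  # one listener, 4/10 correct -- a sample of size one

try:
    # Sample variance needs at least two data points (n - 1 >= 1).
    statistics.variance(scores)
except statistics.StatisticsError as err:
    print("No test possible:", err)
```

With n = 1 the n - 1 denominator of the sample variance is zero, which is exactly the n > k requirement in action.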
Finally, we do not know whether these individuals were randomly sampled. Given the embarrassingly amateur nature of the rest of the test, I doubt it. If selection is not random, then, again, we can have no confidence in the results, because we may have a skewed sample drawn from the population.
I am not against double blind testing, but do it competently, for goodness' sake. And yeha, stop posting goofy things like this, it only makes you look foolish… which I am sure you are not, really.
Let’s get n > 30 randomly selected people in a suitable room and do our tests in an actual scientific manner. Those are results I would be interested in seeing.
Again, I came to SPCR to talk computers and I end up talking audio (my OLD hobby) and statistics (one of the main subjects in my PoliSci Ph.D.). Oh well, maybe someone will actually learn something from this.