Like what you have done with the place...

3DSonics · Jun 23, 2011

Folks,

Quite interesting.

Only, scientific method driven...

Now if an Engineer measures 0.0001% THD on an Amp in a laboratory into an 8 Ohm fixed load and no-one hears anything, did he measure 0.0001% THD?

Ciao T

Dev · Jun 23, 2011

Welcome back T. Good to see you here.

sq225917 · Jun 23, 2011

Indeed good to see you back T, always a welcome and knowledgeable voice.

RobHolt · Jun 24, 2011

Great to see Thorsten back and I know he's been busy these past few years.

In answer to his question, yes he measured 0.0001% into that specific load, but that doesn't mean the amp won't give a different result into a different load.
If they can't hear the distortion into the load being driven, it doesn't matter.

I would say that as part of a basic competency spec, the ability to maintain low THD into a reasonable load window is essential.

3DSonics · Jun 24, 2011

Rob,

RobHolt said:
I would say that as part of a basic competency spec, the ability to maintain low THD into a reasonable load window is essential.

How low THD would you suggest and why?

What is the lowest THD that can be heard?

Is single number THD in fact useful to determine audible distortion?

Ciao T

RobHolt · Jun 24, 2011

The audibility depends entirely on the distortion spectrum and type. Single figure THD is useful but not the complete answer certainly.

Plenty of historic research indicates 0.1% as a safe figure, though other research shows that if the distortion is comprised mainly of crossover then even that spec is very marginal.

Where distortion is primarily 2nd harmonic in nature, a few percent is often quite acceptable, whereas 2-3% of crossover would be quite unpleasant!

So I think we have to build in some margin, certainly where SS amps are concerned and you don't set out to make a characterful amplifier. I'd therefore suggest that both THD and IMD are kept below 0.05% for a safe spec. I certainly find that SS amplifiers meeting that criterion, assuming flat response, low noise and other basic competencies are met, are indistinguishable in a blind test.
You can go to 0.00001% but that isn't going to sound any different to one at 0.005% - whatever the composition, IMO.
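
To illustrate the point that a single THD figure hides the distortion spectrum, here is a minimal sketch. The two harmonic spectra are made-up illustrative numbers, not measurements of any amplifier, and THD is taken as the RMS sum of the harmonics relative to the fundamental.

```python
import math

def thd_percent(fundamental, harmonics):
    """THD in percent: RMS sum of harmonic amplitudes over the fundamental."""
    return 100.0 * math.sqrt(sum(h * h for h in harmonics)) / fundamental

# Hypothetical amplitude spectra (fundamental = 1.0).
# Each list holds the 2nd, 3rd, 4th, 5th, 6th and 7th harmonic in that order.
low_order = [0.001, 0.0, 0.0, 0.0, 0.0, 0.0]        # everything in the 2nd harmonic
high_order = [0.0, 0.0, 0.0, 0.0006, 0.0, 0.0008]   # 5th and 7th harmonics only

thd_low = thd_percent(1.0, low_order)
thd_high = thd_percent(1.0, high_order)
print(f"2nd harmonic only:   {thd_low:.3f}% THD")
print(f"5th + 7th harmonics: {thd_high:.3f}% THD")
```

Both spectra come out at the same single figure, so the number alone cannot say which kind of distortion is present.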

RobHolt · Jun 24, 2011

Put something up in the Audio room, or put your position and folk can chime in.

3DSonics · Jun 24, 2011

Hi,

RobHolt said:
The audibility depends entirely on the distortion spectrum and type. Single figure THD is useful but not the complete answer certainly.

Plenty of historic research indicates 0.1% as a safe figure, though other research shows that if the distortion is comprised mainly of crossover then even that spec is very marginal.

I think Geddes/Lee, and D.E.L. Shorter and Olson before them, showed that single number THD is useless.

We have numbers from "3% 2nd HD" on one side to much less than "0.05%" on the other side. I can suggest that even more than 3% can be inaudible and even less than 0.05% can be audible.

I would also suggest that the "low measured THD = high fidelity" myth is already thoroughly busted. Which does not stop it from being used widely.

This is my problem with the "scientific method driven" approach. Most of the supposed "science" in audio is just a bunch of marketing myths with a near complete absence of evidence supporting it (this goes as far as the ABX testing).

So we essentially first need to pull this whole edifice of poor science apart and rebuild it, if we want to apply such a "scientific method driven" approach. I doubt we here have the necessary resources to do so.

So I would suggest that what is needed, at least, is an approach that considers the supposed "scientific method driven" approach but with a very large pinch of salt and the willingness to correct or even abandon supposed scientific precepts when they are shown to be highly questionable or false.

Ciao T

RobHolt · Jun 24, 2011

I would agree that ultimately, chasing ever lower distortion figures is pointless. Just as eliminating response or phase variations beyond certain points becomes a case of 'I can' rather than 'I need to'.
That is pure specmanship, I agree.

But we have to establish benchmarks that apply in most cases. Yes, we can show that under certain situations THD at 3% is inaudible, but that is a very rare situation and not applicable to most amplifiers, which use circuits barely modified from decades-old standard designs.

If established thresholds for audibility are wrong - and that rather depends on whose work you use as a reference - then I agree that they should be challenged.
For me the clincher is always the unsighted test, the results of which suggest to me that most, let's call them 'errors', found in modern SS audio sit below audibility.

To be fair, I don't think that many claim that 'low THD = high fidelity' - I certainly wouldn't make any such claim, but it most certainly does influence fidelity. The acceptability of such influence is of course entirely subjective.

3DSonics · Jun 24, 2011

Hi,

RobHolt said:
For me the clincher is always the unsighted test, the results of which suggest to me that most, let's call them 'errors', found in modern SS audio sit below audibility.

Blind tests of the kind commonly performed in audio are subject to a range of influences that militate against their ability to reliably distinguish medium and small audible phenomena.

Another way of saying this is that these tests by their very nature are subject to large expectation bias (aka placebo/nocebo), use statistical principles with a large bias against detecting small differences, and often involve quite large amounts of "test stress" (the same kind of stress that seems to wipe one's mind clean of stuff in school/uni tests that one knew clearly the day before).

All of these factors mean such tests will have to be revised to make them sensitive to the phenomena, or be accepted as flawed and lacking statistical significance and power (aka "waste of time").

Ciao T
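
The statistical point about significance and power can be put into numbers with a small sketch. It assumes a 16-trial ABX session scored with a one-sided binomial test at the 5% level, and a listener modelled as having a fixed "true" hit rate. The trial count and hit rates are illustrative assumptions, not figures from any test mentioned in this thread.

```python
from math import comb

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def pass_mark(n, alpha=0.05):
    """Smallest number of correct answers whose probability under guessing is <= alpha."""
    for k in range(n + 1):
        if binom_tail(n, k, 0.5) <= alpha:
            return k
    return None  # n is too small for any score to be significant

n_trials = 16
k_needed = pass_mark(n_trials)

# Power = chance that a listener with the given true hit rate reaches the pass mark.
for true_rate in (0.6, 0.7, 0.9):
    power = binom_tail(n_trials, k_needed, true_rate)
    print(f"true hit rate {true_rate:.0%}: need {k_needed}/{n_trials}, power {power:.2f}")
```

Under these assumptions, a listener who genuinely hears the difference 60% of the time fails to reach the pass mark more often than not, so a null result from one short session says little about small differences.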

RobHolt · Jun 24, 2011

3DSonics said:
Hi,

Blind tests of the kind commonly performed in audio are subject to a range of influences that militate against their ability to reliably distinguish medium and small audible phenomena.

Another way of saying this is that these tests by their very nature are subject to large expectation bias (aka placebo/nocebo), use statistical principles with a large bias against detecting small differences, and often involve quite large amounts of "test stress" (the same kind of stress that seems to wipe one's mind clean of stuff in school/uni tests that one knew clearly the day before).

All of these factors mean such tests will have to be revised to make them sensitive to the phenomena, or be accepted as flawed and lacking statistical significance and power (aka "waste of time").

Ciao T

They aren't perfect, but I'd argue far more reliable than sighted testing, which introduces far more destructive barriers.

However, I'd argue that all sighted testing is a waste of time, purely because so many known variables haven't been removed.

ABX can and does produce positive results.
Look at the codec listening testing on Hydrogen Audio as an example.
Or Krueger's ABX testing of power amplifiers, where null results are achieved with a single signal pass but clear differences heard when the signal passed multiple times. In other words, make the difference truly significant and a blind test can quite easily reveal it.

Of course the blind test conditions should be designed so as to encourage a positive result, and you have to ensure that the test system isn't skewed to favour one particular product. Precautions around speaker loading on amps under test, termination impedance for cables etc. You have to ensure you are testing the product and not the interface.

3DSonics · Jun 25, 2011

Hi,

RobHolt said:
Of course the blind test conditions should be designed so as to encourage a positive result,

Any suggestions how this may be achieved? Maybe you could suggest some documented blind tests that were carried out like that?

Ciao T

RobHolt · Jun 25, 2011

3DSonics said:
Hi,

Any suggestions how this may be achieved? Maybe you could suggest some documented blind tests that were carried out like that?

Ciao T

I would refer you to the Hydrogen codec tests and also Krueger's multiple pass amplifier tests. Both identify differences and the latter test is designed to magnify non audible differences until they become audible.
If you aren't familiar with his testing, Krueger found that listeners very often couldn't differentiate between solid state power amplifiers under blind conditions. However, he discovered that differences started to emerge if you made multiple passes of the signal through the amp. Some amps could be identified after only two passes, others after four or five. Essentially his argument runs that good SS power amps sound the same when operating within stated limits because their errors are below audibility. Only when those errors were magnified did they become audible.
This harks back to the days when Peter Walker claimed that you could daisy-chain many power amplifiers, suitably loaded and padded, and listeners would struggle to tell the difference between one and multiple amps.

However, by encouraging a positive result I mean designing a test that satisfies a number of criteria, including but not limited to:

- Fully understanding the interface(s). The interface must be benign to the equipment under test. So for example, if testing interconnects you wouldn't use a passive pot preamp and power amp with significant capacitance on the load. You end up testing the interface rather than the cable, as I mentioned earlier. Ditto amplifiers driving heavily reactive loads. Amplifiers with high output impedance are likely to sound quite different into such a load, where you might well get a 'no difference' situation into a more typical loading. Of course such a test would be excellent if you were actually investigating the effects of output impedance.
Just a couple of examples.

- The participants should ideally have heard the products under sighted conditions before the blind test. This shows that A, they are open to accept differences and B, the sighted/unsighted results can be compared.

- The conditions should be as 'normal' as possible, ie not 'lab' conditions which would be quite alien to most listeners.

For starters at least.

We can go on and refine, and please add your own thoughts as I'd like to read them.
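
The multiple-pass idea described above can be sketched numerically. This is only a toy model: the "amplifier" is an assumed memoryless stage with small 2nd- and 3rd-order terms (the coefficients `k2` and `k3` are invented for illustration), and it is not a description of how Krueger's tests were actually run.

```python
import math

def amp(x, k2=0.001, k3=0.001):
    """Hypothetical weakly nonlinear stage: unity gain plus small 2nd/3rd-order terms."""
    return x + k2 * x * x + k3 * x ** 3

def thd_percent(samples, cycles, max_harmonic=10):
    """THD from a direct DFT evaluated at the fundamental and harmonics 2..max_harmonic."""
    n = len(samples)

    def magnitude(h):
        w = 2 * math.pi * h * cycles / n
        re = sum(s * math.cos(w * i) for i, s in enumerate(samples))
        im = sum(s * math.sin(w * i) for i, s in enumerate(samples))
        return 2 * math.hypot(re, im) / n

    fundamental = magnitude(1)
    harmonics = [magnitude(h) for h in range(2, max_harmonic + 1)]
    return 100 * math.sqrt(sum(m * m for m in harmonics)) / fundamental

N, CYCLES = 1024, 8  # a whole number of cycles, so no window is needed
signal = [0.5 * math.sin(2 * math.pi * CYCLES * i / N) for i in range(N)]

results = []
for passes in range(1, 6):
    signal = [amp(x) for x in signal]  # feed the output back through the same stage
    results.append(thd_percent(signal, CYCLES))
    print(f"{passes} pass(es): THD = {results[-1]:.4f}%")
```

In this model the distortion grows with each pass, which is the mechanism by which an error below audibility in a single pass could be pushed above it. Real amplifiers have frequency-dependent behaviour that this sketch ignores.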

3DSonics · Jun 25, 2011

Hi,

RobHolt said:
I would refer you to the Hydrogen codec tests and also Krueger's multiple pass amplifier tests. Both identify differences and the latter test is designed to magnify non audible differences until they become audible.
If you aren't familiar with his testing, Krueger found that listeners very often couldn't differentiate between solid state power amplifiers under blind conditions. However, he discovered that differences started to emerge if you made multiple passes of the signal through the amp. Some amps could be identified after only two passes, others after four or five. Essentially his argument runs that good SS power amps sound the same when operating within stated limits because their errors are below audibility. Only when those errors were magnified did they become audible.

The HA codec test I missed; I will have a look for it.

Knowing that Arny "ABX Mafia Don" Krueger was involved suggests that the ABX protocol was used. In this case my earlier objections stand in full.

Actually, if I see the name of one of the ABX Mafia linked to a blind test, I toss it out straight away as an excessive "null result" test.

RobHolt said:
This harks back to the days when Peter Walker claimed that you could daisy-chain many power amplifiers, suitably loaded and padded, and listeners would struggle to tell the difference between one and multiple amps.

This test is not really relevant. We used it a long time ago in East Germany and found that we could not detect high quality (studio grade) audio transformers until we daisy-chained at least 5-8, while any active stage we had available (including my own design, which measured much better than any transformer) needed only one. Monitors were Schulze TH315 and studio grade tube amps (class AB PP) with EQ for the speakers designed into the feedback loop.

RobHolt said:
However, by encouraging a positive result I mean designing a test that satisfies a number of criteria, including but not limited to:

- Fully understanding the interface(s). The interface must be benign to the equipment under test. So for example, if testing interconnects you wouldn't use a passive pot preamp and power amp with significant capacitance on the load. You end up testing the interface rather than the cable, as I mentioned earlier. Ditto amplifiers driving heavily reactive loads. Amplifiers with high output impedance are likely to sound quite different into such a load, where you might well get a 'no difference' situation into a more typical loading. Of course such a test would be excellent if you were actually investigating the effects of output impedance.
Just a couple of examples.

- The participants should ideally have heard the products under sighted conditions before the blind test. This shows that A, they are open to accept differences and B, the sighted/unsighted results can be compared.

- The conditions should be as 'normal' as possible, ie not 'lab' conditions which would be quite alien to most listeners.

For starters at least.

I do not think that your suggestions really address the main problems.

Most of these do not relate to technology or "normal" conditions, but to the same problems ALL blind tests (including those in pharmacy and medicine) face.

There is a reason why ALL blind tests outside audio employ large numbers of subjects. And these reasons do not suddenly disappear (outside audio mythology of course) because we are suddenly dealing with audio.

So, I would like to see more that addresses the underlying statistics and the related problems that relate to the use of small sample sizes (there are ways around this that do not need lecture theatres full of subjects) as well as the placebo/nocebo problem that "blind tests" fail to avoid.

Ciao T
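
The sample size question can be made concrete by asking how many trials are needed before a given true hit rate is likely to be detected. The sketch below is one way to estimate that. It assumes forced-choice trials that are independent and can be pooled (whether from one listener or several), a one-sided binomial test at the 5% level, and a target power of 80%. All of these are conventional choices made for illustration, not something specified in the thread.

```python
from math import comb

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def pass_mark(n, alpha):
    """Smallest k with P(X >= k | guessing) <= alpha, or None if even n/n is not enough."""
    tail, k = 0.0, n + 1
    while k > 0:
        bigger = tail + comb(n, k - 1) * 0.5 ** n
        if bigger > alpha:
            break
        tail, k = bigger, k - 1
    return k if k <= n else None

def trials_needed(true_rate, alpha=0.05, target_power=0.8, max_trials=500):
    """Smallest trial count at which the test first reaches the target power."""
    for n in range(1, max_trials + 1):
        k = pass_mark(n, alpha)
        if k is not None and binom_tail(n, k, true_rate) >= target_power:
            return n
    return None  # not reachable within max_trials

for rate in (0.9, 0.7, 0.6):
    print(f"true hit rate {rate:.0%}: about {trials_needed(rate)} trials")
```

The required number of trials climbs steeply as the true hit rate approaches chance, which is why obvious differences show up in a handful of trials while marginal ones need far more data than a single short session provides.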

RobHolt · Jun 25, 2011

3DSonics said:
There is a reason why ALL blind tests outside audio employ large numbers of subjects. And these reasons do not suddenly disappear (outside audio mythology of course) because we are suddenly dealing with audio.

So, I would like to see more that addresses the underlying statistics and the related problems that relate to the use of small sample sizes (there are ways around this that do not need lecture theatres full of subjects) as well as the placebo/nocebo problem that "blind tests" fail to avoid.

Ciao T

It depends entirely what you are trying to achieve.

I agree that a large sample is desirable - larger the better - but you have to draw a line at what is practical. For our purposes we can only aim for better testing, not perfect testing but I would maintain that even the most basic of blind tests, even with just two participants, is preferable to sighted testing and more likely to deliver a true result.
If you have sighted, followed by unsighted listening tests and you ensure that you change nothing else, the unsighted test has to be superior, whatever the sample size.

I would however pick up your point about small sample sizes and audio, because again it depends entirely what you are trying to establish, a general rule to be applied across the industry (in which case you need large samples and many sessions) or an individual or small group experience (where you don't).
The blind test result only matters to the individual listener. So you might take such a test and reliably detect differences, while I might fail the same test. If the products I'm testing are expensive amplifiers, well, I've just saved myself some cash, while you have to open the wallet.

Excessive 'null result' test results might just mean there is nothing to hear......

Doesn't mean the test was wrong.

3DSonics · Jun 25, 2011

Hi,

RobHolt said:
I agree that a large sample is desirable - larger the better - but you have to draw a line at what is practical.

Why?

We are talking science here. If what is practical to do cannot be relied upon to provide trustworthy results, why do it and make claims about it?

RobHolt said:
For our purposes we can only aim for better testing, not perfect testing but I would maintain that even the most basic of blind tests, even with just two participants, is preferable to sighted testing and more likely to deliver a true result.

Actually, either test, especially if organised as ABX, will be useless, pointless and a waste of time, statistically speaking. And statistically speaking, while the blind test may be marginally more likely (or quite possibly not) to return a "true result", the increase in likelihood still leaves a gap bigger than that between here and Proxima Centauri.

One could change the protocol and statistical method to dramatically reduce the gap (e.g. go from ABX to double, or better triple, blind preference tests with questionnaires that yield additional info, possibly even direct brainwave observation), at least if one wanted to know what is really happening.

RobHolt said:
The blind test result only matters to the individual listener. So you might take such a test and reliably detect differences, while I might fail the same test. If the products I'm testing are expensive amplifiers, well, I've just saved myself some cash, while you have to open the wallet.

Well, I never do DB Testing of "expensive amplifier vs. cheap amplifier" (actually it is the kind of test beloved by debunkers and pointless in any extent) nor would I buy an expensive Amp that "won" a blind test.

Moreover, what if the direct "ABX" identification returns a "null" but a following (otherwise identical) preference test shows a marked (and statistically significant) preference for one of the two items (which previously could not be distinguished)? Where do we find ourselves?

Ciao T

RobHolt · Jun 25, 2011

Science indeed but not perfection - you can't have that.

As to 'why', well, simple practicalities such as time, effort and money. I only argue that blind testing is superior to sighted, not that it ensures a perfect result every time. If you have the budget of a multinational plc and need to produce results on which the reputation and shareholder value of your company stand, go ahead and spend a fortune on a test in which every conceivable variable has been eliminated. You can also have a huge sample.

You fall into the classic trap of making something that is essentially very simple into something that is needlessly complex, and with respect, what you propose is a world away from the requirements of simply comparing two items of audio equipment at home.

So let's go back to practical examples.
We compare two cables in a sighted test and ask say three participants to give their responses. All hear clear differences.
Changing nothing in the system or room, you repeat the session with the cable identities disguised and again take the responses.
If the differences are indeed real, those differences should be heard in both tests and the comments via each test should match. If they don't, well it is highly likely that factors other than sound have influenced the result.

That's it - it really is that simple.
Could you go global and claim that 'all cables sound the same' - absolutely not.
Have we shown that what the participants heard initially is likely to be false?
Yes we have.

3DSonics said:
Moreover, what if the direct "ABX" identification returns a "null" but a following (otherwise identical) preference test shows a marked (and statistically significant) preference for one of the two items (which previously could not be distinguished)? Where do we find ourselves?

We find ourselves questioning one or other of the tests.
Or it could simply be that one test used less than perceptive listeners.
Which is why I said results have to be fairly parochial unless the test is conducted on a grand scale, with the resources that would require.

Markus S · Jun 26, 2011

Here's what I wrote elsewhere recently:

Double-blind testing is an excellent tool. Like all tools, however, it must be fit for the job. For a double-blind test to be credible, I would want to be convinced that it not only eliminates false positives but also false negatives. I.e., I'd like to see some evidence that the DBT set-up (including the participants) is sensitive enough that small differences that should be (just about) audible according to accepted audio wisdom will indeed be detected. Then and only then will I accept a specific DBT's null result for, say, cable testing as valid.

3DSonics · Jun 26, 2011

Hi Markus,

Markus S said:
Double-blind testing is an excellent tool. Like all tools, however, it must be fit for the job. For a double-blind test to be credible, I would want to be convinced that it not only eliminates false positives but also false negatives. I.e., I'd like to see some evidence that the DBT set-up (including the participants) is sensitive enough that small differences that should be (just about) audible according to accepted audio wisdom will indeed be detected. Then and only then will I accept a specific DBT's null result for, say, cable testing as valid.

Seconded in full.

Ciao T

nando · Jun 26, 2011

Excuse me, but surely the test of any audio product is definitely up to the individual ear, rather than being based on specs?
