Empathy List Archives

Attila Kinali

Wed, Jan 4, 2017 8:12 PM

Hi,

A small detail caught my eye when reading a paper that informally
introduced ADEV. In statistics, when calculating a variance over
a sample of a population, the sum of squares is divided by (n-1) (denoted by s in
statistics) instead of (n) (denoted by σ) in order to account for a small bias
that the "standard" variance introduces
(cf. https://en.wikipedia.org/wiki/Unbiased_estimation_of_standard_deviation ).
In almost all the literature I have seen, ADEV is defined using an average,
i.e. dividing by (n), and very few use (n-1).

My question is two-fold: Why is (n) being used even though it's known
to be a biased estimator? And why do people not use s when using (n-1)?

		Attila Kinali

--
It is upon moral qualities that a society is ultimately founded. All
the prosperity and technological sophistication in the world is of no
use without that foundation.
-- Miss Matheson, The Diamond Age, Neil Stephenson


Magnus Danielson

Wed, Jan 4, 2017 9:13 PM

Hi Attila,

On 01/04/2017 09:12 PM, Attila Kinali wrote:

> Hi,
>
> A small detail caught my eye when reading a paper that informally
> introduced ADEV. In statistics, when calculating a variance over
> a sample of a population, the sum of squares is divided by (n-1) (denoted by s in
> statistics) instead of (n) (denoted by σ) in order to account for a small bias
> that the "standard" variance introduces
> (cf. https://en.wikipedia.org/wiki/Unbiased_estimation_of_standard_deviation ).
> In almost all the literature I have seen, ADEV is defined using an average,
> i.e. dividing by (n), and very few use (n-1).
>
> My question is two-fold: Why is (n) being used even though it's known
> to be a biased estimator? And why do people not use s when using (n-1)?

First of all, you need to keep the number of phase samples (N) and the
number of frequency samples (M) separate.

As you differentiate the phase samples, you lose the phase offset (bias)
from the samples, so the remaining degrees of freedom become one less. This
is the same as the number of frequency samples, so any average will be over
(N-1), which is the number of frequency samples M, so M=N-1 is motivated
both ways.

Now, as you do an Allan deviation/variance estimator, you take a second
difference, so the frequency offset also gets differenced out, and
another degree of freedom is lost, so as you average you have only M-1
drift estimates, which is what you average over, or N-2.

The ADEV core function is just the square of the second difference of phase,
and then you do an ensemble average over those squares.

No wonder the formulas become like these:
https://en.wikipedia.org/wiki/Allan_variance#Fixed_.CF.84_estimators
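For reference, a sketch of the non-overlapping fixed-τ estimators in the notation used above. The symbols are assumptions for illustration: x_i are the N phase (time-error) samples spaced τ apart, and ȳ_i are the M = N-1 fractional-frequency averages derived from them. In both forms the divisor is simply the number of terms in the sum, N-2 = M-1:

```latex
\bar{y}_i = \frac{x_{i+1} - x_i}{\tau}, \qquad i = 1, \dots, M = N-1

\sigma_y^2(\tau) \approx \frac{1}{2\,(M-1)} \sum_{i=1}^{M-1} \left( \bar{y}_{i+1} - \bar{y}_i \right)^2
                 = \frac{1}{2\,(N-2)\,\tau^2} \sum_{i=1}^{N-2} \left( x_{i+2} - 2x_{i+1} + x_i \right)^2
```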

There is nothing magic really.

A hint for the use of s: consider the frequency stability. See Allan 1966.

Cheers,
Magnus


Tom Van Baak

Thu, Jan 5, 2017 12:26 AM

Hi Attila,

The plain ADEV calculation is essentially a measure of unexpected or unwanted drift in frequency; which is the 1st difference of frequency error; the 2nd difference of phase error; the 3rd difference in clock time itself.

When measuring the quality of a clock, the key idea is that initial phase doesn't matter (you can always manually set the time), and even initial frequency doesn't matter (you can often adjust the rate: whether pendulum, quartz or atomic clock), and so a more honest measure of intrinsic timekeeper stability is its ability to maintain frequency; that is, statistically speaking, the lower the change in frequency, tau to tau, the better. Change in frequency is frequency drift.

If you have N phase samples, you get N-1 frequency samples and N-2 drift samples. The standard ADEV calculation is simply based on the mean of those drift samples. (and you know Hadamard takes this one step deeper).

If you look at the code at http://leapsecond.com/tools/adev_lib.c you'll see I avoid the confusing issue of N-1, N, N+1 and simply count the number of terms in the rms sum. Not only does that give the correct result but IMHO it makes it clear what is being averaged. The code passes the official NBS ADEV sample suite, agrees with Bill's Stable32, is used in John's TimeLab, and also Mark's Lady Heather.
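The "count the terms" idea can be sketched as follows. This is a minimal illustration, not a transcription of adev_lib.c: the function name `adev_from_phase`, the parameters `tau0` and `m`, and the choice of the overlapping estimator are all assumptions made for the example.

```python
import math


def adev_from_phase(x, tau0, m=1):
    """Allan deviation at tau = m * tau0 from phase samples x (in seconds).

    Hypothetical helper for illustration: overlapping estimator, with the
    divisor taken from a running count of the terms actually summed.
    """
    tau = m * tau0
    total = 0.0
    terms = 0
    for i in range(len(x) - 2 * m):
        # Second difference of phase over a span of 2 * tau.
        d2 = x[i + 2 * m] - 2.0 * x[i + m] + x[i]
        total += d2 * d2
        terms += 1  # no N, N-1 or N-2 bookkeeping: just count the terms
    if terms == 0:
        raise ValueError("need at least 2*m + 1 phase samples")
    return math.sqrt(total / (2.0 * terms * tau * tau))
```

With m = 1 the count comes out as N-2, matching the drift-sample count above. As a sanity check, phase data with pure linear frequency drift D (x = D·t²/2) should give D·τ/√2 at every τ.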

I've never quite understood the pedantic separation of "sample" and "population" mean that statistics textbooks and academics love to discuss. They clearly have never measured oscillators. In my experience if you think there's an important difference between N and N-1, then that's nature's way of telling you to go back to sleep and wait until tomorrow when you have more data. If your N is too small your ADEV wanders all over the place (TimeLab is good at displaying this in real-time) -- meaning that the distinction between sample (n-1) and population (n) mean is beyond ridiculous; even if there's a "correct" textbook answer.

/tvb

----- Original Message -----
From: "Attila Kinali" attila@kinali.ch
To: "Discussion of precise time and frequency measurement" time-nuts@febo.com
Sent: Wednesday, January 04, 2017 12:12 PM
Subject: [time-nuts] σ vs s in ADEV

time-nuts mailing list -- time-nuts@febo.com
To unsubscribe, go to https://www.febo.com/cgi-bin/mailman/listinfo/time-nuts
and follow the instructions there.


Magnus Danielson

Thu, Jan 5, 2017 11:33 AM

Hi,

On 01/05/2017 01:26 AM, Tom Van Baak wrote:

> Hi Attila,
>
> The plain ADEV calculation is essentially a measure of unexpected or
> unwanted drift in frequency; which is the 1st difference of frequency
> error; the 2nd difference of phase error; the 3rd difference in clock
> time itself.

ADEV is thus sensitive to linear drift, which becomes a limiting factor
for higher tau.

I can't see how clock time itself would integrate from phase. The time
of a clock is just an enumeration of phase. Phase is often presented as
wrapped phase, but if you enumerate the cycles it is still just phase with
larger numbers, so ADEV is still just a 2nd difference away, not a 3rd. It's
actually the time x that is being used, not phase.

> When measuring the quality of a clock, the key idea is that initial
> phase doesn't matter (you can always manually set the time), and even
> initial frequency doesn't matter (you can often adjust the rate:
> whether pendulum, quartz or atomic clock), and so a more honest
> measure of intrinsic timekeeper stability is its ability to maintain
> frequency; that is, statistically speaking, the lower the change in
> frequency, tau to tau, the better. Change in frequency is frequency
> drift.

Due to the second difference, phase offset and frequency offset do not
affect the ADEV. Similarly, for a frequency measurement, which is the first
difference, phase offset does not affect the frequency estimate.
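A small sketch of that invariance, using made-up integer phase values so the comparisons are exact. The helper names and the data are hypothetical, chosen only for illustration:

```python
def first_differences(x):
    """Successive differences: proportional to frequency."""
    return [x[i + 1] - x[i] for i in range(len(x) - 1)]


def second_differences(x):
    """Differences of differences: the quantity ADEV squares and averages."""
    return [x[i + 2] - 2 * x[i + 1] + x[i] for i in range(len(x) - 2)]


# Arbitrary phase record in integer units (made-up numbers).
phase = [0, 3, 1, 4, 1, 5, 9, 2, 6]

# Same record with a constant phase offset added.
offset_only = [v + 100 for v in phase]

# Same record with a phase offset plus a constant frequency offset (a ramp).
offset_and_slope = [v + 100 + 7 * i for i, v in enumerate(phase)]

# A phase offset disappears after one difference.
print(first_differences(offset_only) == first_differences(phase))

# A frequency offset survives the first difference (every term shifts by 7)...
print(first_differences(offset_and_slope) == first_differences(phase))

# ...but both offsets disappear after the second difference.
print(second_differences(offset_and_slope) == second_differences(phase))
```

The three comparisons print True, False and True. Linear frequency drift, by contrast, is quadratic in phase and survives the second difference.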

> If you have N phase samples, you get N-1 frequency samples and N-2
> drift samples. The standard ADEV calculation is simply based on the
> mean of those drift samples. (and you know Hadamard takes this one
> step deeper).
>
> If you look at the code at http://leapsecond.com/tools/adev_lib.c
> you'll see I avoid the confusing issue of N-1, N, N+1 and simply
> count the number of terms in the rms sum. Not only does that give the
> correct result but IMHO it makes it clear what is being averaged. The
> code passes the official NBS ADEV sample suite, agrees with Bill's
> Stable32, is used in John's TimeLab, and also Mark's Lady Heather.

The NIST 1000-point test-suite in NIST SP 1065 is recommended these days
as a test sequence. That's what I used to test all my implementations.

> I've never quite understood the pedantic separation of "sample" and
> "population" mean that statistics textbooks and academics love to
> discuss. They clearly have never measured oscillators. In my
> experience if you think there's an important difference between N and
> N-1, then that's nature's way of telling you to go back to sleep and
> wait until tomorrow when you have more data. If your N is too small
> your ADEV wanders all over the place (TimeLab is good at displaying
> this in real-time) -- meaning that the distinction between sample
> (n-1) and population (n) mean is beyond ridiculous; even if there's a
> "correct" textbook answer.

For starters, traditional statistics textbooks only deal with white-noise
disturbances. What we do in the space of ADEV and friends is much
more complex. Traditional textbooks can get us up to speed with some of
the basics, but as soon as flicker gets involved we are doomed. The
integration in the oscillator loop then gives support for four noise
forms, which is quite different.

So, the (n-1) versus (n) issue is relevant when n is small and you have
white-noise measurements. Compared to ADEV and friends, there you already
get the full degrees of freedom and estimating them is trivial: it's (n-1),
which is why this is the divisor to use for standard deviation/variance.
That you can lose degrees of freedom due to how the noise interacts with
the estimator is well beyond the textbooks. As you study these tools
more deeply, you essentially study advanced statistical methods.
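To put a number on "relevant when n is small", here is a sketch of the plain textbook case only. The helper names and the data values are made up for illustration, and it says nothing about how flicker or other non-white noise changes the degrees of freedom:

```python
def sum_sq_dev(data):
    """Sum of squared deviations from the sample mean."""
    mean = sum(data) / len(data)
    return sum((v - mean) ** 2 for v in data)


def variance_n(data):
    """Divide by n (the sigma convention)."""
    return sum_sq_dev(data) / len(data)


def variance_n_minus_1(data):
    """Divide by n - 1 (the s convention)."""
    return sum_sq_dev(data) / (len(data) - 1)


for n in (5, 50, 5000):
    data = [float(i % 7) for i in range(n)]  # arbitrary made-up values
    ratio = variance_n(data) / variance_n_minus_1(data)
    # The two conventions differ by exactly the factor (n - 1) / n.
    print(n, ratio, (n - 1) / n)
```

The two divisors differ by the factor (n-1)/n whatever the data: 20% in the variance at n = 5, and 0.02% at n = 5000.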

After studying that I've become more particular about saying things like
estimator, bias functions, degrees of freedom and confidence intervals.

As the noise forms work against us, we have to work hard to get high
degrees of freedom for part of a measure, so that the confidence
intervals shrink. To do that we either measure longer or use
another estimator with better performance. Some of these measures
introduce biases, but those can be worked out and compensated for, often
without too much effort.

Terms like deviation, variance, degrees of freedom, confidence interval
and estimator are best learned in traditional statistics first. Then
you need to do the follow-up course for non-white noise statistics.

Cheers,
Magnus


Bob Camp

Thu, Jan 5, 2017 12:19 PM

Hi

> On Jan 5, 2017, at 6:33 AM, Magnus Danielson <magnus@rubidium.dyndns.org> wrote:
>
> Hi,
>
> On 01/05/2017 01:26 AM, Tom Van Baak wrote:
>> Hi Attila,
>>
>> The plain ADEV calculation is essentially a measure of unexpected or
>> unwanted drift in frequency; which is the 1st difference of frequency
>> error; the 2nd difference of phase error; the 3rd difference in clock
>> time itself.
>
> ADEV is thus sensitive to linear drift, which becomes a limiting factor for higher tau.

Which is why the standard verbal description of ADEV always includes the qualifier
“drift corrected”. If drift is not removed from the data, ADEV is not doing what it should.
This gets overlooked when we take ADEV straight off of a cool piece of gear that is unable
to properly / automatically remove the drift.

Bob



William H. Fite

Thu, Jan 5, 2017 5:27 PM

Professional statistician here.

Your explanation is clear and lucid, in contrast to some earlier attempts
here. I agree that with oscillators the distinction between N and N-1 is
not particularly relevant. I must caution you, however, not to be too
dismissive of the difference between the two. There are excellent reasons
for the "pedantic" distinction between samples and populations, especially
in small-sample work and when extrapolations are involved. A mistake by
NASA between N and n might mean putting the lander on Mars or missing it by
100,000 km. You may count that distinction "beyond ridiculous" but it isn't
going away because, for many applications, it is absolutely critical.

Many thanks for your invaluable contributions in your field over the years.

Bill (PhD, as if that mattered)

On Wednesday, January 4, 2017, Tom Van Baak <tvb@leapsecond.com> wrote:

> Hi Attila,
>
> The plain ADEV calculation is essentially a measure of unexpected or
> unwanted drift in frequency; which is the 1st difference of frequency
> error; the 2nd difference of phase error; the 3rd difference in clock time
> itself.
>
> When measuring the quality of a clock, the key idea is that initial phase
> doesn't matter (you can always manually set the time), and even initial
> frequency doesn't matter (you can often adjust the rate: whether pendulum,
> quartz or atomic clock), and so a more honest measure of intrinsic
> timekeeper stability is its ability to maintain frequency; that is,
> statistically speaking, the lower the change in frequency, tau to tau, the
> better. Change in frequency is frequency drift.
>
> If you have N phase samples, you get N-1 frequency samples and N-2 drift
> samples. The standard ADEV calculation is simply based on the mean of those
> drift samples. (and you know Hadamard takes this one step deeper).
>
> If you look at the code at http://leapsecond.com/tools/adev_lib.c you'll
> see I avoid the confusing issue of N-1, N, N+1 and simply count the number
> of terms in the rms sum. Not only does that give the correct result but
> IMHO it makes it clear what is being averaged. The code passes the official
> NBS ADEV sample suite, agrees with Bill's Stable32, is used in John's
> TimeLab, and also Mark's Lady Heather.
>
> I've never quite understood the pedantic separation of "sample" and
> "population" mean that statistics textbooks and academics love to discuss.
> They clearly have never measured oscillators. In my experience if you think
> there's an important difference between N and N-1, then that's nature's way
> of telling you to go back to sleep and wait until tomorrow when you have
> more data. If your N is too small your ADEV wanders all over the place
> (TimeLab is good at displaying this in real-time) -- meaning that the
> distinction between sample (n-1) and population (n) mean is beyond
> ridiculous; even if there's a "correct" textbook answer.
>
> /tvb

--
If you gaze long into an abyss, your coffee will get cold.


AK

Attila Kinali

Mon, Jan 9, 2017 6:18 PM

Good evening Magnus,

On Wed, 4 Jan 2017 22:13:04 +0100
Magnus Danielson <magnus@rubidium.dyndns.org> wrote:

> > My question is two-fold: Why is (n) being used even though it's known
> > to be a biased estimator? And why do people not use s when using (n-1)?
>
> First of all, you need to keep the number of phase samples (N) and the
> number of frequency samples (M) separate.
>
> As you differentiate the phase samples, you lose the phase bias from the
> samples, so the remaining degrees of freedom become one less. This is
> the same as the number of frequency samples, so any average will be over
> (N-1), which is the number of frequency samples M, so M=N-1 is motivated
> both ways.
>
> Now, as you do an Allan Deviation/Variance estimator, you take the second
> derivative, so the frequency bias also gets differentiated out, and
> another degree of freedom is lost, so as you average you have only M-1
> drift estimates, which is what you average over, or N-2.

My statistics are still pretty weak, but I think that the degrees of freedom,
as you use them here, do not matter.

The sums in the formulas in [1] and [2] run over (M-1) and (N-2) elements,
respectively. The sums are then divided by (M-1) and (N-2) as well.
This means we are in the case of σ, i.e. division by (n), and not by (n-1) as
would be the case for s.
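
For reference, here is how I read the two estimators, written out next to the textbook sample variance. This is only a sketch of the notation: \bar{y}_i are the M fractional frequency averages, x_i the N phase samples at spacing τ, and z_i (my own symbol, not from [1] or [2]) stands for n generic samples with sample mean \bar{z}.

```latex
% Allan variance from M frequency averages: M-1 terms, divided by M-1
\sigma_y^2(\tau) = \frac{1}{2(M-1)} \sum_{i=1}^{M-1} \left( \bar{y}_{i+1} - \bar{y}_i \right)^2

% Allan variance from N phase samples: N-2 terms, divided by N-2
\sigma_y^2(\tau) = \frac{1}{2(N-2)\,\tau^2} \sum_{i=1}^{N-2} \left( x_{i+2} - 2x_{i+1} + x_i \right)^2

% Textbook sample variance: n terms, divided by n-1
s^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( z_i - \bar{z} \right)^2
```

In the first two, the divisor equals the number of summed terms and no sample mean is subtracted inside the sum. In the third there are n terms, a subtracted sample mean, and a divisor of n-1.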

> The ADEV core function is just the square of the second derivative of phase,
> and then you do an ensemble average over those squares.

Yes.
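
As a minimal sketch of that description (not anyone's reference implementation), the following computes a non-overlapping Allan deviation from phase samples by squaring second differences and dividing by the number of terms actually summed. The function name `adev_from_phase` and its parameters are my own, chosen for illustration.

```python
import math


def adev_from_phase(x, tau0, m=1):
    """Non-overlapping Allan deviation at tau = m * tau0.

    x    -- phase (time error) samples in seconds, spaced tau0 apart
    tau0 -- sampling interval in seconds
    m    -- averaging factor (hypothetical parameter name)
    """
    xs = x[::m]            # keep every m-th sample, so spacing becomes tau
    tau = m * tau0
    # One second difference per triple of consecutive samples:
    # len(xs) samples give len(xs) - 2 terms.
    d2 = [xs[i + 2] - 2 * xs[i + 1] + xs[i] for i in range(len(xs) - 2)]
    if not d2:
        raise ValueError("need at least 3 phase samples at this tau")
    # Divide by the number of terms in the sum, whatever it turns out to be.
    avar = sum(d * d for d in d2) / (2 * len(d2) * tau * tau)
    return math.sqrt(avar)
```

A constant phase offset or a constant frequency offset (a linear phase ramp) gives zero second differences, so neither shows up in the result, e.g. `adev_from_phase([0.0, 1.0, 2.0, 3.0], 1.0)` is 0.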

> A hint for the use of s, consider the frequency stability. See Allan 1966.

I guess you are referring to [3]. Yes, Allan does give tables of the expected
difference in variance for some types of noise, but he does not explicitly
say why σ and not s is being used.

			Attila Kinali

[1] https://en.wikipedia.org/wiki/Allan_variance#Fixed_.CF.84_estimators

[2] "Handbook of Frequency Stability Analysis" NIST Special Pub 1065,
by W.J. Riley, 2008
http://tf.nist.gov/timefreq/general/pdf/2220.pdf

[3] "Statistics of Atomic Frequency Standards", by David Allan, 1966

--
Malek's Law:
Any simple idea will be worded in the most complicated way.


AK

Attila Kinali

Mon, Jan 9, 2017 6:28 PM

Hi Tom,

On Wed, 4 Jan 2017 16:26:22 -0800
"Tom Van Baak" <tvb@LeapSecond.com> wrote:

> I've never quite understood the pedantic separation of "sample" and
> "population" mean that statistic textbooks and academics love to discuss.

For me it's a matter of being exact. If there is one thing I've learned
in the last two years during my PhD, it is that we engineers are
far too often sloppy about our notation and about the assumptions
under which the models and formulas we use hold.

> They clearly have never measured oscillators. In my experience if you think
> there's an important difference between N and N-1, then that's nature's way
> of telling you to go back to sleep and wait until tomorrow when you have
> more data. If your N is too small your ADEV wanders all over the place
> (TimeLab is good at displaying this in real-time) -- meaning that the
> distinction between sample (n-1) and population (n) mean is beyond
> ridiculous; even if there's a "correct" textbook answer.

Not really. If you are looking at very long taus, then letting it run
for another day will not do. You'd have to run for weeks or months
to get enough samples. And no, overlapping ADEV does not necessarily
help: at the longer taus the dominating noise is no longer white phase
noise and thus has some long-term correlation. I.e. oADEV will not work
as expected, because two samples that are close in time will also be
strongly correlated with each other.
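
To make the point about sample counts concrete, here is a rough sketch of an overlapping estimator next to a term counter. The names `oadev_from_phase` and `term_counts` are mine, and the example numbers (1001 phase samples, averaging factor 100) are made up for illustration.

```python
import math


def oadev_from_phase(x, tau0, m):
    """Overlapping Allan deviation at tau = m * tau0 from phase samples x."""
    n_terms = len(x) - 2 * m          # every start index i is used
    if n_terms < 1:
        raise ValueError("not enough phase samples for this averaging factor")
    tau = m * tau0
    total = 0.0
    for i in range(n_terms):
        d = x[i + 2 * m] - 2 * x[i + m] + x[i]   # second difference at lag m
        total += d * d
    return math.sqrt(total / (2 * n_terms * tau * tau))


def term_counts(n, m):
    """Number of squared terms summed by each estimator for n phase samples."""
    non_overlapping = (n - 1) // m - 1   # second differences of x[::m]
    overlapping = n - 2 * m
    return non_overlapping, overlapping


print(term_counts(1001, 100))   # 9 non-overlapping terms vs 801 overlapping
```

The overlapping sum has many more terms, but neighbouring terms share almost all of their data, so they are far from independent. How much confidence is actually gained depends on the noise type, which is exactly the problem at long taus.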

I have not had a look at the formula used to estimate the error bars for
ADEV in TimeLab, but I wouldn't be surprised if it underestimates the error
for certain kinds of noise processes. There are a lot of assumptions about
the type of noise from atomic clocks, and I am no longer sure that even
the most basic one (that the noise is Gaussian, even for the 1/f^a noise)
holds in all cases.

			Attila Kinali

--
Malek's Law:
Any simple idea will be worded in the most complicated way.


SS

Scott Stobbe

Mon, Jan 9, 2017 6:41 PM

I could be wrong here, but it is my understanding that Allan's pioneering
work was a response to the need for a statistic that converges for 1/f
noise. The ordinary standard deviation does not converge for 1/f processes,
so I don't know that trying to compare the two is wise. Disclaimer: I could be
totally wrong; if someone has a better grasp of how the Allan deviation came
to be, please correct me.

On Wed, Jan 4, 2017 at 3:12 PM, Attila Kinali <attila@kinali.ch> wrote:

> Hi,
>
> A small detail caught my eye, when reading a paper that informally
> introduced ADEV. In statistics, when calculating a variance over
> a sample of a population the square-sum is divided by (n-1) (denoted by s in
> statistics) instead of (n) (denoted by σ) in order to account for a small
> bias the "standard" variance introduces
> (c.f. https://en.wikipedia.org/wiki/Unbiased_estimation_of_standard_deviation )
> In almost all literature I have seen, ADEV is defined using an average,
> i.e. dividing by (n) and very few use (n-1).
>
> My question is two-fold: Why is (n) being used even though it's known
> to be a biased estimator? And why do people not use s when using (n-1)?
>
>                      Attila Kinali
>
> --
> It is upon moral qualities that a society is ultimately founded. All
> the prosperity and technological sophistication in the world is of no
> use without that foundation.
> -- Miss Matheson, The Diamond Age, Neil Stephenson



AK

Attila Kinali

Mon, Jan 9, 2017 6:45 PM

On Mon, 9 Jan 2017 13:41:34 -0500
Scott Stobbe <scott.j.stobbe@gmail.com> wrote:

> I could be wrong here, but it is my understanding that Allan's pioneering
> work was a response to the need for a statistic that converges for 1/f
> noise. The ordinary standard deviation does not converge for 1/f processes,
> so I don't know that trying to compare the two is wise. Disclaimer: I could be
> totally wrong; if someone has a better grasp of how the Allan deviation came
> to be, please correct me.

Yes, this is basically where it all started from. [1, section 5.2.1] gives
a short summary of the problem.
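
A quick numerical sketch of the convergence problem, under a loudly stated assumption: I use random-walk frequency noise as a stand-in because it is trivial to generate, whereas proper flicker (1/f) noise needs a more elaborate generator. All function names are my own. The script prints the ordinary sample variance and the Allan variance at the basic tau for growing record lengths of the same series.

```python
import random


def random_walk_frequency(n, seed=1):
    """n fractional-frequency samples following a random walk (unit steps)."""
    rng = random.Random(seed)
    y, out = 0.0, []
    for _ in range(n):
        y += rng.gauss(0.0, 1.0)
        out.append(y)
    return out


def sample_variance(y):
    """Textbook sample variance s^2, dividing by n-1."""
    mean = sum(y) / len(y)
    return sum((v - mean) ** 2 for v in y) / (len(y) - 1)


def allan_variance_tau0(y):
    """Allan variance at the basic tau: half the mean squared first difference."""
    diffs = [b - a for a, b in zip(y, y[1:])]
    return sum(d * d for d in diffs) / (2 * len(diffs))


y = random_walk_frequency(100_000)
for n in (1_000, 10_000, 100_000):
    head = y[:n]
    print(n, sample_variance(head), allan_variance_tau0(head))
```

For this kind of noise the sample variance keeps growing with the record length, while the Allan variance settles near half the step variance. With the variance itself not converging, the choice between n and n-1 in the ordinary estimator is the smaller problem.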

		Attila Kinali

[1] "Handbook of Frequency Stability Analysis" NIST Special Pub 1065,
by W.J. Riley, 2008
http://tf.nist.gov/timefreq/general/pdf/2220.pdf

Malek's Law:
Any simple idea will be worded in the most complicated way.


time-nuts@lists.febo.com

σ vs s in ADEV
