time-nuts@lists.febo.com

Discussion of precise time and frequency measurement


Allan variance by sine-wave fitting

Ralph Devoe
Mon, Nov 27, 2017 5:33 AM

Here's a short reply to the comments of Bob, Attila, Magnus, and others.
Thanks for reading the paper carefully. I appreciate it. Some of the
comments are quite interesting; others seem off the mark. Let's start with
an interesting one:

The issue I intended to raise, but which I'm not sure I stated clearly
enough, is a conjecture: Is least-square fitting as efficient as any of the
other direct-digital or SDR techniques? Is the resolution of any
direct-digital system limited by (a) the effective number of bits of the
ADC and (b) the number of samples averaged? Thanks to Attila for reminding
me of the Sherman and Joerdens paper, which I had not read carefully
before. In their appendix Eq. A6 they derive a result which may or may not
be related to Eq. 6 in my paper. If the conjecture is true then the SDR
technique must be viewed as one of several equivalent algorithms for
estimating phase. Note that the time deviation for a single ADC channel in
the Sherman and Joerdens paper in Fig. 3c is about the same as my value.
This suggests that the conjecture is true.
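The simplest case of this equivalence can in fact be checked directly: when the frequency is known and an integer number of cycles is sampled, the sine and cosine basis functions are orthogonal, so the least-squares solution reduces exactly to the I/Q projections an SDR-style demodulator computes. A minimal sketch in Python/NumPy (all parameter values are illustrative, not taken from either paper):

```python
import numpy as np

# Illustrative parameters: a 1 MHz tone sampled at 64 MS/s for 4096 points,
# which is exactly 64 cycles, so the sin/cos columns are orthogonal.
f, fs, M = 1.0e6, 64.0e6, 4096
t = np.arange(M) / fs
w = 2 * np.pi * f
phi_true = 0.7                          # known test phase, radians
rng = np.random.default_rng(1)
y = np.sin(w * t + phi_true) + 0.01 * rng.standard_normal(M)

# (a) Least-squares fit of I*sin + Q*cos + offset
A = np.column_stack([np.sin(w * t), np.cos(w * t), np.ones(M)])
I_ls, Q_ls, _ = np.linalg.lstsq(A, y, rcond=None)[0]
phi_ls = np.arctan2(Q_ls, I_ls)

# (b) SDR-style I/Q demodulation: project onto the same quadratures
I_iq = 2.0 / M * np.dot(y, np.sin(w * t))
Q_iq = 2.0 / M * np.dot(y, np.cos(w * t))
phi_iq = np.arctan2(Q_iq, I_iq)

print(phi_ls, phi_iq)   # the two estimates agree to rounding error
```

With a non-integer number of cycles the basis is no longer orthogonal and the two estimates differ slightly, which is one reason the comparison needs care.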

Other criticisms seem off the mark:

Several people raised the question of the filter factor of the least-square
fit.  First, if there is a filtering bias due to the fit, it would be the
same for signal and reference channels and should cancel. Second, even if
there is a bias, it would have to fluctuate from second to second to cause
a frequency error. Third, the Monte Carlo results show no bias. The output
of the Monte Carlo system is the difference between the fit result and the
known MC input. Any fitting bias would show up in the difference, but there
is none.
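A bias check of this kind is straightforward to reproduce. The sketch below is my own reconstruction, not the paper's actual Monte Carlo code: it fits many noisy records with a known injected phase and inspects the mean of (fit result minus known input):

```python
import numpy as np

rng = np.random.default_rng(42)
f, fs, M = 1.0e6, 64.0e6, 4096          # illustrative values only
t = np.arange(M) / fs
w = 2 * np.pi * f
A = np.column_stack([np.sin(w * t), np.cos(w * t)])

errors = []
for _ in range(200):
    phi = rng.uniform(-np.pi, np.pi)    # known MC input phase
    y = np.sin(w * t + phi) + 0.1 * rng.standard_normal(M)
    I, Q = np.linalg.lstsq(A, y, rcond=None)[0]
    dphi = np.arctan2(Q, I) - phi       # fit result minus known input
    errors.append((dphi + np.pi) % (2 * np.pi) - np.pi)  # wrap to [-pi, pi)

errors = np.array(errors)
print(errors.mean(), errors.std())  # mean consistent with zero: no fitting bias
```

Note this only demonstrates the zero-bias property for additive white Gaussian noise, the case Attila comments on further down the thread.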

Attila says that I exaggerate the difficulty of programming an FPGA. Not
so. At work we give experts 1-6 months for a new FPGA design. We recently
ported some code from a Spartan 3 to a Spartan 6. Months of debugging
followed. FPGAs will always be faster and more computationally efficient
than Python, but Python is fast enough. The motivation for this experiment
was to use a high-level language (Python) and preexisting firmware and
software (Digilent) so that the device could be set up and reconfigured
easily, leaving more time to think about the important issues.

Attila has about a dozen criticisms of the theory section, mostly that it
is not rigorous enough and there are many assumptions. But it is not
intended to be rigorous. This is primarily an experimental paper and the
purpose of the theory is to give a simple physical picture of the
surprisingly good results. It does that, and the experimental results
support the conjecture above.
The limitations of the theory are discussed in detail on p. 6 where it is
called "... a convenient approximation." Despite this, the theory agrees
with the Monte Carlo over most of parameter space, and where it does not is
discussed in the text.

About units: I'm a physicist and normally use c.g.s units for
electromagnetic calculations. The paper was submitted to Rev. Sci. Instr.
which is an APS journal. The APS has no restrictions on units at all.
Obviously for clarity I should put them in SI units when possible.

Ralph
KM6IYN

Bob kb8tq
Mon, Nov 27, 2017 3:05 PM

Hi

On Nov 27, 2017, at 12:33 AM, Ralph Devoe rgdevoe@gmail.com wrote:

Here's a short reply to the comments of Bob, Attila, Magnus, and others.
Thanks for reading the paper carefully. I appreciate it. Some of the
comments are quite interesting; others seem off the mark. Let's start with
an interesting one:

The issue I intended to raise, but which I'm not sure I stated clearly
enough, is a conjecture: Is least-square fitting as efficient as any of the
other direct-digital or SDR techniques? Is the resolution of any
direct-digital system limited by (a) the effective number of bits of the
ADC and (b) the number of samples averaged? Thanks to Attila for reminding
me of the Sherman and Joerdens paper, which I had not read carefully
before. In their appendix Eq. A6 they derive a result which may or may not
be related to Eq. 6 in my paper. If the conjecture is true then the SDR
technique must be viewed as one of several equivalent algorithms for
estimating phase. Note that the time deviation for a single ADC channel in
the Sherman and Joerdens paper in Fig. 3c is about the same as my value.
This suggests that the conjecture is true.

Other criticisms seem off the mark:

Several people raised the question of the filter factor of the least-square
fit.  First, if there is a filtering bias due to the fit, it would be the
same for signal and reference channels and should cancel.

Errr … no.

There are earlier posts about this on the list. The objective of ADEV is to capture
noise. Any filtering process rejects noise. That is true in DMTD and all the other approaches.
Presentations made in papers since the 1970’s demonstrate that it very much does
not cancel out or drop out. It impacts the number you get for ADEV. You have thrown away
part of what you set out to measure.

Yes, ADEV is a bit fussy in this regard. Many of the other “DEV” measurements are also
fussy. This is at the heart of why many counters (when they estimate frequency) can not
be used directly for ADEV. Any technique that is proposed for ADEV needs to be analyzed.

The point here is not that filtering makes the measurement invalid. The point is that the
filter’s impact needs to be evaluated and stated. That is the key part of the proposed technique
that is missing at this point.

Bob

Second, even if
there is a bias, it would have to fluctuate from second to second to cause
a frequency error. Third, the Monte Carlo results show no bias. The output
of the Monte Carlo system is the difference between the fit result and the
known MC input. Any fitting bias would show up in the difference, but there
is none.

Attila says that I exaggerate the difficulty of programming an FPGA. Not
so. At work we give experts 1-6 months for a new FPGA design. We recently
ported some code from a Spartan 3 to a Spartan 6. Months of debugging
followed. FPGAs will always be faster and more computationally efficient
than Python, but Python is fast enough. The motivation for this experiment
was to use a high-level language (Python) and preexisting firmware and
software (Digilent) so that the device could be set up and reconfigured
easily, leaving more time to think about the important issues.

Attila has about a dozen criticisms of the theory section, mostly that it
is not rigorous enough and there are many assumptions. But it is not
intended to be rigorous. This is primarily an experimental paper and the
purpose of the theory is to give a simple physical picture of the
surprisingly good results. It does that, and the experimental results
support the conjecture above.
The limitations of the theory are discussed in detail on p. 6 where it is
called "... a convenient approximation." Despite this, the theory agrees
with the Monte Carlo over most of parameter space, and where it does not is
discussed in the text.

About units: I'm a physicist and normally use c.g.s units for
electromagnetic calculations. The paper was submitted to Rev. Sci. Instr.
which is an APS journal. The APS has no restrictions on units at all.
Obviously for clarity I should put them in SI units when possible.

Ralph
KM6IYN


time-nuts mailing list -- time-nuts@febo.com
To unsubscribe, go to https://www.febo.com/cgi-bin/mailman/listinfo/time-nuts
and follow the instructions there.

Magnus Danielson
Mon, Nov 27, 2017 4:02 PM

Hi,

On 11/27/2017 04:05 PM, Bob kb8tq wrote:

Hi

On Nov 27, 2017, at 12:33 AM, Ralph Devoe rgdevoe@gmail.com wrote:

Here's a short reply to the comments of Bob, Attila, Magnus, and others.
Thanks for reading the paper carefully. I appreciate it. Some of the
comments are quite interesting; others seem off the mark. Let's start with
an interesting one:

The issue I intended to raise, but which I'm not sure I stated clearly
enough, is a conjecture: Is least-square fitting as efficient as any of the
other direct-digital or SDR techniques? Is the resolution of any
direct-digital system limited by (a) the effective number of bits of the
ADC and (b) the number of samples averaged? Thanks to Attila for reminding
me of the Sherman and Joerdens paper, which I had not read carefully
before. In their appendix Eq. A6 they derive a result which may or may not
be related to Eq. 6 in my paper. If the conjecture is true then the SDR
technique must be viewed as one of several equivalent algorithms for
estimating phase. Note that the time deviation for a single ADC channel in
the Sherman and Joerdens paper in Fig. 3c is about the same as my value.
This suggests that the conjecture is true.

Other criticisms seem off the mark:

Several people raised the question of the filter factor of the least-square
fit.  First, if there is a filtering bias due to the fit, it would be the
same for signal and reference channels and should cancel.

Errr … no.

There are earlier posts about this on the list. The objective of ADEV is to capture
noise. Any filtering process rejects noise. That is true in DMTD and all the other approaches.
Presentations made in papers since the 1970’s demonstrate that it very much does
not cancel out or drop out. It impacts the number you get for ADEV. You have thrown away
part of what you set out to measure.

It's obvious already in David Allan's 1966 paper.
It's been verified and "re-discovered" a number of times.

You should re-read what I wrote, as it gives you the basic hints you
should be listening to.

Yes, ADEV is a bit fussy in this regard. Many of the other “DEV” measurements are also
fussy. This is at the heart of why many counters (when they estimate frequency) can not
be used directly for ADEV. Any technique that is proposed for ADEV needs to be analyzed.

For me it's not fuzzy; rather, the things I know about these and
their coloring are one thing, and the things I think are fuzzy are the
stuff I haven't published articles on yet.

The point here is not that filtering makes the measurement invalid. The point is that the
filter’s impact needs to be evaluated and stated. That is the key part of the proposed technique
that is missing at this point.

The traditional analysis is that the bandwidth derives from the Nyquist
frequency of sampling, as expressed in David's own words when I discussed
it with him last year: "We had to, since that was the counters we had".

Staffan Johansson of Philips/Fluke/Pendulum wrote a paper on using
linear regression, which is just another name for least-squares fitting,
for frequency estimation and its use in ADEV measurements.

Now, Prof. Enrico Rubiola realized that something was fishy, and it
indeed is: the fixed-tau pre-filtering that linear regression /
least squares achieves colors the low-tau measurements, but not the
high-tau measurements. This is because the frequency sensitivity of
high-tau ADEV lies so completely within the passband of the pre-filter
that it does not care, but at low tau the pre-filtering dominates and
produces lower values than it should, a biasing effect.
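This low-tau biasing is easy to demonstrate numerically. The sketch below is my own illustration, using white FM noise (where the effect is clean) and a fixed-length moving average as a stand-in for the fixed-tau pre-filter of a least-squares fit:

```python
import numpy as np

def adev(x, m, tau0=1.0):
    # Overlapping Allan deviation of phase data x at tau = m * tau0
    d = x[2*m:] - 2*x[m:-m] + x[:-2*m]
    return np.sqrt(0.5 * np.mean(d**2)) / (m * tau0)

rng = np.random.default_rng(0)
# White FM noise: the phase record is a random walk
x = np.cumsum(rng.standard_normal(200_000))

# Fixed-length pre-filter (moving average of n samples)
n = 16
xf = np.convolve(x, np.ones(n) / n, mode='valid')

for m in (1, 4, 16, 64, 256):
    print(m, adev(x, m), adev(xf, m))
# Small m (tau inside the filter): ADEV is biased well below the true value.
# m >> n: the filter sits inside the ADEV passband and the bias disappears.
```

The same experiment with white PM noise shows the bias persisting at all tau, which is exactly why the measurement bandwidth has to be stated.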

He also realized that the dynamic filter of MDEV, where the filter
changes with tau, would be interesting, and that is how he came up
with the parabolic deviation, PDEV.

Now, the old wisdom is that you need to publish the bandwidth of the
pre-filtering of the channel, or else the noise estimation will not be
proper.

Look at the Allan Deviation Wikipedia article for a first discussion on
bias functions, they are all aspects of biasing of various forms of
processing.

The lesson to be learned here is that there are a number of different
ways you can bias your measurements such that your ADEV values will
no longer be "valid" compared to correctly performed ADEV, and thus the
ability to compare them to judge noise levels and goodness values is lost.

I know it is a bit much to take in at first, but trust me that this is
important stuff. So be careful about wielding "off the mark": this is the
stuff that you need to be careful about, that we kindly try to advise you
on, and you should take the lesson while it's free.

Cheers,
Magnus

Bob

Second, even if
there is a bias, it would have to fluctuate from second to second to cause
a frequency error. Third, the Monte Carlo results show no bias. The output
of the Monte Carlo system is the difference between the fit result and the
known MC input. Any fitting bias would show up in the difference, but there
is none.

Attila says that I exaggerate the difficulty of programming an FPGA. Not
so. At work we give experts 1-6 months for a new FPGA design. We recently
ported some code from a Spartan 3 to a Spartan 6. Months of debugging
followed. FPGAs will always be faster and more computationally efficient
than Python, but Python is fast enough. The motivation for this experiment
was to use a high-level language (Python) and preexisting firmware and
software (Digilent) so that the device could be set up and reconfigured
easily, leaving more time to think about the important issues.

Attila has about a dozen criticisms of the theory section, mostly that it
is not rigorous enough and there are many assumptions. But it is not
intended to be rigorous. This is primarily an experimental paper and the
purpose of the theory is to give a simple physical picture of the
surprisingly good results. It does that, and the experimental results
support the conjecture above.
The limitations of the theory are discussed in detail on p. 6 where it is
called "... a convenient approximation." Despite this, the theory agrees
with the Monte Carlo over most of parameter space, and where it does not is
discussed in the text.

About units: I'm a physicist and normally use c.g.s units for
electromagnetic calculations. The paper was submitted to Rev. Sci. Instr.
which is an APS journal. The APS has no restrictions on units at all.
Obviously for clarity I should put them in SI units when possible.

Ralph
KM6IYN



Attila Kinali
Mon, Nov 27, 2017 6:37 PM

Moin Ralph,

On Sun, 26 Nov 2017 21:33:03 -0800
Ralph Devoe rgdevoe@gmail.com wrote:

The issue I intended to raise, but which I'm not sure I stated clearly
enough, is a conjecture: Is least-square fitting as efficient as any of the
other direct-digital or SDR techniques?

You stated that, yes, but it's well hidden in the paper.

Is the resolution of any
direct-digital system limited by (a) the effective number of bits of the
ADC and (b) the number of samples averaged? Thanks to Attila for reminding
me of the Sherman and Joerdens paper, which I have not read carefully
before. In their appendix Eq. A6 they derive a result which may or may not
be related to Eq. 6 in my paper.

They are related, but only accidentally. S&J derive a lower bound for the
Allan variance from the SNR. You try to derive the lower bound
for the Allan variance from the quantization noise. That you end up
with similar-looking formulas comes from the fact that both methods
have a scaling in 1/sqrt(X), where X is the number of samples taken,
though S&J use the number of phase estimates, while you use the
number of ADC samples. While related, they are not the same.
And you both have a scaling of 1/(2*pi*f) to get from phase to time.
You will notice that your formula contains a 2^N term, with N being
the number of bits, which you derive from the SNR (ENOB).
It's easy to show that the SNR due to quantization noise
is proportional to the size of an LSB, i.e. SNR ~ 2^N. If we now put
in all the variables and substitute 2^N by the SNR, we see:

S&J:   sigma >= 1/(2*pi*f) * sqrt(2/(SNR*N_sample))   (note the inequality!)
Yours: sigma ~= 1/(2*pi*f) * 1/SNR * sqrt(1/M)        (up to a constant)

Note three differences:

  1. S&J scales with 1/sqrt(SNR) while yours scales with 1/SNR
  2. S&J have a tau dependence implicit in the formula due to N_sample, you do not.
  3. S&J is a lower bound, yours an approximation (or claims to be).
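The scaling in the second formula can be checked by simulation. For a least-squares fit of a sine of amplitude A in additive white noise of deviation sigma over M samples, the standard result is a phase error of about sigma*sqrt(2/M)/A, linear in the inverse amplitude SNR and in 1/sqrt(M). A quick sketch (illustrative values, my own construction):

```python
import numpy as np

rng = np.random.default_rng(7)
f, fs, M = 1.0e6, 64.0e6, 4096          # illustrative values
t = np.arange(M) / fs
w = 2 * np.pi * f
B = np.column_stack([np.sin(w * t), np.cos(w * t)])

def phase_err_std(sigma, trials=400):
    # Std of the fitted phase over many records; true phase is zero
    errs = []
    for _ in range(trials):
        y = np.sin(w * t) + sigma * rng.standard_normal(M)
        I, Q = np.linalg.lstsq(B, y, rcond=None)[0]
        errs.append(np.arctan2(Q, I))
    return np.std(errs)

results = {}
for sigma in (0.05, 0.10):
    results[sigma] = phase_err_std(sigma)
    print(sigma, results[sigma], sigma * np.sqrt(2 / M))
# Measured phase-error deviation tracks the sigma*sqrt(2/M) prediction,
# and doubles when sigma doubles.
```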

If the conjecture is true then the SDR
technique must be viewed as one several equivalent algorithms for
estimating phase. Note that the time deviation for a single ADC channel in
the Sherman and Joerdens paper in Fig. 3c is about the same as my value.
This suggests that the conjecture is true.

Yes, you get to similar values if you extrapolate from the TDEV
data in S&J Fig. 3c down to the 40µs that you used. BUT: while S&J see
a decrease of the TDEV consistent with white phase noise until they
hit the flicker phase noise floor at about a tau of 1ms, your data
does not show such a decrease (or at least I didn't see it).

Other criticisms seem off the mark:

Several people raised the question of the filter factor of the least-square
fit.  First, if there is a filtering bias due to the fit, it would be the
same for signal and reference channels and should cancel. Second, even if
there is a bias, it would have to fluctuate from second to second to cause
a frequency error.

Bob answered that already, and I am pretty sure that Magnus will comment
on it as well. Both are better suited than me to go into the details of this.

Third, the Monte Carlo results show no bias. The output
of the Monte Carlo system is the difference between the fit result and the
known MC input. Any fitting bias would show up in the difference, but there
is none.

Sorry, but this is simply not the case. If I understood your simulations
correctly (you give very little information about them), you used additive
Gaussian i.i.d. noise on top of the signal. Of course, if you add Gaussian
i.i.d. noise with zero mean, you will get zero bias in a linear least-squares
fit. But, as Magnus and I have tried to tell you, the noises we see in this
area are not necessarily Gaussian i.i.d. Only white phase noise is Gaussian
i.i.d. Most of the techniques we use in statistics implicitly assume it.
To show you that things fail in quite an interesting way, assume this:

X(t): Random variable, Gauss distributed, zero mean, i.i.d (ie PSD = const)
Y(t): Random variable, Gauss distributed, zero mean, PSD ~ 1/f
Two time points: t_0 and t, where t > t_0

Then:

E[X(t) | X(t_0)] = 0
E[Y(t) | Y(t_0)] = Y(t_0)

I.e. the expectation of X will be zero, no matter whether you know any
earlier sample of the random variable. But for Y, the expectation is biased
toward the last sample you have seen, i.e. it is NOT zero for any t > t_0.
A consequence of this is that, if you take a number of samples, the average
will not approach zero in the limit of the number of samples going to
infinity. (For details see the theory of fractional Brownian motion,
especially the papers by Mandelbrot and his colleagues.)
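
This memory effect can be sketched numerically: approximate noise with a
1/f PSD by frequency-domain shaping of white noise (an illustrative shaping
trick, not a rigorous fractional-Brownian-motion simulation), then compare
the correlation between a sample and a later sample across many
realizations. It vanishes for white noise but not for the 1/f case:

```python
import numpy as np

rng = np.random.default_rng(1)

def shaped_noise(n, alpha, rng):
    # Approximate noise with PSD ~ 1/f^alpha by shaping the spectrum of white noise
    spec = np.fft.rfft(rng.standard_normal(n))
    freqs = np.fft.rfftfreq(n)
    freqs[0] = freqs[1]                  # avoid division by zero at DC
    spec *= freqs ** (-alpha / 2.0)      # amplitude ~ f^(-alpha/2) => PSD ~ 1/f^alpha
    return np.fft.irfft(spec, n)

trials, n, t0, t1 = 2000, 1024, 100, 110
white = np.array([rng.standard_normal(n) for _ in range(trials)])
flick = np.array([shaped_noise(n, 1.0, rng) for _ in range(trials)])

# Correlation between the sample at t0 and a later sample at t1, over realizations:
r_white = np.corrcoef(white[:, t0], white[:, t1])[0, 1]
r_flick = np.corrcoef(flick[:, t0], flick[:, t1])[0, 1]
print(r_white, r_flick)   # white: ~0; 1/f: clearly positive
```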

A PSD ~ 1/f is flicker phase noise, which usually starts to become relevant
in our systems at sampling times between 1µs (for high-frequency stuff)
and 1-100s (high-stability oscillators and atomic clocks). Unfortunately,
the Allan deviation does not discern between white phase noise and
flicker phase noise, so it's not possible to see in your plots where
flicker noise becomes relevant (that's why we have MDEV).
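
For reference, the overlapping Allan deviation at tau = m*tau0 can be
computed from phase samples x_i as
sqrt(mean((x_{i+2m} - 2*x_{i+m} + x_i)^2) / (2*tau^2)). A minimal numpy
sketch (plain ADEV, not the MDEV recommended above; the noise level and
sample count are illustrative):

```python
import numpy as np

def oadev(x, tau0, m):
    # Overlapping Allan deviation at tau = m*tau0, from phase samples x (in seconds)
    x = np.asarray(x, dtype=float)
    d = x[2 * m:] - 2.0 * x[m:-m] + x[:-2 * m]   # second differences at stride m
    return np.sqrt(np.mean(d ** 2) / (2.0 * (m * tau0) ** 2))

# White phase noise: ADEV falls as 1/tau (here sigma_x = 1 ns, tau0 = 1 s)
rng = np.random.default_rng(2)
x = 1e-9 * rng.standard_normal(100_000)
for m in (1, 10, 100):
    print(m, oadev(x, 1.0, m))
```

For flicker phase noise the same 1/tau slope appears (up to a log factor),
which is exactly why ADEV alone cannot separate the two.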

Or, TL;DR: for measurements in our field, you cannot assume that the
noise you have is uncorrelated or has zero mean. If you do simulations,
you have to account for that properly (which is not an easy task).
Thus you also cannot assume that your estimator has no bias, because
the underlying assumption in this deduction is that the noise has
zero mean and averages out if you take a large number of samples.
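
To be fair, for purely white Gaussian noise the zero-bias statement does
hold, and it can be reproduced in a few lines. A minimal Monte Carlo sketch
of a linear least-squares sine fit at a known frequency (all parameters are
illustrative, and this is generic code, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 1000, 200
f0, amp, phi_true, sigma = 0.0123, 1.0, 0.7, 0.1   # all illustrative

t = np.arange(n)
# Linear least-squares design matrix at the known frequency f0:
# y ~ a*sin(2*pi*f0*t) + b*cos(2*pi*f0*t) + offset
A = np.column_stack([np.sin(2 * np.pi * f0 * t),
                     np.cos(2 * np.pi * f0 * t),
                     np.ones(n)])

errors = []
for _ in range(trials):
    y = amp * np.sin(2 * np.pi * f0 * t + phi_true) + sigma * rng.standard_normal(n)
    a, b, _ = np.linalg.lstsq(A, y, rcond=None)[0]
    errors.append(np.arctan2(b, a) - phi_true)   # recovered phase minus truth

print(np.mean(errors))   # ~0: no bias for zero-mean white Gaussian noise
```

With any of the power-law noises above substituted for the white noise, the
averaging argument behind this check no longer applies.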

And we have not yet touched the topic of higher order noises, with a PSD
that's proportional to 1/f^a with a>1.

Attila says that I exaggerate the difficulty of programming an FPGA. Not
so. At work we give experts 1-6 months for a new FPGA design. We recently
ported some code from a Spartan 3 to a Spartan 6. Months of debugging
followed.

This argument means that either your design was very complex, or it
used features of the Spartan 3 that are no longer present in the Spartan 6.
The argument does not say anything about the difficulty of writing
down-mixer and sub-sampling code (which takes less than a month,
including all validation, if you have no prior experience in signal
processing). Yes, it's still more complicated than calling a Python
function. But if you had to write that Python function yourself
(to make the comparison fair), then it would take you considerably
longer to make sure the fitting function worked correctly.
Using Python for the curve fitting is as if you got the VHDL code
for the whole signal-processing part from someone else. That's easy to
handle. Done in an afternoon. At most.

And just to further underline my point here: I have both written
VHDL code for FPGAs to down-mix and subsample, and done sine fitting
using the very same Python function you have used. I know the complexity
of both, as I know their pitfalls.

FPGAs will always be faster and more computationally efficient
than Python, but Python is fast enough. The motivation for this experiment
was to use a high-level language (Python) and preexisting firmware and
software (Digilent) so that the device could be set up and reconfigured
easily, leaving more time to think about the important issues.

Sure. This is fine. But please do not bash other techniques just because
you are not good at handling them. Especially if you hide the complexity
of your approach completely in a side remark. (Yes, that ticked me off.)

Attila has about a dozen criticisms of the theory section, mostly that it
is not rigorous enough and there are many assumptions. But it is not
intended to be rigorous.

If it is not intended as such, then you should make that clear in
the paper. Or put it in an appendix. Currently the theory takes up almost
4 of the 8 pages of your paper, so it looks like an important part.
And you should still make sure it is correct. Which it currently isn't.

This is primarily an experimental paper and the
purpose of the theory is to give a simple physical picture of the
surprisingly good results. It does that, and the experimental results
support the conjecture above.

Then you should have gone with a simple SNR-based formula like S&J's,
or referenced one of the many papers out there that do this kind of
calculation and just repeated the formula with a comment that the
derivation is in paper X.

The limitations of the theory are discussed in detail on p. 6 where it is
called "... a convenient approximation.." Despite this the theory agrees
with the Monte Carlo over most of parameter space, and where it does not is
discussed in the text.

Please! This is bad science! You build a theory on flawed foundations
and use this theory as the foundation of your simulations. And when the
simulations agree with your theory, you claim the theory is correct?
Please, do not do this!

Yes, it is OK to approximate. Yes, it is OK to make assumptions.
But please be aware of what the limits of those approximations and
assumptions are. I have tried to point out the flaws in your
argumentation and how they affect the validity of your paper.
If you just want to do an experimental paper, then the right
thing to do would be to cut out all the theory and concentrate
on the experiments.

I hope it comes across that I do not criticize your experiments
or the results you got out of them. I criticize the analysis you
have done, and that it contains assumptions, of which you are not
aware, that invalidate some of your results. The experiments are fine.
The precision you get is fine. But your analysis is flawed.

		Attila Kinali

--
It is upon moral qualities that a society is ultimately founded. All
the prosperity and technological sophistication in the world is of no
use without that foundation.
-- Miss Matheson, The Diamond Age, Neal Stephenson

AK
Attila Kinali
Mon, Nov 27, 2017 9:50 PM

On Mon, 27 Nov 2017 19:37:11 +0100
Attila Kinali attila@kinali.ch wrote:

X(t): Random variable, Gauss distributed, zero mean, i.i.d (ie PSD = const)
Y(t): Random variable, Gauss distributed, zero mean, PSD ~ 1/f
Two time points: t_0 and t, where t > t_0

Then:

E[X(t) | X(t_0)] = 0
E[Y(t) | Y(t_0)] = Y(t_0)

Ie. the expectation of X will be zero, no matter whether you know any sample
of the random variable. But for Y, the expectation is biased to the last
sample you have seen, ie it is NOT zero for anything where t>0.
A consequence of this is, that if you take a number of samples, the average
will not approach zero for the limit of the number of samples going to infinity.
(For details see the theory of fractional Brownian motion, especially
the papers by Mandelbrot and his colleagues)

To make the point a bit more clear: the above means that, for noise with
a PSD of the form 1/f^a with a >= 1 (i.e. flicker phase, white frequency
and flicker frequency noise), the noise (aka the random variable) is:

  1. Not independently distributed
  2. Not stationary
  3. Not ergodic

Where 1) means there is a correlation between samples, i.e. if you know a
sample, you can predict what the next one will be. 2) means that the
properties of the random variable change over time. Note this is a
stronger non-stationarity than the cyclostationarity that people in
signal theory and communication systems often assume when they allow
for non-stationary system characteristics. And 3) means that
if you take lots of samples from one random process, you will get a
different distribution than when you take lots of random processes
and take one sample from each. Ergodicity is often implicitly assumed
in a lot of analyses, without people being aware of it. It is one
of the things that a lot of random processes in nature adhere to,
and thus it is ingrained in our understanding of the world. But noise
processes in electronics, atomic clocks, fluid dynamics etc. are not
ergodic in general.

As a side note:

  1. holds true for a > 0 (i.e. anything but white noise).
    I am not yet sure when stationarity or ergodicity break, but my guess
    would be that both break at a = 1 (i.e. flicker noise). But that's only
    an assumption I have come to; I cannot prove or disprove it.

For 1 <= a < 3 (between flicker phase and flicker frequency, including flicker
phase, not including flicker frequency), the increments (ie the difference
between X(t) and X(t+1)) are stationary.

			Attila Kinali

--
May the bluebird of happiness twiddle your bits.

MR
Mattia Rizzi
Mon, Nov 27, 2017 10:04 PM

Hi,

To make the point a bit more clear. The above means that noise with
a PSD of the form 1/f^a for a>=1 (ie flicker phase, white frequency
and flicker frequency noise), the noise (aka random variable) is:

  1. Not independently distributed
  2. Not stationary
  3. Not ergodic

I think you got too deep into theory. If you follow the statistics
theory strictly, you get nowhere.
You can't even talk about a 1/f PSD, because the Fourier transform doesn't
converge for infinite-power signals.
In fact, you are not allowed to take one realization, make several FFTs,
and claim that that's the PSD of the process. But that's what the spectrum
analyzer does, because it's not a multiverse instrument.
Every experimentalist assumes ergodicity for this kind of noise; otherwise
you get nowhere.
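
What the spectrum analyzer effectively does can be sketched as averaging
windowed periodograms over segments of a single realization; a minimal
Welch-style sketch (segment count and window choice are arbitrary):

```python
import numpy as np

def welch_psd(x, nseg):
    # Average |FFT|^2 over non-overlapping, Hann-windowed segments of one realization
    seglen = len(x) // nseg
    win = np.hanning(seglen)
    norm = np.sum(win ** 2)              # window power normalization
    segs = [x[k * seglen:(k + 1) * seglen] * win for k in range(nseg)]
    return np.mean([np.abs(np.fft.rfft(s)) ** 2 / norm for s in segs], axis=0)

rng = np.random.default_rng(3)
psd = welch_psd(rng.standard_normal(64 * 1024), nseg=64)
print(psd[1:-1].mean())   # for unit-variance white noise: ~1, i.e. a flat PSD
```

Interpreting this average as "the" PSD of the process is exactly the
ergodicity assumption under discussion: time segments of one realization
stand in for the ensemble.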

cheers,
Mattia

MR
Mattia Rizzi
Mon, Nov 27, 2017 10:08 PM

I'm talking about flicker noise processes

2017-11-27 23:04 GMT+01:00 Mattia Rizzi mattia.rizzi@gmail.com:

Hi,

To make the point a bit more clear. The above means that noise with

a PSD of the form 1/f^a for a>=1 (ie flicker phase, white frequency
and flicker frequency noise), the noise (aka random variable) is:

  1. Not independently distributed
  2. Not stationary
  3. Not ergodic

I think you got too much in theory. If you follow striclty the statistics
theory, you get nowhere.
You can't even talk about 1/f PSD, because Fourier doesn't converge over
infinite power signals.
In fact, you are not allowed to take a realization, make several fft and
claim that that's the PSD of the process. But that's what the spectrum
analyzer does, because it's not a multiverse instrument.
Every experimentalist suppose ergodicity on this kind of noise, otherwise
you get nowhere.

cheers,
Mattia

2017-11-27 22:50 GMT+01:00 Attila Kinali attila@kinali.ch:

On Mon, 27 Nov 2017 19:37:11 +0100
Attila Kinali attila@kinali.ch wrote:

X(t): Random variable, Gauss distributed, zero mean, i.i.d (ie PSD =

const)

Y(t): Random variable, Gauss distributed, zero mean, PSD ~ 1/f
Two time points: t_0 and t, where t > t_0

Then:

E[X(t) | X(t_0)] = 0
E[Y(t) | Y(t_0)] = Y(t_0)

Ie. the expectation of X will be zero, no matter whether you know any

sample

of the random variable. But for Y, the expectation is biased to the last
sample you have seen, ie it is NOT zero for anything where t>0.
A consequence of this is, that if you take a number of samples, the

average

will not approach zero for the limit of the number of samples going to

infinity.

(For details see the theory of fractional Brownian motion, especially
the papers by Mandelbrot and his colleagues)

To make the point a bit more clear. The above means that noise with
a PSD of the form 1/f^a for a>=1 (ie flicker phase, white frequency
and flicker frequency noise), the noise (aka random variable) is:

  1. Not independently distributed
  2. Not stationary
  3. Not ergodic

Where 1) means there is a correlation between samples, ie if you know a
sample, you can predict what the next one will be. 2) means that the
properties of the random variable change over time. Note this is a
stronger non-stationary than the cyclostationarity that people in
signal theory and communication systems often assume, when they go
for non-stationary system characteristics. And 3) means that
if you take lots of samples from one random process, you will get a
different distribution than when you take lots of random processes
and take one sample each. Ergodicity is often implicitly assumed
in a lot of analysis, without people being aware of it. It is one
of the things that a lot of random processes in nature adhere to
and thus is ingrained in our understanding of the world. But noise
process in electronics, atomic clocks, fluid dynamics etc are not
ergodic in general.

As a sidenote: 1) holds true for a > 0 (i.e. anything but white noise).
I am not yet sure where stationarity or ergodicity break, but my guess
is that both break at a = 1 (i.e. flicker noise). That is only an
assumption I have come to; I cannot prove or disprove it.

For 1 <= a < 3 (between flicker phase and flicker frequency, including
flicker phase, not including flicker frequency), the increments (i.e. the
difference between X(t) and X(t+1)) are stationary.

                             Attila Kinali

--
May the bluebird of happiness twiddle your bits.


time-nuts mailing list -- time-nuts@febo.com
To unsubscribe, go to https://www.febo.com/cgi-bin/mailman/listinfo/time-nuts
and follow the instructions there.

I'm talking about flicker noise processes.

2017-11-27 23:04 GMT+01:00 Mattia Rizzi <mattia.rizzi@gmail.com>:
> [...]
MD
Magnus Danielson
Mon, Nov 27, 2017 10:45 PM

Hi,

On 11/27/2017 07:37 PM, Attila Kinali wrote:

Moin Ralph,

On Sun, 26 Nov 2017 21:33:03 -0800
Ralph Devoe rgdevoe@gmail.com wrote:

The issue I intended to raise, but which I'm not sure I stated clearly
enough, is a conjecture: Is least-square fitting as efficient as any of the
other direct-digital or SDR techniques?

You stated that, yes, but it's well hidden in the paper.

Least-square fitting done right is very efficient.

A good comparison would illustrate that, but it is also expected. What
does differ is how well adapted the different approaches are.

If the conjecture is true then the SDR
technique must be viewed as one of several equivalent algorithms for
estimating phase. Note that the time deviation for a single ADC channel in
the Sherman and Joerdens paper in Fig. 3c is about the same as my value.
This suggests that the conjecture is true.

Yes, you get to similar values, if you extrapolate from the TDEV
data in S&J Fig3c down to 40µs that you used. BUT: while S&J see
a decrease of the TDEV consistent with white phase noise until they
hit the flicker phase noise floor at about a tau of 1ms, your data
does not show such a decrease (or at least I didn't see it).

There are a number of ways to do this.
There are even a number of ways that least-squares processing can be applied.

The trouble with least-squares estimators is that you do not maintain the
improvement for longer taus, and the paper's PDEV estimator does not
either. That motivated me to develop a decimator method for phase,
frequency and PDEV that extends in post-processing, which I presented
last year.

Other criticisms seem off the mark:

Several people raised the question of the filter factor of the least-square
fit.  First, if there is a filtering bias due to the fit, it would be the
same for signal and reference channels and should cancel. Second, even if
there is a bias, it would have to fluctuate from second to second to cause
a frequency error.

Bob answered that already, and I am pretty sure that Magnus will comment
on it as well. Both are better suited than me to go into the details of this.

Yes, see my comment.

A least-squares estimator for phase and frequency applies a linear-ramp
weighting to phase samples, or a parabolic-curve weighting to frequency
samples. These act as filters, and the bandwidth of the filter depends on
the sample count and the time between samples. As the sample count
increases, the bandwidth goes way down.
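To make the linear-ramp weighting concrete: for a least-squares fit of equally spaced phase samples x_i = a + b*t_i, the frequency estimate b is a fixed linear combination of the samples, and its weights form a ramp whose white-noise variance gain shrinks roughly as 1/N^3. A minimal numpy sketch (my own illustration, not code from the paper under discussion):

```python
import numpy as np

def slope_weights(n):
    # Weights the least-squares frequency (slope) estimate applies to
    # n equally spaced phase samples: w_i = (t_i - tbar) / sum((t_j - tbar)^2)
    t = np.arange(n, dtype=float)
    d = t - t.mean()
    return d / (d @ d)

w = slope_weights(8)
print(w)                    # a linear ramp through zero
print(w.sum())              # 0: insensitive to a constant phase offset
print(w @ np.arange(8.0))   # 1: unit gain for a frequency offset

# White-phase-noise variance gain is sum(w^2) = 12 / (n (n^2 - 1)),
# i.e. it falls off like 1/n^3 — the filter bandwidth collapses with n.
for n in (10, 100, 1000):
    print(n, (slope_weights(n) ** 2).sum())
```

This is why the least-squares fit behaves as a narrow filter on the phase data: more samples mean a steeper ramp normalization and a much smaller equivalent noise bandwidth.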

Third, the Monte Carlo results show no bias. The output
of the Monte Carlo system is the difference between the fit result and the
known MC input. Any fitting bias would show up in the difference, but there
is none.

Sorry, but this is simply not the case. If I understood your simulations
correctly (you give very little information about them), you used additive
Gaussian i.i.d noise on top of the signal. Of course, if you add Gaussian
i.i.d noise with zero mean, you will get zero bias in a linear least squares
fit. But, as Magnus and I have tried to tell you, noises we see in this area
are not necessarily Gauss i.i.d. Only white phase noise is Gauss i.i.d.
Most of the techniques we use in statistics implicitly assume Gauss i.i.d.

Go back to the IEEE special issue on time and frequency from February
1966 and you will find a nice set of articles. Among them is David
Allan's article on the 2-sample variance that later became Allan's
variance and is now the Allan variance. Another article is the short but
classic write-up of another youngster, David Leeson, which summarizes a
model for phase noise generation that we today refer to as the Leeson
model. To appreciate the Leeson model more deeply, check out the
phase-noise book of Enrico Rubiola, which gives you some insight. If you
want to make designs, there is more to it, so several other papers need
to be read, but here you just need to understand that you get 3 or 4
types of noise out of an oscillator, and the trouble with them is that
the noise does not converge the way your normal textbook on statistics
would make you assume. The RMS estimator of your frequency does not
converge; in fact it goes astray and varies with the number of samples.
This was already a known problem, but the solution came with Dave
Allan's paper. It in fact includes a function we would later refer to as
a bias function, which depends on the number of samples taken. This
motivates the conversion from an M-sample variance to a 2-sample
variance and from an N-sample variance to a 2-sample variance, such that
they can be compared. The bias function varies with the number of
samples and the dominant noise form.
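The non-convergence is easy to reproduce. A hedged numpy sketch (my own illustration; I use random-walk FM, the simplest divergent noise type, rather than flicker): the ordinary sample variance of fractional frequency keeps growing as you take more samples, while the 2-sample (Allan) variance settles at a fixed value.

```python
import numpy as np

rng = np.random.default_rng(1)

# Fractional frequency modeled as a random walk (random-walk FM).
y = np.cumsum(rng.standard_normal(1_000_000))

def avar1(y):
    # 2-sample (Allan) variance at tau = one sample interval
    return 0.5 * np.mean(np.diff(y) ** 2)

for n in (100, 10_000, 1_000_000):
    # The N-sample variance grows with N; the Allan variance stays ~0.5
    # (the variance of the underlying white frequency steps, halved and
    # doubled back by the 2-sample definition).
    print(n, np.var(y[:n]), avar1(y[:n]))
```

The classical variance column explodes with record length — it never converges to anything — while the 2-sample column is stable, which is exactly what motivated Allan's definition.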

The noise forms are strange, and their effect on statistics is strange.
You need to understand how they interact with your measurement tool; in
the end you need to test all the noise forms.

Attila says that I exaggerate the difficulty of programming an FPGA. Not
so. At work we give experts 1-6 months for a new FPGA design. We recently
ported some code from a Spartan 3 to a Spartan 6. Months of debugging
followed.

This argument means that either your design was very complex, or it
used features of the Spartan 3 that are not present in the Spartan 6
anymore. The argument does not say anything about the difficulty of
writing down-mixer and sub-sampling code (which takes less than a month,
including all validation, if you have no prior experience in signal
processing). Yes, it's still more complicated than calling a Python
function. But if you had to write that Python function yourself
(to make the comparison fair), then it would take you considerably
longer to make sure the fitting function worked correctly.
Using Python for the curve fitting is as if you got the VHDL code
for the whole signal-processing part from someone. That's easy to handle.
Done in an afternoon. At most.

And just to further underline my point here: I have both written
VHDL code for FPGAs to down mix and subsample and done sine fitting
using the very same Python function you have used. I know the complexity
of both, as I know their pitfalls.

Another sample point: one of my designs was converted to Virtex 7. We
lost a week not because the design was broken, but because the library
dependencies were a bit old, so synthesis got things wrong. Once we
realized that, it was trivial to fix and it works. VHDL for FPGAs has
good uses, and some of them will never really be matched by software. I
do both, as they have different benefits. There are good and bad ways of
designing things for an FPGA or as software; it takes skill to design
things in a portable way. It also takes skill and time to port a design
to make the most of a new chip or CPU.

FPGAs will always be faster and more computationally efficient
than Python, but Python is fast enough. The motivation for this experiment
was to use a high-level language (Python) and preexisting firmware and
software (Digilent) so that the device could be set up and reconfigured
easily, leaving more time to think about the important issues.

Sure. This is fine. But please do not bash other techniques just because
you are not good at handling them. Especially if you hide the complexity
of your approach completely in a side remark. (Yes, that ticked me off)

Indeed. This is not a good way to convince anyone. If you don't work
well with certain tools, either don't use them or learn to understand
how to use them. I tend to do the latter, so I learn.

The trick is that you want to use the benefit of both to achieve that
extreme performance, and when you do it right, you use their strengths
together without the actual design being very complex. That's the beauty
of good designs.

Attila has about a dozen criticisms of the theory section, mostly that it
is not rigorous enough and there are many assumptions. But it is not
intended to be rigorous.

If it is not intended as such, then you should make it clear in
the paper. Or put it in an appendix. Currently the theory is almost 4 of
the 8 pages of your paper. So it looks like an important part.
And you still should make sure it is correct. Which currently it isn't.

This is primarily an experimental paper and the
purpose of the theory is to give a simple physical picture of the
surprisingly good results. It does that, and the experimental results
support the conjecture above.

Then you should have gone with a simple SNR based formula like S&J
or referenced one of the many papers out there that do this kind of
calculations and just repeated the formula with a comment that the
derivation is in paper X.

Break it up and focus on what is important in separate papers.

The limitations of the theory are discussed in detail on p. 6 where it is
called "... a convenient approximation.." Despite this the theory agrees
with the Monte Carlo over most of parameter space, and where it does not is
discussed in the text.

Please! This is bad science! You build a theory on flawed foundations,
use this theory as a foundation in your simulations. And when the
simulations agree with your theory you claim the theory is correct?
Please, do not do this!

Yes, it is ok to approximate. Yes, it is ok to make assumptions.
But please be aware what the limits of those approximations and
assumptions are. I have tried to point out the flaws in your
argumentation and how they affect the validity of your paper.
If you just want to do an experimental paper, then the right
thing to do would be to cut out all the theory and concentrate
on the experiments.

Agree fully.

Please remember that while Attila, Bob and I may be critical, we do try
to make you aware of relevant aspects you need to consider.

I have tried to hint that it would be useful to see how different
estimator methods perform. The type of dominant noise for a certain tau
is relevant, and is how we have been forced to analyze things for the
last 50 years because of the problems we try to indicate.

I hope it comes across that I do not criticize your experiments
or the results you got out of them. I criticize the analysis you
have done: it contains assumptions, which you are not aware of,
that invalidate some of your results. The experiments are fine.
The precision you get is fine. But your analysis is flawed.

We are doing the friendly pre-review. A real reviewer could easily just
say "Not worthy to publish", as I have seen.

There is nothing wrong with attempting new approaches, or even just
testing an idea to see how it pans out. But you should then compare it
to a number of other approaches, and as you test things, you should
analyze the same data with different methods. Prototyping that in Python
is fine, but in order to analyze it, you need to be careful about the
details.

I would consider one paper just doing the measurements and then trying
different kinds of post-processing to see how those vary.
Another paper then takes up on that and attempts an analysis that
matches the numbers from actual measurements.

So, we might provide tough love, but there is a bit of experience behind
it, so it should be listened to carefully.

Cheers,
Magnus

AK
Attila Kinali
Mon, Nov 27, 2017 10:50 PM

Hoi Mattia,

On Mon, 27 Nov 2017 23:04:56 +0100
Mattia Rizzi mattia.rizzi@gmail.com wrote:

To make the point a bit more clear. The above means that noise with
a PSD of the form 1/f^a for a>=1 (ie flicker phase, white frequency
and flicker frequency noise), the noise (aka random variable) is:

  1. Not independently distributed
  2. Not stationary
  3. Not ergodic

I think you got too much into theory. If you follow the statistics
theory strictly, you get nowhere.
You can't even talk about a 1/f PSD, because the Fourier transform
doesn't converge for infinite-power signals.

This is true. But then the Fourier transform integrates time from
minus infinity to plus infinity, which isn't exactly realistic either.
The power in 1/f noise is actually limited by the age of the universe,
and quite strictly so. The power you have in 1/f noise is the same for
every decade in frequency (or time) you go. The age of the universe is
about 1e10 years, which is roughly 3e17 seconds, i.e. 17 decades of
possible noise. If we assume something like a 1k carbon resistor, you
get something around 1e-17 W per decade of noise power (a guesstimate,
not an exact calculation). That means that resistor, had it been around
ever since the universe was created, would have converted 17*1e-17 =
2e-16 W of heat into electrical energy, on average, over the whole
lifetime of the universe. That's not much :-)
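The back-of-envelope numbers check out. A quick sketch (the 1 Hz upper corner is my own assumption, purely illustrative):

```python
import math

# Lowest accessible frequency is set by the observation time,
# here taken as the age of the universe.
age_seconds = 1e10 * 365.25 * 24 * 3600   # ~3.2e17 s
f_low = 1 / age_seconds
f_high = 1.0                              # assumed 1 Hz upper corner

# 1/f noise carries equal power per decade, so count the decades:
decades = math.log10(f_high / f_low)
print(decades)                            # ~17.5 decades of 1/f noise

# At ~1e-17 W per decade (the guesstimate above), the total average power:
total_power = decades * 1e-17
print(total_power)                        # ~2e-16 W
```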

In fact, you are not allowed to take one realization, compute several
FFTs, and claim that that's the PSD of the process. But that's what a
spectrum analyzer does, because it's not a multiverse instrument.

Well, any measurement is an estimate.

Every experimentalist supposes ergodicity on this kind of noise;
otherwise you get nowhere.

Err... no. Even if you assume that the spectrum tops off at some very
low frequency and does not increase anymore, i.e. that there is a finite
limit to the noise power, even then ergodicity is not given.
Ergodicity breaks because the noise process is not stationary,
and assuming it for any kind of 1/f noise would be wrong.

		Attila Kinali

--
<JaberWorky> The bad part of Zurich is where the degenerates
throw DARK chocolate at you.

J
jimlux
Mon, Nov 27, 2017 11:03 PM

On 11/27/17 2:45 PM, Magnus Danielson wrote:

There is nothing wrong with attempting new approaches, or even just
testing an idea to see how it pans out. But you should then compare it to a
number of other approaches, and as you test things, you should analyze
the same data with different methods. Prototyping that in Python is
fine, but in order to analyze it, you need to be careful about the details.

I would consider one just doing the measurements and then try different
post-processings and see how those vary.
Another paper then takes up on that and attempts analysis that matches
the numbers from actual measurements.

So, we might provide tough love, but there is a bit of experience behind
it, so it should be listened to carefully.

It is tough to come up with good artificial test data - the literature
on generating "noise samples" is significantly thinner than the
literature on measuring the noise.
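For what it's worth, a common quick-and-dirty way to get power-law noise samples is to shape white noise in the frequency domain; anything rigorous (a proper flicker generator, e.g. Kasdin-Walter style filtering) takes more care. A hedged numpy sketch, my own illustration:

```python
import numpy as np

def powerlaw_noise(n, alpha, rng):
    """n samples with PSD ~ 1/f^alpha, by spectral shaping of white
    Gaussian noise. Quick and dirty: fine for eyeballing a simulation,
    but verify it before trusting it for serious statistics."""
    spectrum = np.fft.rfft(rng.standard_normal(n))
    f = np.fft.rfftfreq(n)
    f[0] = f[1]                      # avoid dividing by zero at DC
    spectrum *= f ** (-alpha / 2.0)  # amplitude ~ f^(-a/2) -> PSD ~ 1/f^a
    return np.fft.irfft(spectrum, n)

rng = np.random.default_rng(2)
flicker = powerlaw_noise(1 << 16, 1.0, rng)   # 1/f (flicker) noise
```

One can check the result by fitting the slope of the log-log periodogram, which should come out near -alpha; the DC-bin fudge and the implicit circular boundary are exactly the kinds of details that make generated noise samples subtly wrong.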

When it comes to measuring actual signals with actual ADCs, there's also
a number of traps - you can design a nice approach, using the SNR/ENOB
data from the data sheet, and get seemingly good data.

The challenge is really in coming up with good tests of your
measurement technique that show that it really is giving you what you
think it is.

A trivial example is this (not a noise measuring problem, per se) -

You need to measure the power of a received signal - if the signal is
narrow band, and high SNR, then the bandwidth of the measuring system
(be it a FFT or conventional spectrum analyzer) doesn't make a lot of
difference - the precise filter shape is non-critical.  The noise power
that winds up in the measurement bandwidth is small, for instance.

But now, let's say that the signal is a bit wider band or lower SNR or
you're uncertain of its exact frequency, then the shape of the filter
starts to make a big difference.
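
The effect is easy to show numerically (an illustrative sketch, not code
from the post): estimate the power of a tone by summing FFT bins in a
band around it, at high and low SNR.

```python
import numpy as np

rng = np.random.default_rng(1)
fs = n = 16384                        # 1 s of data, FFT bins at integer Hz
t = np.arange(n) / fs
f0 = 1000.0
tone = np.sin(2 * np.pi * f0 * t)     # true signal power 0.5

def measured_power(snr_db, bw):
    """Estimate signal power by summing FFT bins within +/- bw/2 of f0."""
    sigma = np.sqrt(0.5 / 10 ** (snr_db / 10))
    x = tone + sigma * rng.standard_normal(n)
    X = 2 * np.abs(np.fft.rfft(x)) ** 2 / n ** 2   # single-sided bin power
    f = np.fft.rfftfreq(n, 1 / fs)
    return X[np.abs(f - f0) <= bw / 2].sum()

# High SNR: 10 Hz or 1000 Hz measurement bandwidth, both read ~0.5.
print(measured_power(40, 10), measured_power(40, 1000))
# Low SNR: the wide bandwidth integrates noise power and biases the
# estimate upward.
print(measured_power(-10, 10), measured_power(-10, 1000))
```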

Now, let’s look at a system where there’s some decimation involved - any
decimation raises the prospect of “out of band signals” aliasing into
the post decimation passband.  Now, all of a sudden, the filtering
before the decimator starts to become more important. And the number of
bits you have to carry starts being more important.
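
A small sketch of the aliasing problem (generic illustration, not the
system described here): a tone above the post-decimation Nyquist folds
into the new passband when samples are simply dropped, and a windowed-sinc
low-pass FIR placed before the decimator suppresses it.

```python
import numpy as np

fs, n, d = 8000, 8000, 4              # decimate by 4 -> new Nyquist = 1000 Hz
t = np.arange(n) / fs
x = np.sin(2 * np.pi * 2300 * t)      # 2300 Hz: above the new Nyquist

def band_power(sig, f_lo, f_hi, rate):
    X = np.abs(np.fft.rfft(sig)) ** 2
    f = np.fft.rfftfreq(len(sig), 1 / rate)
    return X[(f >= f_lo) & (f <= f_hi)].sum() / len(sig) ** 2

# Naive decimation (just drop samples): 2300 Hz folds to |2300 - 2000| = 300 Hz.
naive = x[::d]
aliased = band_power(naive, 250, 350, fs / d)

# Anti-alias first: Hamming-windowed sinc low-pass (cutoff ~880 Hz), then drop.
taps = 201
k = np.arange(taps) - taps // 2
h = 2 * 0.11 * np.sinc(2 * 0.11 * k) * np.hamming(taps)
filtered = np.convolve(x, h, mode="same")[::d]
clean = band_power(filtered, 250, 350, fs / d)

print(aliased, clean)   # the folded tone is strongly attenuated after filtering
```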

It actually took a fair amount of work to prove that a system I was
working on
a) accurately measured the signal (in the presence of other large signals)
b) that there weren’t numerical issues causing the strong signal to show
up in the low level signal filter bins
c) that the measured noise floor matched the expectation
