[vsnet-chat 3649] re Overobserving
- Date: Fri, 29 Sep 2000 16:56:24 +0000
- To: vsnet-chat@kusastro.kyoto-u.ac.jp
- From: crawl@zoom.co.uk
- Subject: [vsnet-chat 3649] re Overobserving
- Sender: owner-vsnet-chat@kusastro.kyoto-u.ac.jp
Taichi Kato wrote:
> Why not implelement the (known) correction algorithm(s) to the
> analysis package? My strategy was quite simple -- to add a constant
> value for each observer, and minimize the high-frequency power (expressed
> in other numerical form, however). This can be implemented by using a
> simple linear least-squares equation, and can be fully automatically done.
> Other algorithms, if exist, could be implemented likewise. This
> approach, however, reduces one degree of freedom from each observers, so
> observers who made only one observation is automatically given zero
> weight...
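As a rough illustration of the kind of correction described in the quote above, here is a minimal sketch. It is not Kato's actual code: it assumes that "high-frequency power" can be stood in for by the sum of squared differences between time-adjacent observations, which makes the per-observer constants the solution of an ordinary linear least-squares problem. The function name, the synthetic light curve, and the bias values are all made up for the example.

```python
import numpy as np

def observer_offsets(t, mag, observer_idx, n_observers):
    """Per-observer additive corrections that minimise the sum of squared
    differences between time-adjacent observations (an assumed proxy for
    high-frequency power). Returns one correction per observer."""
    order = np.argsort(t)
    m = np.asarray(mag, dtype=float)[order]
    who = np.asarray(observer_idx)[order]
    n_diff = len(m) - 1
    rows = np.arange(n_diff)
    # Row i encodes c[who[i+1]] - c[who[i]]; it is all zeros when the
    # same observer made both observations.
    A = np.zeros((n_diff, n_observers))
    A[rows, who[1:]] += 1.0
    A[rows, who[:-1]] -= 1.0
    # Minimise |diff(m) + A c|^2. lstsq returns the minimum-norm solution,
    # which pins down the otherwise arbitrary common shift.
    c, *_ = np.linalg.lstsq(A, -np.diff(m), rcond=None)
    return c

# Synthetic test case: a slow sinusoid plus three biased observers.
rng = np.random.default_rng(0)
n = 600
t = np.sort(rng.uniform(0.0, 1000.0, n))
who = rng.integers(0, 3, n)
true_bias = np.array([0.5, -0.3, -0.2])   # made-up personal biases (mag)
mag = (8.0 + np.sin(2.0 * np.pi * t / 400.0)
       + true_bias[who] + rng.normal(0.0, 0.1, n))

corrections = observer_offsets(t, mag, who, 3)
print(corrections)   # should come out close to -true_bias
```

Note that a scheme like this can only remove a constant personal offset. It has nothing to say about an observer whose estimates are wrong in some other way.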
Okay, I'm going to have to answer that via three more or less specific
examples, which highlight [in my view] the point that the only really
safe thing to think [or even _hope_] is that problems with raw data can be
considered as self-cancelling, or at least incoherent, non-signal-creating,
noise. As soon as any level of pre-processing is done on the raw visual
obs, this assumption is compromised further. Some of the following use
values from memory, so don't take them as rigorous periods etc!!!
1st, absolutely rubbish observations. Possibly it would be kinder to call
these "miscorrect" observations. I invariably give visual observation
datasets, no matter how massive the dataset, a quick look through with an
ascii reader and many, many presses of the "page down" key.
If you do this with quite a bit of AFOEV archived data you will find one
particular observer, who will remain fully unidentified here, who has a
strange habit. Little if any observing seems to be done for longish
periods, and then an LPV may be observed half a dozen times or more over a
night or two.
You may think this is the "overobserving" case again, probably cured by the
above linear regression route or some such. However, this observer is
_INVARIABLY_ quoting observations from half a magnitude up to two
magnitudes brighter than anybody else who observed on that night.
I just crop that observer's observations out and throw them away nowadays.
Before I noticed this happened, they probably just got lost in the noise
when I did an analysis. Granted that correction factors for this
particular observer for each particular variable could be derived and
applied, but that's a special set of algorithms for one person, not a
global route as described above.
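The "quick look" described above can be partly automated. The following is only a sketch under assumptions: observations are (night, observer, magnitude) tuples, each observer's estimates are averaged per night first so that a burst of repeats counts once, and an observer is flagged when their typical nightly value is at least half a magnitude brighter (numerically smaller) than the median of everyone else that night. The function names, the threshold, and the sample data are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, median

def nightly_offsets(obs):
    """obs: iterable of (night, observer, magnitude).
    Returns {observer: median over nights of (own nightly mean minus the
    median of the other observers' nightly means)}."""
    per_night = defaultdict(lambda: defaultdict(list))
    for night, observer, mag in obs:
        per_night[night][observer].append(mag)
    offsets = defaultdict(list)
    for by_observer in per_night.values():
        means = {o: mean(v) for o, v in by_observer.items()}
        for o, m in means.items():
            others = [x for p, x in means.items() if p != o]
            if others:   # skip nights with nobody to compare against
                offsets[o].append(m - median(others))
    return {o: median(v) for o, v in offsets.items()}

def flag_bright(obs, threshold=0.5):
    """Observers who typically report at least `threshold` mag brighter."""
    return sorted(o for o, d in nightly_offsets(obs).items() if d <= -threshold)

# Made-up data: observer "X" reports about a magnitude too bright.
sample = [
    (1, "A", 7.0), (1, "B", 7.1), (1, "X", 6.0), (1, "X", 6.1),
    (2, "A", 7.4), (2, "C", 7.5), (2, "X", 6.3),
    (3, "B", 8.0), (3, "C", 7.9),
]
flagged = flag_bright(sample)
print(flagged)
```

One side effect worth noticing: the discrepant observer drags the comparison median down, so everybody else's offsets come out slightly too faint until that observer is removed.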
2nd, a year or so ago I was lucky enough to get hold of some data from
AAVSO, which was a good trick cos in those days getting hold of AAVSO data
was like pulling teeth, but less fun.
The 1961 to more or less present data is quoted by AAVSO as being
"validated". AAVSO observers ostensibly use AAVSO stars. Yet it was
possible to find two observers who observed within 0.01 day of each other
who had observations disparate by a whole magnitude.
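A check of this sort is easy to script. The sketch below assumes observations are (JD, observer, magnitude) tuples and simply lists pairs from different observers that fall within 0.01 day of each other and differ by a magnitude or more. The function name and the sample values are invented for illustration and are not taken from any real archive.

```python
def discordant_pairs(obs, max_dt=0.01, min_dmag=1.0):
    """obs: list of (jd, observer, magnitude).
    Returns pairs by different observers that are at most `max_dt` days
    apart and differ by at least `min_dmag` magnitudes."""
    obs = sorted(obs)   # sort by time so only near neighbours are scanned
    pairs = []
    for i in range(len(obs)):
        jd_i, who_i, mag_i = obs[i]
        j = i + 1
        while j < len(obs) and obs[j][0] - jd_i <= max_dt:
            _, who_j, mag_j = obs[j]
            if who_i != who_j and abs(mag_i - mag_j) >= min_dmag:
                pairs.append((obs[i], obs[j]))
            j += 1
    return pairs

# Invented example values.
sample = [
    (2451000.500, "P", 7.2),
    (2451000.505, "Q", 8.3),   # within 0.01 d of P, 1.1 mag fainter
    (2451000.700, "R", 7.3),
    (2451003.400, "P", 7.5),
    (2451003.404, "Q", 7.6),   # close in time, but the estimates agree
]
pairs = discordant_pairs(sample)
for a, b in pairs:
    print(a, b)
```

Finding such pairs is the easy part. As the next paragraph argues, nothing in the data says which of the two observers to believe.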
Okay, let's say one of the observers was more prolific than the other, well
then, unless it can be rigidly proven that someone who observes a lot is
more likely to be more accurate, how would you choose between these two
observers? If more prolific did equal more accurate then, fair enough, the
more prolific observer's data would receive more weight. But the AFOEV
example above had a seriously "prolific" observer who always reports a lot
brighter than everybody else!!!!!
Of course, that'd be fixed if you _banned_ overobservers, extending the
guilt to all for the behaviour of a few, but these are decisions beyond
algorithms, and beyond straightforward analysis and global corrections
and/or global weightings etc.
3rd, the BAA VSS's turn. Before a now updated sequence came into force,
the BAAVSS data for UU Aurigae was seriously compromised due to comparison
star E being very blue and star F being very red [or was it the other way
round?], and the variable being quite red.
John Howarth and I analysed this beast amongst other stars in the BAAVSS
archives, as part of a general look-see, and I decided the data was
abysmal, and stubbornly refused to publish anything on it.
But I did check out what had happened. Fortunately 2 of the more prolific
observers of this star were experienced observers. One is head of a UK
variable star group, the other does the charts for the other UK variable
star group. And as luck would have it, one _invariably_ used stars E and F
in comparisons, whilst the other very rarely used these stars.
So, I analysed their data separately, and found that if people used star E
and/or F to compare UU Aur they barely saw it vary, whilst if they did not,
something akin to the real situation could be found. This had already been
more generally evident in the folded lightcurve, which showed a scattered
sinusoid superposed by thick horizontal "tramlines". The horizontal
tramlines invariably were the result of people using stars E &/or F to
estimate UU Aur... ...interestingly these themselves clustered at separate
values a few tenths of a magnitude apart, dependent on how particular sets
of observers' eyes worked!
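The split analysis can be sketched as follows, on synthetic data only. The assumptions here are that each observation carries a flag for whether star E and/or F was used, that estimates made with E/F sit near a constant value (the "tramline"), and that the rest follow a sinusoid. The 440 day period, the amplitudes, and the function name are arbitrary choices for the example, not UU Aur values.

```python
import numpy as np

def folded_amplitude(t, mag, period, n_bins=10):
    """Peak-to-peak range of the bin-averaged light curve folded on `period`."""
    phase = (t / period) % 1.0
    bins = np.minimum((phase * n_bins).astype(int), n_bins - 1)
    means = np.array([mag[bins == b].mean()
                      for b in range(n_bins) if np.any(bins == b)])
    return means.max() - means.min()

rng = np.random.default_rng(2)
n = 1000
period = 440.0                              # arbitrary illustrative period (days)
t = rng.uniform(0.0, 4000.0, n)
true = 6.0 + 0.5 * np.sin(2.0 * np.pi * t / period)
used_ef = rng.random(n) < 0.5               # hypothetical "used E and/or F" flag
mag = np.where(used_ef,
               6.0 + rng.normal(0.0, 0.1, n),    # tramline: barely varies
               true + rng.normal(0.0, 0.2, n))   # follows the real variation

amp_ef = folded_amplitude(t[used_ef], mag[used_ef], period)
amp_other = folded_amplitude(t[~used_ef], mag[~used_ef], period)
print(amp_ef, amp_other)
```

With real data the flag would come from the recorded estimates, which is why having those on file matters.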
Interestingly there are two sidepoints here. First, the BAAVSS practice of
recording the actual _estimate_ as well as the reduced observation allowed
me to reverse engineer this problem, and second the farce probably carried
on as long as it did because people were probably _not_ looking at their
previous observations in order to avoid bias!!!! Everybody is warned off
about this bias effect, but if just a few people had thought to look at
their own observations they'd have thought "hey, this star is constant",
noted the point to someone, investigations would have ensued and it would
have been found not to be constant, and the problem would have been solved
earlier.
There are too many dogmatic "mustn't do this" statements in visual variable
star observing, and checking them would be a good project for some group out
there... ...you'd need enough folk to be able to split into two teams, one
being a control group.
Anyway, the point of the UU Aur ramblings? Well, the BAAVSS data on it is
seriously compromised because of this, yet if you do a "simple" DFT on it you
can still readily recover the 400+ and 200+ day periods of this doubly
periodic semiregular from the data, with results very little different
period-wise from those you get if you analyse the independent AFOEV data for
this star over the same time interval, the AFOEV data being problem free. The
amplitude of the peaks is of course seriously reduced in the result from
the BAAVSS data, but they're still readily distinguished.
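The behaviour described above can be reproduced on synthetic data. This sketch uses a plain direct-summation DFT for unevenly spaced times. It assumes two sinusoids with arbitrary periods of 440 and 235 days (illustrative only, not measured UU Aur values), and it "compromises" 40% of the points by replacing them with near-constant estimates. All names and numbers are made up for the example.

```python
import numpy as np

def dft_amplitude(t, mag, freqs):
    """Amplitude spectrum of unevenly sampled data by direct summation."""
    y = mag - mag.mean()
    phase = 2.0 * np.pi * np.outer(freqs, t)     # shape (n_freqs, n_obs)
    return 2.0 * np.abs(np.exp(-1j * phase) @ y) / len(t)

rng = np.random.default_rng(1)
n = 800
t = np.sort(rng.uniform(0.0, 4000.0, n))
signal = (0.5 * np.sin(2.0 * np.pi * t / 440.0)
          + 0.3 * np.sin(2.0 * np.pi * t / 235.0))
clean = 6.0 + signal + rng.normal(0.0, 0.2, n)

# Replace 40% of the points with near-constant "tramline" estimates.
tram = rng.random(n) < 0.4
bad = np.where(tram, 6.0 + rng.normal(0.0, 0.1, n), clean)

freqs = np.linspace(1.0 / 1000.0, 1.0 / 100.0, 2000)   # cycles per day
amp_clean = dft_amplitude(t, clean, freqs)
amp_bad = dft_amplitude(t, bad, freqs)

peak_clean = 1.0 / freqs[amp_clean.argmax()]
peak_bad = 1.0 / freqs[amp_bad.argmax()]
print(peak_clean, amp_clean.max())
print(peak_bad, amp_bad.max())
```

In this toy case the bad points are incoherent with the signal, so they lower the peak without moving it.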
In other words, if you just accept that every non-valid observation of a
variable star is only going to add to the _noise_, without creating any
spurious periodicities, then even with very bad data for a variable, you
can recover its period[s]. [Over and above other problems like annual
"pseudo-aliasing", and even some good old fashioned beat aliasing for some
low amplitude objects].
John Howarth, who has serious pieces of paper for maths and statistics from
_top_ UK universities, went on to correct this data on his own using
methodologies and routines beyond my understanding, and presented a paper
to the BAA which'll get in the JBAA eventually. The results he got weren't
much different than the ones I got without cleaning up the data, and
despite weightings and corrections and sleights of hand, still weren't as
good as those gleaned from the AFOEV data. Yet all these datasets/routes
presented periods within a day of each other.
Of course, for very low amplitude objects the aforementioned noise can mean
some data is hidden, but global fixes are only going to move the noise
around, not remove it.
You see, the basic assumption appears to be that all we need to do is fix
scatter due to different instruments [I include the eye-brain combination
as an "instrument"] and conditions, and if you allow weightings or
de-weightings due to observer frequency or skill or what have you, the
noise will magically go away.
This neglects the fact that a not insignificant proportion of all visual
archives are so much crap, and irrecoverable no matter how clever the
analytical technique used.
And I thoroughly expect to be strung up for the last sentence.
Cheers
John
JG, UK