
[vsnet-chat 2598] Fixing catalogues



     Mati Morel suggested that VSnet might be a place to "publish" various
corrections and convoluted errors one finds in the literature.  I agree to
some extent, but would encourage him and others who find problems to
contact Gerard Jasniewicz (gerard@simbad.u-strasbg.fr) as well; he is the
official "fixer" of SIMBAD.  Most professional astronomers don't know that
VSnet exists, so future journal articles will rarely take into account 
corrections sent only to this forum, whereas almost everybody has a look at
SIMBAD before they publish a paper.  When you write, it is distinctly helpful
to provide the SIMBAD folks with a paper trail in the literature (okay, they
pretty much insist on it) so they have a way of double-checking your
correction.  One of the common mistakes made by folks sending in "corrections"
is some sort of misunderstanding or misinterpretation of a datum---often the
only "error" is a mistake made by the person making the complaint!
     Although you might find problem cases or outright errors in old or new
catalogues, it is best to check SIMBAD first to see what they have, since
their entries tend to have been gone over to some extent.  The error you
find may already have been found and corrected, as we found yesterday for
the carbon/F-star pair.

     I tend to regard SIMBAD as representative of the state of astronomical
knowledge as regards basic data (position, magnitude, spectral type, motion,
radial velocity, etc.).  If something doesn't show up in the SIMBAD header
for a star (say), even if there's a well-known catalogue that contains that
datum, then I consider it to be "unknown".  The reason simply is that nowadays
few people will look any further than SIMBAD (and/or ADS, which shares the
SIMBAD bibliographic database) in plowing through the literature, and will not
have any comprehensive knowledge of the literature beyond this, particularly
outside a narrow specialty.  It's easy to be sarcastic about people being
dumb/lazy, but with a little thought one can anticipate zones of ignorance and
facilitate things.
     Similarly I regard the folks who maintain SIMBAD to be "dumb" in the
sense that they cannot be expected to be hep to every arcane area of astronomy.
I do not mean any disrespect by this.  That the database is as good as it is
with so few people working on it (the core group is only about a dozen people)
is fairly amazing, considering how heavily bombarded they are with the current
literature, not to mention such things as 2MASS or DENIS returning at minimum
a million new entries for them to deal with.  Thus in my star-cataloguing work
I try to do all the things they would do (or would want done if they had the
time) in preparing a file for inclusion in their system.  It simply reduces
work at their end, and serves to get the results into the system faster.  
     Perhaps the most common type of "error" in SIMBAD is multiple entries
for the same object.  This is a common feature of all such large "compilation"
databases, and such duplicates could be 10 or 20 percent of the total.  I
report these on a
regular basis to Jasniewicz without much documentation, since it is evident by
a close match in coordinates that the two (or more) entries pertain to a single
object.  My recent IBVS lists of variable star coordinates (IBVS 4719 et seq.)
included such identifications as I could find so that the entries could all be
merged in SIMBAD.  Such work really helps clean up the database.  Another way
of linking is by spectral output:  the white dwarf is the PG star is the UV
source detected on a balloon flight.  I've been burned in the past, however, by
things like proper motion, widely-separated pairs of similar brightness, and
other 'gotchas' as a result of not checking every way I might be wrong.
Luckily Gerard does this checking, too, and occasionally asks for clarification
or simply disagrees---which sets me to figuring out what I did wrong, and I
learn something!  Sometimes there's still an error, but a more subtle one than
I first thought.
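
     The coordinate-matching step described above can be sketched in a few
lines of Python.  This is only an illustration, not the procedure actually
used for the IBVS lists: the function names, the 5-arcsecond tolerance, and
the sample entries in the usage note are all made up.  It assumes every
position is already on the same equinox and epoch, which is exactly where
proper motion can burn you.

```python
import math
from itertools import combinations


def angular_separation_arcsec(ra1_deg, dec1_deg, ra2_deg, dec2_deg):
    """Great-circle separation in arcseconds (haversine form, stable for small angles)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1_deg, dec1_deg, ra2_deg, dec2_deg))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    h = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    # Clamp against rounding so asin never sees a value above 1.
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(h)))) * 3600.0


def candidate_duplicates(entries, tol_arcsec=5.0):
    """Return (name1, name2, separation) for every pair closer than tol_arcsec.

    entries: list of (name, ra_deg, dec_deg), all on one equinox and epoch.
    The tolerance is an assumed value; tune it to the catalogues involved.
    """
    pairs = []
    for (n1, ra1, dec1), (n2, ra2, dec2) in combinations(entries, 2):
        sep = angular_separation_arcsec(ra1, dec1, ra2, dec2)
        if sep <= tol_arcsec:
            pairs.append((n1, n2, sep))
    return pairs
```

     Calling candidate_duplicates([("A", 10.0, 20.0), ("B", 10.0, 20.0005),
("C", 10.1, 20.0)]) flags only the A/B pair.  A close match is a candidate,
not a proof: a wide pair of similar brightness passes this test just as
easily, so each hit still needs checking by eye.
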
     Most true errors in SIMBAD are simply errors from source papers and
catalogues that have been faithfully copied into the database.  Again, it is
not their business to vet the data in every paper, but to index the contents
of the literature in some recoverable way (even if what's there is wrong).
Thus if you can point them to a specific paper that offers a correction to
something in their object headers, then by all means cite it to them.  I've
found cases where a minor error in the literature leads to a jumble in SIMBAD.
Take an imaginary case of a carbon star that has a nearby F-type star where
the F spectral type was misattributed, and papers mentioning one or the other
star will end up in a single entry in the database.  Something like an error
in coordinates can yield foul-ups like this.  The job then is to sort out
the correct precise coordinates of _both_ stars, figure out which of the
(usually several) names given for the object apply to which star, and also to
look at the bibliography to decide which papers apply to which star.  Since all
too often you end up having to actually look at the papers involved, a complete
library is necessary (the papers you want of course won't be scanned into ADS).
I write up a report detailing all this and send it to Gerard.
     There are also blunders originating with SIMBAD, but they are a distinct
minority.  I recently looked up some stars from a paper and found that 
somehow 20h of RA had been subtracted from the positions (they were shown at
3h RA when they were actually at 23h!).  The source paper also supplied
equinox 2000 positions (clearly indicated as such), but the positions had
nevertheless been precessed forward by 50 years, so that the SIMBAD header
showed eq. 2050 positions in the line for eq. 2000, and eq. 2000 positions in
the line for 1950---this is in addition to the gross RA error!  This snafu
affected some 400 stars from one paper, and I suspect Gerard has probably put
that one on his back burner for "someday".
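
     A blunder like the 20h shift shows up as the same offset on every star
from the paper, which makes it easy to test for.  The sketch below is a
hypothetical helper, not anything SIMBAD provides; the function name and the
tolerance are assumptions.  Differences are deliberately not wrapped at 24h,
since the case above was a plain subtraction.

```python
def common_ra_offset(ra_pairs_hours, tol_hours=0.01):
    """Return the shared (database - published) RA offset in hours, or None.

    ra_pairs_hours: list of (published_ra, database_ra) in decimal hours.
    A shared non-zero offset points to a systematic blunder rather than
    random errors in individual positions.
    """
    if not ra_pairs_hours:
        return None
    diffs = [db - pub for pub, db in ra_pairs_hours]
    mean = sum(diffs) / len(diffs)
    # Every star must show the same offset, within the assumed tolerance.
    if all(abs(d - mean) <= tol_hours for d in diffs):
        return mean
    return None
```

     Fed made-up pairs such as (23.10, 3.10), (23.25, 3.25) and (23.40, 3.40),
this returns an offset of about -20 hours; a list where only some stars are
shifted returns None.
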
     In another recent case, I got positions for stars with photometric data
in a field where two sets of stars had been observed by the same authors.
The stars were numbered and identified only on charts.  Because the two series
had been assigned the same serial numbers, whoever added them to SIMBAD ended
up giving different stars identical positions but different data, and
subsequent follow-up observations of some of the stars (and the addition of
the Tycho catalogue) had really mangled things.  Only three or four dozen
stars, luckily, but sorting out that one earned me the title "grand nettoyeur
des catalogues" from Gerard.

\Brian
