There Is No Global Keywords Standard
So, with a LarchDataDescription, you'd like to be able to say: "Here's a blob. Here's its data descriptor. When was this data taken?"
For some data, this might be keyed as "DATE", for some as "CREATED", and for others it might exist only as the time-stamp on the file. So, how can we provide a rich enough set of keywords that users don't have to search for every possible keyword, and yet still not stomp on the specific ones an instrument may use?
Part of this can be solved with a simple prefixing scheme. For example, we could pick a set of canonical names (xref. IANA). None of our names would contain any colons. We can index a DATE from WASP under three keys: CREATED, :DATE, and WASP:DATE. Then, users can explicitly search on the actual keyword (:DATE), on WASP-specific DATE tags (WASP:DATE), or on our canonical CREATED. We would keep our CREATED in whatever timestamp format we deem appropriate (ISO 8601, swatch-time, you-name-it). We would probably want to keep the :DATE and WASP:DATE in whatever format most resembles what WASP would use to present that information to the user. Of course, doing simple 'BEFORE' and 'AFTER' comparisons on instrument-specific renditions of the date may be nigh impossible.
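A minimal sketch of the prefixing scheme in Python, to make the three index keys concrete. Everything here is an assumption for illustration: the CANONICAL table, the index_keys helper, and in particular the date format that parse_wasp_date expects (nothing above says how WASP actually writes its dates).

```python
from datetime import datetime

# Hypothetical table: (instrument, native keyword) -> our canonical name.
# Canonical names never contain a colon, so they cannot collide with
# ":KEYWORD" or "INSTRUMENT:KEYWORD" entries.
CANONICAL = {("WASP", "DATE"): "CREATED"}


def index_keys(instrument, keyword, raw_value, parse):
    """Return the index entries for one keyword found in an instrument's data."""
    entries = {
        # Bare keyword, as the instrument wrote it.
        f":{keyword}": raw_value,
        # Instrument-qualified keyword, also in the instrument's own format.
        f"{instrument}:{keyword}": raw_value,
    }
    canonical = CANONICAL.get((instrument, keyword))
    if canonical is not None:
        # Canonical entry, normalized to ISO 8601 so it can be compared.
        entries[canonical] = parse(raw_value).isoformat()
    return entries


def parse_wasp_date(text):
    # ASSUMPTION: day/month/year is a made-up format for this sketch only.
    return datetime.strptime(text, "%d/%m/%Y")


entries = index_keys("WASP", "DATE", "25/03/2005", parse_wasp_date)
```

With this layout, 'BEFORE' and 'AFTER' queries would only ever run against the normalized CREATED entry, while :DATE and WASP:DATE remain exact-match lookups in the instrument's own rendition.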
Also, does it make sense to have error bars on numeric ranges? Some sources keep track of both date and time, but others only keep it down to the date.
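One possible way to handle the differing precision, sketched in Python: store each timestamp as a half-open interval rather than a single instant, so a date-only value covers the whole day. The TimeRange type and the helper names are hypothetical, and treating precision as an interval is just one reading of "error bars".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class TimeRange:
    start: datetime  # inclusive
    end: datetime    # exclusive


def from_date_only(year, month, day):
    """A value known only to the day spans that entire day."""
    start = datetime(year, month, day)
    return TimeRange(start, start + timedelta(days=1))


def from_second(moment):
    """A value known to the second spans that one second."""
    return TimeRange(moment, moment + timedelta(seconds=1))


def definitely_before(a, b):
    """True only if every instant in a precedes every instant in b."""
    return a.end <= b.start


def may_overlap(a, b):
    """True if the two values could refer to the same instant."""
    return a.start < b.end and b.start < a.end
```

A 'BEFORE' search could then report only the definitely_before matches, and optionally list the may_overlap cases separately as uncertain.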
--
PatrickStein - 25 Mar 2005
It might be helpful to offer the user who is "introducing" a data set to the system a selection of keywords, combined with the ability to create and define a new keyword where no suitable one exists. A rule like "you can only attach keywords from the list" could be imposed, along with the ability to make a new keyword and put it in the list. Would this keep the keyword dictionary from becoming chaotic, or would it keep people from using keywords?
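The "only from the list, but you may extend the list" rule could look roughly like this in Python. KeywordRegistry and its methods are hypothetical names for this sketch, and requiring a non-empty definition is an added assumption meant to discourage careless additions.

```python
class KeywordRegistry:
    """A shared list of keywords; data sets may only use listed keywords."""

    def __init__(self):
        self._definitions = {}

    def define(self, name, definition):
        """Add a new keyword to the list, with a human-readable definition."""
        if not definition.strip():
            raise ValueError("a new keyword needs a definition")
        if name in self._definitions:
            raise ValueError(f"keyword {name!r} is already defined")
        self._definitions[name] = definition

    def choices(self):
        """The selection of keywords to offer to the user."""
        return sorted(self._definitions)

    def attach(self, dataset_keywords, name, value):
        """Attach a keyword to a data set, refusing unlisted keywords."""
        if name not in self._definitions:
            raise KeyError(f"unknown keyword {name!r}; define it first")
        dataset_keywords[name] = value


registry = KeywordRegistry()
registry.define("CREATED", "When the data was taken")

keywords = {}
registry.attach(keywords, "CREATED", "2005-03-25")
```

Whether the extra define step is enough friction to keep the dictionary tidy, or so much that people stop tagging, is exactly the open question above.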
One could also use a crawler to build a list of the data sets that use each keyword. That might be helpful in constructing a keyword translator.
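The crawler's output is essentially an inverted index from keyword to data sets. A rough Python sketch, assuming the crawl has already produced a mapping from data set identifier to the keywords found in it (the identifiers and keywords in the usage lines are made up):

```python
from collections import defaultdict


def build_keyword_index(datasets):
    """Map each keyword to the set of data sets that use it.

    `datasets` maps a data set identifier to an iterable of its keywords.
    """
    index = defaultdict(set)
    for dataset_id, keywords in datasets.items():
        for keyword in keywords:
            index[keyword].add(dataset_id)
    return dict(index)


# Made-up crawl results, for illustration only.
crawled = {
    "set-a": ["DATE", "EXPOSURE"],
    "set-b": ["CREATED"],
    "set-c": ["DATE"],
}
keyword_index = build_keyword_index(crawled)
```

Keywords that never appear together in one data set but play the same role (DATE in some, CREATED in others) would be natural candidates for a human to review when building the translator.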
--
HarveyRhody - 26 Mar 2005