Scientist's Notebook Entry XML
Things it needs to support:
- some header information:
  - what notebook(s) it is part of
  - entry dates (creation, modification, etc.)
- some minimal formatting:
  - paragraphs
  - bold/italics
  - font
  - superscript/subscript
  - headlines
- references to external things
- references to other entries
- references to drag-n-dropped stuff
Initial thoughts:
<!ELEMENT entry ( entryHeader, entryBody )>
<!ATTLIST entry version CDATA #FIXED "1.0">
<!-- next two only used for annotations -->
<!ATTLIST entry parentEntryId NMTOKEN #IMPLIED>
<!ATTLIST entry parentElementId NMTOKEN #IMPLIED>
<!ELEMENT entryHeader ( notebookRef+, changeLog+ )>
<!ATTLIST entryHeader tagIDs NMTOKENS #IMPLIED>
<!--
These would be references to either global
categorizations or user-specific categorizations.
-->
<!ELEMENT notebookRef EMPTY>
<!ATTLIST notebookRef nid NMTOKEN #REQUIRED>
<!ELEMENT changeLog (#PCDATA)>
<!ATTLIST changeLog authorId NMTOKEN #REQUIRED>
<!ATTLIST changeLog date CDATA #REQUIRED>
<!ATTLIST changeLog version NMTOKEN #REQUIRED>
<!ATTLIST changeLog tags NMTOKENS #IMPLIED>
<!ENTITY % entry.font
"( strong | emph | sup | sub | size | face | color )"
>
<!ENTITY % entry.format
"( indent | framed | list | table | reference )"
>
<!ELEMENT entryBody ( section* | paragraph* )>
<!ELEMENT section ( heading, paragraph*, annotation* )>
<!ATTLIST section id NMTOKEN #REQUIRED>
<!ELEMENT heading ( #PCDATA | %entry.font; )*>
<!ELEMENT paragraph (
( #PCDATA | %entry.font; | %entry.format; )*, annotation*
)>
<!ATTLIST paragraph id NMTOKEN #REQUIRED>
<!ELEMENT annotation ( entryHeader, entryBody )>
<!ATTLIST annotation id NMTOKEN #REQUIRED>
<!--
So, when viewing an entry, it would be nice to have the
annotations included inline. But, in reality, the annotations
should be stored separately and just reference the parent
paragraph, section, or annotation id.
-->
<!ENTITY % std.fontElt "( #PCDATA | %entry.font; )*">
<!ELEMENT strong %std.fontElt;>
<!--
Should we forbid stuff like this:
<strong><strong>foo</strong></strong>
-->
<!ELEMENT emph %std.fontElt;>
<!ELEMENT sup %std.fontElt;>
<!ELEMENT sub %std.fontElt;>
<!ELEMENT size %std.fontElt;>
<!ATTLIST size ratio NMTOKEN #REQUIRED>
<!ELEMENT face %std.fontElt;>
<!ATTLIST face name (serif|sans|mono) "serif">
<!ELEMENT color %std.fontElt;>
<!ATTLIST color rgb CDATA #REQUIRED>
<!ELEMENT indent (
section | paragraph | %entry.format;
)*>
<!ELEMENT framed (
section | paragraph | %entry.format;
)*>
<!ELEMENT list ( item+ )>
<!ELEMENT item ( list | ( #PCDATA | %entry.font; )* )>
<!ELEMENT table ( ( thead?, tbody ) | tr+ )>
<!ELEMENT thead ( tr+ )>
<!ELEMENT tbody ( tr+ )>
<!ELEMENT tr ( td+ )>
<!ELEMENT td (
paragraph+
| ( %entry.format;+ )
| ( #PCDATA | %entry.font; )*
)>
<!ELEMENT reference EMPTY>
<!ATTLIST reference url CDATA #REQUIRED>
<!ATTLIST reference hash CDATA #REQUIRED>
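To make the draft above a little more concrete, here is a minimal sketch of what one entry instance might look like, parsed with Python's standard library. All ids, dates, the URL, and the hash value are made up for illustration, and ElementTree only checks well-formedness; it does not validate against the DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample entry; every id, date, URL, and hash here is invented.
SAMPLE_ENTRY = """\
<entry version="1.0">
  <entryHeader tagIDs="t1 t2">
    <notebookRef nid="nb1"/>
    <changeLog authorId="u42" date="2005-05-19" version="1">Created entry.</changeLog>
  </entryHeader>
  <entryBody>
    <section id="s1">
      <heading>Buffer <emph>prep</emph></heading>
      <paragraph id="p1">Mixed H<sub>2</sub>O with the <strong>stock</strong> solution.</paragraph>
      <paragraph id="p2">See <reference url="http://example.org/protocol" hash="abc123"/> for details.</paragraph>
    </section>
  </entryBody>
</entry>
"""

# Parsing confirms the sample is well-formed XML (not that it is DTD-valid).
entry = ET.fromstring(SAMPLE_ENTRY)

# Header information: which notebooks the entry belongs to.
notebook_ids = [ref.get("nid") for ref in entry.findall("entryHeader/notebookRef")]

# Paragraph ids are what annotations would eventually point at.
paragraph_ids = [p.get("id") for p in entry.iter("paragraph")]

print(notebook_ids, paragraph_ids)
```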
-- PatrickStein - 19 May 2005
I spent a fair bit of time yesterday reading up on W3C's XPath/XPointer and XLink recommendations. My goal was to see how we should link an annotation to the appropriate place in the entry.
The XPath standard defines the expression language used in XSLT to refer to particular elements. For example, we could use the expression id("foo")/para[position()=3]/@id to reference the id attribute of the third para element under the element whose id attribute is "foo". The goal of XPath is to be able to reference any element or attribute within any XML document.
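As a rough illustration of that kind of lookup, here is a small sketch using Python's xml.etree.ElementTree. ElementTree supports only a subset of XPath and has no id() function, so the sketch swaps in the attribute predicate [@id='foo'] instead; the doc and item element names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: an element with id "foo" holding three para children.
DOC = """\
<doc>
  <item id="foo">
    <para id="a"/>
    <para id="b"/>
    <para id="c"/>
  </item>
</doc>
"""

root = ET.fromstring(DOC)

# Approximation of id("foo")/para[position()=3] in ElementTree's XPath subset:
# find any descendant whose id attribute is "foo", then take its third para child.
third = root.find(".//*[@id='foo']/para[3]")

# The trailing /@id step becomes a plain attribute read.
third_id = third.get("id")
print(third_id)
```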
The XLink standard, on the other hand, is geared toward creating links between sections of XML documents. It relies on either coordination between the two documents or just the values of the id attributes and/or XLink resource-type attributes in the documents. For example, one could link to the element whose id attribute is "foo", but one could not link to the third paragraph under the element whose id attribute is "foo".
I think XPath is more trouble than it would be worth here. We can easily make use of the XLink resource tags to identify sections of entries. And we can easily build upon the XLink extended type to link an annotation, the entry it annotates, and its parent annotation or section.
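One possible shape for such an extended link, sketched in Python, is shown here. The xlink attribute names (type, href, label, from, to) and the namespace URI come from the XLink recommendation; the element names annotationLink, loc, and arc, the file names, and the build_annotation_link helper are all assumptions made up for this sketch, and it uses locator-type children where the final design might prefer resource-type ones.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("xlink", XLINK)


def xl(name):
    """Return the namespace-qualified form of an XLink attribute name."""
    return f"{{{XLINK}}}{name}"


def build_annotation_link(entry_href, parent_href, annotation_href):
    """Build a hypothetical extended link tying an annotation to its targets."""
    link = ET.Element("annotationLink", {xl("type"): "extended"})
    # One locator per participant: the entry, the parent element, the annotation.
    for label, href in (
        ("entry", entry_href),
        ("parent", parent_href),
        ("annotation", annotation_href),
    ):
        ET.SubElement(
            link,
            "loc",
            {xl("type"): "locator", xl("href"): href, xl("label"): label},
        )
    # An arc stating that the annotation points at its parent element.
    ET.SubElement(
        link,
        "arc",
        {xl("type"): "arc", xl("from"): "annotation", xl("to"): "parent"},
    )
    return link


# Hypothetical file names; "#p3" refers to the element whose id is "p3".
link = build_annotation_link("entry-17.xml", "entry-17.xml#p3", "annotation-5.xml")
xml_text = ET.tostring(link, encoding="unicode")
print(xml_text)
```

Because the parent is addressed by a bare id fragment, this only needs the id attributes already required on section, paragraph, and annotation, and no XPath.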
This afternoon, I will get us a source-control repository going for this stuff, get the various bits of XML in these pages worked into source-control, and get these DTDs into source-control.
-- PatrickStein - 02 Jun 2005