This page is outdated

This page is outdated, as are some other pages about feeding and searching data in eXist.

See the new pages:

http://trac.talia.discovery-project.eu/wiki/ExistFeedingSpecs

http://trac.talia.discovery-project.eu/wiki/ExistSearchRequest

http://trac.talia.discovery-project.eu/wiki/ExistNormalSearchResult

http://trac.talia.discovery-project.eu/wiki/ExistMacrocontributionSearchResult

http://trac.talia.discovery-project.eu/wiki/ExistMediaSearchResult




XML Format for documents in eXist

(see also #585 and ExistSearchDescription)

Original Format

The main structure is:

<version>
  <metadata>
      ...
      <critical_edition>
          ...
      </critical_edition>
  </metadata>
  <data>
      ... html ...
  </data>
</version>
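
As a rough illustration of how this skeleton can be navigated, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The sample document, its values (the second siglum "XY" and the xhtml snippet) and the helper name summarize_version are hypothetical and only serve the example; they are not taken from the actual eXist feeding code.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample that follows the skeleton above; the values are made up.
SAMPLE = """\
<version>
  <metadata>
    <critical_edition>
      <siglum>WS</siglum>
    </critical_edition>
    <critical_edition>
      <siglum>XY</siglum>
    </critical_edition>
  </metadata>
  <data><p>Some <i>cleaned</i> xhtml.</p></data>
</version>
"""


def summarize_version(xml_text):
    """Return (critical edition sigla, plain text of the data element)."""
    version = ET.fromstring(xml_text)
    # There is one critical_edition element per edition the contribution belongs to.
    sigla = [
        edition.findtext("siglum", default="")
        for edition in version.findall("metadata/critical_edition")
    ]
    data = version.find("data")
    # itertext() flattens the xhtml markup into its text content.
    text = "".join(data.itertext()) if data is not None else ""
    return sigla, text


if __name__ == "__main__":
    print(summarize_version(SAMPLE))
```

The sketch takes "version" as the root element, as in the skeleton; for a stored document with an outer "document" element, the lookup would have to start one level deeper.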

Here's the full version, wrapped in the outer "document" element:

<?xml version='1.0' encoding='utf-8'?>
<document>

  <!--
  many contributions in hyper
  have several displayed versions,
  e.g., linear layer 1 vs. diplomatic layer 1.
  the eXist db contains one xml document
  for each displayed version.
  the "version" element represents the version.
  there are historical reasons for the extra
  "document" element around the "version" element
  -->
  <version>

    <metadata>
      <!--
      it's useful to have a human-recognizable id
      for the documents stored in an eXist db.
      we use as an id a string
      made from the siglum,
      the version type and the layer number
      -->
      <id>...</id>
      <!-- contribution type, e.g., "essays" -->
      <type>...</type>
      <siglum>...</siglum>
      <authors>
        <author>
          <lastname>...</lastname>
          <firstname>...</firstname>
        </author>
        ...
      </authors>
      <title>...</title>
      <!--
      a "standard" title here if there is
      no title in the postgresql db.
      not used for search
      -->
      <standard_title>...</standard_title>

      <!-- iso two-letter language code -->
      <language>...</language>
      <!-- iso date format, e.g., "2007-11-27" -->
      <date>...</date>
      <!--
      critical edition data -
      one element per critical edition
      the contribution belongs to
      -->
      <critical_edition>
        <!-- siglum of the critical edition -->
        <siglum>...</siglum>
        <!-- name of the critical edition -->
        <name>...</name>
        <!-- siglum of the work -->
        <work_siglum>...</work_siglum>
        <!-- name of the work -->
        <work_name>...</work_name>
        <!-- name of the related material -->
        <related_material_name>...</related_material_name>
        <!--
        the related material's position within the work
        (not within the critical edition)
        -->
        <position_within_work>...</position_within_work>
        <!-- siglum of the related material -->
        <related_material_siglum>...</related_material_siglum>
        <!--
        a hierarchical position string,
        made up of work siglum, underscore,
        and position within work (5 digits, zero filled),
        e.g., "WS_00013"
        -->
        <position>...</position>
        <!--
        an element to allow "All" search.
        similar to the "position" element.
        made up of "all", underscore,
        and position within work (5 digits, zero filled),
        e.g., "all_00013"
        -->
        <all_position>...</all_position>
      </critical_edition>
    </metadata>
    <!--
    the "data" element contains the html
    of the contribution version.
    the html has been "cleaned" and is well-formed xhtml.
    if html is not available, pure text is stored
    (or nothing when there is no text).
    -->
    <data>...</data>
  </version>
</document>
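
The "position" and "all_position" values described in the comments above follow a simple pattern: a prefix, an underscore, and the position within the work as five zero-filled digits. The Python sketch below shows one way such strings could be built. The function names make_position and position_fields are hypothetical, and the range check is an added assumption, not something the original format specifies.

```python
def make_position(prefix, position_within_work):
    """Build a string such as "WS_00013": prefix, underscore, 5 zero-filled digits."""
    # Assumption: a position must fit into 5 digits; the source does not say
    # what happens with larger values.
    if not 0 <= position_within_work <= 99999:
        raise ValueError("position must fit in 5 digits")
    return f"{prefix}_{position_within_work:05d}"


def position_fields(work_siglum, position_within_work):
    """Return the values for the "position" and "all_position" elements."""
    return {
        "position": make_position(work_siglum, position_within_work),
        "all_position": make_position("all", position_within_work),
    }


if __name__ == "__main__":
    print(position_fields("WS", 13))
```

One plausible benefit of the zero filling is that the resulting strings sort lexicographically in the same order as the numeric positions, although the page itself does not state the reason.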