The format of a result from media search

For pedagogical reasons, the format is explained step by step below.

The result from a media search contains a sequence of <talia:entry> elements, one for each contribution, or contribution version, that was found:

<?xml version="1.0" encoding="UTF-8"?>
<talia:result xmlns:talia="http://trac.talia.discovery-project.eu/wiki/Exist#" total="123">

    <talia:entry>...</talia:entry>

    <talia:entry>...</talia:entry>

    <talia:entry>...</talia:entry>

    <talia:entry>...</talia:entry>

    <talia:entry>...</talia:entry>

    ...

</talia:result>
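
As a minimal sketch of how a client might read such a result, the following Python snippet parses a response with the standard library and counts the entries. The sample document is invented for illustration and the function name list_entries is hypothetical; only the namespace URI, the element names and the total attribute are taken from the example above.

```python
import xml.etree.ElementTree as ET

# Namespace URI as it appears in the example result above.
TALIA_NS = {"talia": "http://trac.talia.discovery-project.eu/wiki/Exist#"}

# Invented sample response, for illustration only.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<talia:result xmlns:talia="http://trac.talia.discovery-project.eu/wiki/Exist#" total="2">
    <talia:entry><talia:abstract>first</talia:abstract></talia:entry>
    <talia:entry><talia:abstract>second</talia:abstract></talia:entry>
</talia:result>"""


def list_entries(xml_text):
    """Return (total attribute as int, list of <talia:entry> elements)."""
    # Encode to bytes so the XML declaration's encoding is honoured.
    root = ET.fromstring(xml_text.encode("utf-8"))
    entries = root.findall("talia:entry", TALIA_NS)
    return int(root.get("total")), entries


total, entries = list_entries(SAMPLE)
print(total, len(entries))
```

Note that total and the number of returned entries only happen to agree in this sample; for a paged result they can differ.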

The returned entries have the following format:

<talia:entry>

    <talia:metadata>
    
        <talia:maintype>AvMedia</talia:maintype>
        <talia:type>transcription</talia:type>
        <!-- in general here might be a <talia:subtype> element,
             with value "audio" or "video",
             but for the RAI media contributions
             we have no such value and element -->
        <talia:uri>http://a.b.c/ccc</talia:uri>
        <talia:url>http://d.e.f/fff</talia:url>
        <talia:authors>
            <talia:author>
                <talia:firstname>Andrew Williams</talia:firstname>
                <talia:lastname></talia:lastname>
                <!--
                in general here might be a <talia:uri> element,
                but the RAI media contributions have no URI
                -->
            </talia:author>
            ...
        </talia:authors>
        <talia:title>...</talia:title>
        <talia:standard_title>...</talia:standard_title>
        <talia:language>it</talia:language>
        <talia:date>2006-03-03</talia:date>
        <talia:creation_date>2000-11-22</talia:creation_date>
        <talia:length>...</talia:length>
        <talia:bibliography>...</talia:bibliography>
        <talia:keywords>
            <talia:keyword>...</talia:keyword>
            ...
        </talia:keywords>
        
    </talia:metadata>

    <talia:abstract>...</talia:abstract>
    
</talia:entry>
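
The entry format above could be read into a plain data structure roughly as follows. This is only a sketch: the sample entry is invented, the helper names text_of and summarize_entry are hypothetical, and only a few of the metadata fields are collected. Optional elements such as <talia:subtype> are returned as None when they are absent.

```python
import xml.etree.ElementTree as ET

TALIA = "http://trac.talia.discovery-project.eu/wiki/Exist#"
NS = {"talia": TALIA}

# Invented sample entry, for illustration only.
SAMPLE_ENTRY = f"""<talia:entry xmlns:talia="{TALIA}">
    <talia:metadata>
        <talia:maintype>AvMedia</talia:maintype>
        <talia:type>transcription</talia:type>
        <talia:uri>http://a.b.c/ccc</talia:uri>
        <talia:authors>
            <talia:author>
                <talia:firstname>Andrew Williams</talia:firstname>
                <talia:lastname></talia:lastname>
            </talia:author>
        </talia:authors>
        <talia:title>Sample title</talia:title>
        <talia:language>it</talia:language>
        <talia:keywords>
            <talia:keyword>k1</talia:keyword>
            <talia:keyword>k2</talia:keyword>
        </talia:keywords>
    </talia:metadata>
    <talia:abstract>Sample abstract</talia:abstract>
</talia:entry>"""


def text_of(parent, path):
    """Return the full text content at path, or None if the element is absent."""
    node = parent.find(path, NS)
    # itertext() also picks up text nested inside child tags such as match markers.
    return None if node is None else "".join(node.itertext()).strip()


def summarize_entry(entry):
    """Collect a few fields of a <talia:entry> into a plain dict."""
    meta = entry.find("talia:metadata", NS)
    authors = []
    for author in meta.findall("talia:authors/talia:author", NS):
        first = text_of(author, "talia:firstname") or ""
        last = text_of(author, "talia:lastname") or ""
        authors.append(" ".join(part for part in (first, last) if part))
    return {
        "type": text_of(meta, "talia:type"),
        "subtype": text_of(meta, "talia:subtype"),  # may be absent
        "uri": text_of(meta, "talia:uri"),
        "title": text_of(meta, "talia:title"),
        "authors": authors,
        "keywords": [k.text for k in meta.findall("talia:keywords/talia:keyword", NS)],
        "abstract": text_of(entry, "talia:abstract"),
    }


summary = summarize_entry(ET.fromstring(SAMPLE_ENTRY))
print(summary)
```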

If the <talia:title> or <talia:abstract> elements have been searched, the search on these elements was a full-text search, and in that case they will contain <exist:match> tags that mark the search words found. The user interface can use these "match" tags to highlight the search words when the title and abstract are shown. Whether a different tag should be used instead, e.g. an "explicit" HTML tag like <b>, is still under discussion. The <exist:match> tag is the one delivered by eXist itself, but the servlet might transform it into something else.

(Note: If a search word occurs in both title and abstract, but was only searched in title, it will only get "match" tags in the title element, not in abstract; and vice versa.)
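
A user interface could turn the match tags into highlighting along the following lines. This is a sketch under assumptions: the exist namespace URI below is the one eXist normally uses and is not stated on this page, the sample title is invented, and the function name highlight is hypothetical. The <b> tag is just the alternative mentioned above, not a decided format.

```python
import xml.etree.ElementTree as ET
from html import escape

TALIA = "http://trac.talia.discovery-project.eu/wiki/Exist#"
# Assumption: the namespace eXist normally uses for its match markers.
EXIST = "http://exist.sourceforge.net/NS/exist"
MATCH_TAG = f"{{{EXIST}}}match"

# Invented sample title with one marked search word, for illustration only.
SAMPLE_TITLE = (
    f'<talia:title xmlns:talia="{TALIA}" xmlns:exist="{EXIST}">'
    "La <exist:match>storia</exist:match> di Roma &amp; dintorni"
    "</talia:title>"
)


def highlight(element):
    """Render an element's text as HTML, wrapping each match in <b>."""
    parts = [escape(element.text or "")]
    for child in element:
        inner = highlight(child)
        parts.append(f"<b>{inner}</b>" if child.tag == MATCH_TAG else inner)
        # Text that follows the child belongs to the parent's content.
        parts.append(escape(child.tail or ""))
    return "".join(parts)


html_title = highlight(ET.fromstring(SAMPLE_TITLE))
print(html_title)
```

All text outside the match tags is escaped again, so the output can be placed into an HTML page as it is.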

How to handle long result lists has not been discussed yet, so the description below might not be valid.

Finally, the root element <talia:result> contains attributes that tell the size of the result and give relevant information if the result has been divided into "pages" ("chunks"). The attributes are:

total
limit
page
first
last

total tells the number of entries found (the number of found contributions or versions). If the search request asked for the result to be divided into pages, the total value might be larger than the number of entries returned.

limit is the page size, if the search request asked for the result to be divided into pages. Otherwise it is the special value all.

page is the number of the page returned, if the search request asked for the result to be divided into pages. Otherwise it is 1.

first is the number of the first entry in the page returned, if the search request asked for the result to be divided into pages. Otherwise it is 1.

last is the number of the last entry in the page returned, if the search request asked for the result to be divided into pages. Otherwise it is equal to the total value.
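
The attributes described above might be read like this. It is a sketch only: both sample roots are invented, the helper names int_or_none and read_paging are hypothetical, and the sample numbers assume that pages and entries are counted from 1 and that pages are contiguous. Attributes that are missing from the root element are returned as None.

```python
import xml.etree.ElementTree as ET

NS_URI = "http://trac.talia.discovery-project.eu/wiki/Exist#"

# Invented sample roots, for illustration only. The numbers assume that
# entries and pages are counted from 1 and that pages are contiguous.
SAMPLE_PAGED = (
    f'<talia:result xmlns:talia="{NS_URI}" '
    'total="123" limit="10" page="3" first="21" last="30"/>'
)
SAMPLE_UNPAGED = (
    f'<talia:result xmlns:talia="{NS_URI}" '
    'total="123" limit="all" page="1" first="1" last="123"/>'
)


def int_or_none(value):
    """Convert an attribute value to int; a missing attribute stays None."""
    return None if value is None else int(value)


def read_paging(root):
    """Read the paging attributes of <talia:result> into a dict."""
    limit = root.get("limit")
    return {
        "total": int_or_none(root.get("total")),
        # The special value "all" means the result was not divided into pages.
        "limit": None if limit in (None, "all") else int(limit),
        "paged": limit not in (None, "all"),
        "page": int_or_none(root.get("page")),
        "first": int_or_none(root.get("first")),
        "last": int_or_none(root.get("last")),
    }


paged = read_paging(ET.fromstring(SAMPLE_PAGED))
unpaged = read_paging(ET.fromstring(SAMPLE_UNPAGED))
print(paged)
print(unpaged)
```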

(Note: At the time of writing (2008-09-17/2008-10-28) the first and last values might not be correct. If the request asks for the last page, which is usually smaller than the other pages, the last value is computed as if the last page were as large as the others; it does not reflect the real, smaller size of the page. Also, if the request asks for a non-existing page, i.e. a page beyond the end of the result, the first and last values are computed as if that page existed.)
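
Until this is fixed, a client could recompute the range itself from total, limit and page instead of trusting first and last. The following is a sketch of one possible workaround, not something the servlet does; the function name corrected_range is hypothetical, and it assumes that pages and entries are counted from 1.

```python
def corrected_range(total, limit, page):
    """Return (first, last) for a 1-based page, or None if the page does not exist.

    limit is the page size as an int, or None when the result was not
    divided into pages (the special value "all").
    """
    if limit is None:
        # Unpaged result: the single page covers everything.
        return (1, total) if page == 1 else None
    first = (page - 1) * limit + 1
    if page < 1 or first > total:
        # The requested page lies outside the result.
        return None
    # Clamp last so that a short final page is reported with its real size.
    return first, min(page * limit, total)


print(corrected_range(123, 10, 3))
print(corrected_range(123, 10, 13))
print(corrected_range(123, 10, 14))
```

With total 123 and limit 10, page 13 is the short last page holding entries 121 to 123, and page 14 does not exist.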