The format of a result from a "normal" search

For pedagogical reasons, the format is explained step by step.

The result from a "normal" search contains a sequence of <talia:entry> elements, one for each contribution or contribution version found:

<?xml version="1.0" encoding="UTF-8"?>
<talia:result xmlns:talia="http://trac.talia.discovery-project.eu/wiki/Exist#" total="123">

    <talia:entry>...</talia:entry>

    <talia:entry>...</talia:entry>

    <talia:entry>...</talia:entry>

    <talia:entry>...</talia:entry>

    <talia:entry>...</talia:entry>

    ...

</talia:result>

If the search contains full text criteria, the returned entries represent contribution versions.

If the search was for metadata criteria only, the returned entries represent contributions.
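As an illustration of the outer structure, the following is a minimal sketch of how a client might read the total attribute and the list of entries with Python's standard xml.etree.ElementTree module. The namespace URI is copied from the sample above; the shortened SAMPLE document and the function name read_entries are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the sample result above.
NS = {"talia": "http://trac.talia.discovery-project.eu/wiki/Exist#"}

# Hypothetical, shortened result document used only for illustration.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<talia:result xmlns:talia="http://trac.talia.discovery-project.eu/wiki/Exist#" total="123">
    <talia:entry/>
    <talia:entry/>
    <talia:entry/>
</talia:result>"""


def read_entries(xml_text):
    """Return (total, list of <talia:entry> elements) for a result document."""
    # Parse from bytes so that the encoding declaration is honoured.
    root = ET.fromstring(xml_text.encode("utf-8"))
    total = int(root.get("total"))
    entries = root.findall("talia:entry", NS)
    return total, entries


total, entries = read_entries(SAMPLE)
print(total, len(entries))
```

Note that total and the number of returned entries differ here, which is what a paged result would look like.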

The returned entries themselves have the following format:

<talia:entry>

    <talia:metadata>
    
        <talia:maintype>contribution</talia:maintype>
        <talia:type>transcription</talia:type>
        <talia:subtype>hnml</talia:subtype>
        <talia:uri>http://a.b.c/ccc</talia:uri>
        <talia:authors>
            <talia:author>
                <talia:lastname>Williams</talia:lastname>
                <talia:firstname>Andrew</talia:firstname>
            </talia:author>
            ...
        </talia:authors>
        <talia:title>...</talia:title>
        <talia:standard_title>...</talia:standard_title>
        <talia:language>it</talia:language>
        <talia:date>2006-03-03</talia:date>
        
    </talia:metadata>

    <talia:excerpt>...</talia:excerpt>
    
</talia:entry>
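The entry format above could be read into a plain dictionary roughly as follows. This is only a sketch: the function name entry_metadata, the dictionary keys, and the placeholder title text are assumptions, and elements that are missing from an entry simply come back as None.

```python
import xml.etree.ElementTree as ET

TALIA = "http://trac.talia.discovery-project.eu/wiki/Exist#"
NS = {"talia": TALIA}

# Entry modelled on the sample above; the title text is a made-up placeholder.
ENTRY = """<talia:entry xmlns:talia="http://trac.talia.discovery-project.eu/wiki/Exist#">
    <talia:metadata>
        <talia:maintype>contribution</talia:maintype>
        <talia:type>transcription</talia:type>
        <talia:subtype>hnml</talia:subtype>
        <talia:uri>http://a.b.c/ccc</talia:uri>
        <talia:authors>
            <talia:author>
                <talia:lastname>Williams</talia:lastname>
                <talia:firstname>Andrew</talia:firstname>
            </talia:author>
        </talia:authors>
        <talia:title>Example title</talia:title>
        <talia:language>it</talia:language>
        <talia:date>2006-03-03</talia:date>
    </talia:metadata>
</talia:entry>"""

SIMPLE_FIELDS = ["maintype", "type", "subtype", "uri", "title",
                 "standard_title", "language", "date"]


def entry_metadata(entry):
    """Collect the metadata of one <talia:entry> element into a dict."""
    meta = entry.find("talia:metadata", NS)
    # findtext returns None when the element is absent.
    info = {name: meta.findtext("talia:" + name, namespaces=NS)
            for name in SIMPLE_FIELDS}
    info["authors"] = [
        (a.findtext("talia:lastname", namespaces=NS),
         a.findtext("talia:firstname", namespaces=NS))
        for a in meta.findall("talia:authors/talia:author", NS)
    ]
    # The excerpt is present only for searches with full text criteria.
    info["has_excerpt"] = entry.find("talia:excerpt", NS) is not None
    return info


info = entry_metadata(ET.fromstring(ENTRY))
print(info["uri"], info["authors"], info["has_excerpt"])
```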

The <talia:excerpt> element is present only when the search contains full text criteria. In that case it contains relevant excerpts of the version text. The excerpts show the search words that were found in the text, each with a little context (a certain number of characters on each side of the word), so that the user can decide whether the hit is relevant. The words to be highlighted are marked with <exist:match> elements; the highlighting itself must be done by the search result user interface. If there are too many word hits in the text, only the first few are present in the excerpts.

(The excerpt format should perhaps be described in more detail.)
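Since the excerpt format is not specified in detail here, the following is only a sketch of how a user interface might turn <exist:match> markers into highlighted text. It assumes that the excerpt is mixed content with <exist:match> elements as direct children, and that the exist prefix is bound to the eXist namespace URI shown in the code. The sample excerpt text and the function name highlight are made up for the example.

```python
import xml.etree.ElementTree as ET

TALIA = "http://trac.talia.discovery-project.eu/wiki/Exist#"
# Assumption: the exist prefix is bound to the usual eXist namespace URI.
EXIST = "http://exist.sourceforge.net/NS/exist"

# Made-up excerpt used only for illustration.
EXCERPT = (
    '<talia:excerpt xmlns:talia="%s" xmlns:exist="%s">'
    '... some text with a <exist:match>matched</exist:match> word in it ...'
    '</talia:excerpt>' % (TALIA, EXIST)
)


def highlight(excerpt, before="[", after="]"):
    """Flatten an excerpt to text, wrapping each direct <exist:match> child."""
    parts = [excerpt.text or ""]
    for child in excerpt:
        inner = "".join(child.itertext())
        if child.tag == "{%s}match" % EXIST:
            parts.append(before + inner + after)
        else:
            # Other child elements are flattened without highlighting.
            parts.append(inner)
        parts.append(child.tail or "")
    return "".join(parts)


highlighted = highlight(ET.fromstring(EXCERPT))
print(highlighted)
# ... some text with a [matched] word in it ...
```

A real interface would pass HTML markup instead of the square brackets, and would need to escape the surrounding text first.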

Finally, the root element <talia:result> has attributes that give the size of the result and, if the result has been divided into "pages" ("chunks"), information about the page returned. The attributes are:

total
limit
page
first
last

total tells the number of entries found (the number of found contributions or versions). If the search request asked for the result to be divided into pages, the total value might be larger than the number of entries returned.

limit is the page size, if the search request asked for the result to be divided into pages. Otherwise it is the special value all.

page is the number of the page returned, if the search request asked for the result to be divided into pages. Otherwise it is 1.

first is the number of the first entry in the page returned, if the search request asked for the result to be divided into pages. Otherwise it is 1.

last is the number of the last entry in the page returned, if the search request asked for the result to be divided into pages. Otherwise it is equal to the total value.

(Note: At the time of writing (2008-09-17), the first and last values might not be correct. If the request asks for the last page, which is usually smaller than the other pages, the last value is computed as if the last page were as large as the others; it does not reflect the real, smaller size of the page. Likewise, if the request asks for a non-existing page (a page beyond the end of the result), the first and last values are reported as if that page existed.)
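Given the caveat in the note above, a client may prefer to derive the entry range itself from total, limit and page rather than rely on first and last. The sketch below assumes that entries and pages are numbered from 1, as the attribute descriptions suggest. The function name page_bounds and its return convention (None for a page beyond the end) are choices made for this example, not part of the result format.

```python
def page_bounds(total, limit, page):
    """Return (first, last) for the requested page, or None if it has no entries.

    total: value of the total attribute (int)
    limit: value of the limit attribute, either a page size or the string "all"
    page:  value of the page attribute (int, counted from 1)
    """
    if total == 0:
        return None
    if limit == "all":
        # Undivided result: the single page covers every entry.
        return (1, total)
    limit = int(limit)
    first = (page - 1) * limit + 1
    if first > total:
        # The requested page lies beyond the end of the result.
        return None
    # Clamp the end of the last page to the real number of entries.
    last = min(page * limit, total)
    return (first, last)


print(page_bounds(123, 10, 13))  # (121, 123)
print(page_bounds(123, 10, 14))  # None
```

With total="123" and limit="10", page 13 is the last page and holds only three entries, which is the case where the reported last value may be too large.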