This page is outdated
This page is outdated. So are some other pages about feeding and searching data in eXist.
See new pages
http://trac.talia.discovery-project.eu/wiki/ExistFeedingSpecs
http://trac.talia.discovery-project.eu/wiki/ExistSearchRequest
http://trac.talia.discovery-project.eu/wiki/ExistNormalSearchResult
http://trac.talia.discovery-project.eu/wiki/ExistMacrocontributionSearchResult
http://trac.talia.discovery-project.eu/wiki/ExistMediaSearchResult
Hyper servlet parameters (see also #585)
Documentation of the query parameters for the servlet for "normal" search in Hyper (aka StandardSearch). Also contains some comments about the search forms.
Øystein Reigem, AKSIS/WAB, 2008-06-13
Preface
Be aware of the distinction between contributions and "versions". Many of the contributions in Hyper (and Talia) are stored as XML files, but shown to the user transformed to HTML. One and the same XML contribution can have several displayed/transformed HTML versions. Other contributions are stored as e.g. HTML or PDF files, and therefore have just one displayed version. But in general we distinguish between contributions and versions.
The "normal" search forms in Hyper offer full text search, or metadata search, or a combination thereof. If the user fills in full text criteria (words and/or phrases) the search is done in eXist, and the search retrieves versions. If the user fills in only metadata criteria (e.g. author) the search is done in PostgreSQL, and the search retrieves contributions. This will be changed, so that all search is done in eXist - at least in Talia. (Hyper development is closed, but it might be changed in Hyper since some development work for Talia still must be done in Hyper.) But a search with full text criteria will still retrieve versions, and a pure metadata search will retrieve contributions.
Currently the servlet retrieves versions, not contributions. Since this document covers the current state of the search servlet and its use in the current Hyper, I consistently use the word "version" below, not "contribution".
Search criteria
words
String containing one or more words and/or phrases.
The phrases must be enclosed in double quotes. That's what makes the system understand they're phrases.
Example: The string "green velvet" pills thrills "brain cells" will be interpreted as the two words pills and thrills, and the two phrases green velvet and brain cells.
The words/phrases may contain the wildcards "?" and "*".
If there is more than one word/phrase they should be separated by spaces, so the system can interpret (segment) them correctly.
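The segmentation rule above can be sketched in a few lines of Python. This is only an illustration of the rule (double-quoted runs are phrases, everything else is split on spaces); the name `split_words_parameter` is made up for this example and is not the servlet's actual code.

```python
import re

# Matches either a double-quoted phrase (captured without its quotes)
# or a run of non-space characters (a single word).
TOKEN = re.compile(r'"([^"]*)"|(\S+)')

def split_words_parameter(value):
    """Split a `words` value into (words, phrases)."""
    words, phrases = [], []
    for phrase, word in TOKEN.findall(value):
        if phrase:
            phrases.append(phrase)
        elif word:
            words.append(word)
    return words, phrases

words, phrases = split_words_parameter('"green velvet" pills thrills "brain cells"')
print(words)    # ['pills', 'thrills']
print(phrases)  # ['green velvet', 'brain cells']
```

An empty pair of quotes is ignored in this sketch, and an unbalanced quote is simply treated as part of a word; how the real servlet handles those cases is not documented here.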
operator
Boolean operator for combining the words/phrases, if there is more than one word/phrase.
Values: "and", "or" (default).
In the pages for normal search in Hyper the operator is not exposed, but is hardcoded as "or".
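As a rough illustration of how a client might pass `words` and `operator`, the following Python sketch builds a query string. The base URL is a placeholder (the real servlet address is not given on this page), and sending the parameters as a GET query string is an assumption; `build_search_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Placeholder address; the real servlet URL is not given on this page.
SERVLET_URL = "http://example.org/hyper/search"

def build_search_url(words, operator="or"):
    """Return a URL carrying the `words` and `operator` parameters."""
    if operator not in ("and", "or"):
        raise ValueError("operator must be 'and' or 'or'")
    # urlencode percent-encodes the double quotes and turns spaces into '+'.
    return SERVLET_URL + "?" + urlencode({"words": words, "operator": operator})

print(build_search_url('"green velvet" pills', operator="and"))
# http://example.org/hyper/search?words=%22green+velvet%22+pills&operator=and
```

The point of the example is mainly that the double quotes around phrases must survive URL encoding, otherwise the phrases would be segmented as separate words.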
author
String containing one or more names - first names and/or last names.
Example: any of these values will find versions by Inga Gerike: "gerike", "inga", "inga gerike", "gerike inga".
The value "gerike williams" will find versions where Gerike or Williams or both are authors, possibly together with other authors. So it behaves like a boolean 'or' search. (It was discussed whether it should instead find only versions with both Gerike and Williams present, i.e. as co-authors, like a boolean 'and' search.)
Actually each search word xyz is secretly changed into *xyz*. Therefore these values will also find versions by Inga Gerike: "geri", "nga", "r".
I'd like to change this. That's easy. (Why is the search done this way? Compatibility reasons. Currently, if the user fills in only metadata search criteria and no full text criteria (words/phrases), the search is done in PostgreSQL and not in eXist. Danilo implemented this metadata search, and it's done with an SQL LIKE. He (or Paolo?) thought LIKE was better than an exact search. So I made a LIKE-like search for eXist too. Personally I believe xyz* would be better than *xyz* (i.e. the user must key in the _beginning_ of the name). Lately Danilo has been convinced all search should be done in eXist. We assume that for Talia, too, all search should be done in eXist. Doing all search in eXist makes xyz* search possible. But all-search-in-eXist has not been implemented in Hyper yet. Of course it might never be implemented in Hyper, since development of that version has been closed. But at least I will need Hyper to develop it, since Talia isn't ready.)
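The difference between the current *xyz* rewriting and the suggested xyz* rewriting can be illustrated with Python's `fnmatch` module, whose "?" and "*" wildcards behave similarly. This is not how eXist or the servlet evaluates the patterns; the helper names are invented for the example, and case-insensitive matching is an assumption.

```python
from fnmatch import fnmatchcase

def as_substring_pattern(term):
    """Current behaviour: xyz becomes *xyz*."""
    return "*" + term + "*"

def as_prefix_pattern(term):
    """Suggested behaviour: xyz becomes xyz*."""
    return term + "*"

def name_matches(pattern, name):
    # Case-insensitive comparison is an assumption made for this sketch.
    return fnmatchcase(name.lower(), pattern.lower())

names = ("Inga", "Gerike")
for term in ("geri", "nga", "r"):
    current = any(name_matches(as_substring_pattern(term), n) for n in names)
    suggested = any(name_matches(as_prefix_pattern(term), n) for n in names)
    print(term, current, suggested)
# geri True True
# nga True False
# r True False
```

With prefix matching only "geri" still finds Inga Gerike, which is the stricter behaviour argued for above.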
title
Similar to author search. Each search word xyz is secretly changed into *xyz*.
But in contrast to author search the title search is a boolean 'and' search - all words/phrases specified must be present for versions to be found.
Note that many versions in Hyper have no title. In the result list a "standard" title is generated for them. We might store the standard title. Then it would be searchable. The problem is that the standard title is language dependent.
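The contrast between the 'or' semantics of author search and the 'and' semantics of title search can be sketched as follows. The functions are hypothetical and ignore quoted phrases; substring containment stands in for the *xyz* rewriting, and case-insensitive comparison is an assumption.

```python
def _contains(term, text):
    # Substring test standing in for the *xyz* rewriting.
    return term.lower() in text.lower()

def author_matches(query, authors):
    """'or' search: one term matching one author is enough."""
    terms = query.split()
    if not terms:
        return True  # no author criterion given
    return any(_contains(t, a) for t in terms for a in authors)

def title_matches(query, title):
    """'and' search: every term must occur in the title."""
    return all(_contains(t, title) for t in query.split())

print(author_matches("gerike williams", ["Inga Gerike"]))  # True
print(title_matches("green velvet", "Green pills"))         # False
print(title_matches("green pills", "Green pills"))          # True
```

A version without a title would be represented by an empty string here, so any non-empty title query fails to match it, which is consistent with the note above that untitled versions are not found by title.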
language
Value: ISO two-letter language code.
In Hyper the value is selected from a dropdown listbox. The box is not multi-select, so one can only select one value, and therefore only search for one language at a time. Or one can choose the default "null" selection (a blank value in the listbox) which will not cause a search on language, and therefore find versions with any language value.
I think the servlet/query could handle more than one language value, or at least easily be changed to do so.
Note that many versions in Hyper have no language value. These will currently only be found with the "null" selection.
essays, comments, transcriptions, editions, faxes, paths
These parameters make possible search on one or more types.
Value: "yes" or "no". "no" is default, but if none of the parameters are set, all types will be searched.
In Hyper each of these parameters has a checkbox. If none of the checkboxes is ticked when the Search button is pressed, the interface ticks them all before doing the search.
We might like to consider an alternative solution, where instead of these six parameters we have one single "type" parameter, containing one or more of the values "essay", "comment", "transcription", "edition", "fax", "path".
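A small Python sketch of the type selection rule, together with one way the alternative single "type" parameter could be mapped onto the existing six parameters. Both function names are hypothetical, the space-separated format of the "type" value is an assumption, and treating "all six explicitly set to no" the same as "none set" is also an assumption.

```python
TYPE_PARAMETERS = ("essays", "comments", "transcriptions",
                   "editions", "faxes", "paths")

# Values of the proposed single "type" parameter, mapped to the
# existing parameter names.
SINGULAR_TO_PARAMETER = {
    "essay": "essays", "comment": "comments",
    "transcription": "transcriptions", "edition": "editions",
    "fax": "faxes", "path": "paths",
}

def types_to_search(params):
    """Return the types selected with "yes"; all types if none is selected."""
    chosen = [name for name in TYPE_PARAMETERS if params.get(name, "no") == "yes"]
    return chosen or list(TYPE_PARAMETERS)

def from_type_parameter(value):
    """Translate e.g. "essay path" into the six-parameter form.

    Raises KeyError for a value that is not one of the six known types.
    """
    return {SINGULAR_TO_PARAMETER[v]: "yes" for v in value.split()}

print(types_to_search({"faxes": "yes", "essays": "no"}))  # ['faxes']
print(types_to_search(from_type_parameter("essay path")))  # ['essays', 'paths']
print(len(types_to_search({})))                            # 6
```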
status field
(The search forms in Hyper also contain a 'status' field - a dropdown with values "accepted", "submitted", "published" and "refused". But only published versions are stored in eXist, so only published versions are searchable. If the user selects something other than "published" in the list box, a different page should appear, but this was never implemented (Danilo's responsibility).)
Ordering
orderby
Values: "author", "title", "language", "date", "type", "siglum". Default: "author".
What the result is ordered by.
When the result is ordered by author, those versions that have more than one author occur several times in the result list, once for each author.
(In Hyper the result from a full text search (or a search that combines full text search and search on metadata) is presented in a format with headers for author, title and type. The headers are clickable, and can be used to change the order. The result list after a pure metadata search is presented in a different format that has clickable headers for all six possibilities. So the result list after a pure metadata search can be ordered in more ways.)
direction
The direction of the ordering.
Values: "ascending" (default) or "descending".
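The ordering rules, including the repetition of multi-author versions when ordering by author, could look roughly like this in Python. The dictionary layout of a version (`authors`, `title` and so on) is invented for the example, and listing a version without authors once under an empty name is an assumption.

```python
def order_versions(versions, orderby="author", direction="ascending"):
    """Return (sort key, version) rows in the requested order."""
    if orderby == "author":
        # One row per author, so multi-author versions appear several times.
        rows = [(author, v)
                for v in versions
                for author in (v["authors"] or [""])]
    else:
        rows = [(v[orderby], v) for v in versions]
    rows.sort(key=lambda row: row[0], reverse=(direction == "descending"))
    return rows

versions = [
    {"title": "B", "authors": ["Williams", "Gerike"]},
    {"title": "A", "authors": ["Hansen"]},
]
for key, version in order_versions(versions):
    print(key, version["title"])
# Gerike B
# Hansen A
# Williams B
```

Ordering the same two versions by title returns two rows, while ordering by author returns three, which is the effect described above.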
Retrieving long result lists in "chunks" or "pages"
limit
How many hits to present at a time.
Aka the "chunk" or "page" size.
Value: A positive integer or "all" (default).
page
When the result is presented in "chunks"/"pages", which "chunk"/"page" to show. Numbered from 1 and upwards. Default: 1.
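The interplay of `limit` and `page` amounts to simple slice arithmetic, sketched here in Python. The function name is hypothetical, and ignoring `page` when `limit` is "all" is an assumption.

```python
def page_of_hits(hits, limit="all", page=1):
    """Return the chunk of `hits` selected by `limit` and `page`."""
    if limit == "all":
        return list(hits)  # assumption: `page` is ignored in this case
    size = int(limit)
    if size < 1 or page < 1:
        raise ValueError("limit and page must be positive")
    start = (page - 1) * size  # pages are numbered from 1
    return list(hits[start:start + size])

hits = list(range(1, 11))           # ten hits, numbered 1..10
print(page_of_hits(hits, 3, 2))     # [4, 5, 6]
print(page_of_hits(hits, 3, 4))     # [10]
print(len(page_of_hits(hits)))      # 10
```

A page beyond the end of the result simply comes back empty in this sketch.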
