Search requests
Search requests are executed with the Search servlet.
If the site is www.mycommunity.org, the port for the web application server is 8080, and the servlets reside in a web application named myapplication, the URL for a search is the following:
http://www.mycommunity.org:8080/myapplication/Search
The request is a POST request.
The search servlet can do three different kinds of search - "normal" search, macrocontribution search and media search - described below.
If the search request is successful, an XML-formatted result and an HTTP status code of 200 are returned. (A search that doesn't find anything because the search criteria don't match any data is also regarded as successful.)
If something goes wrong, and the search servlet believes this is the fault of the request(er), an HTTP status code of 400 is returned, together with an error message.
If something goes wrong, and the search servlet believes that the fault lies with the server or service (e.g., the servlet itself), an HTTP status code of 500 is returned, together with an error message.
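The request and the three status codes described above can be sketched as follows. This is a minimal sketch, not the servlet's own client: the form-encoded body, the UTF-8 decoding and the helper names (build_search_request, classify_status, run_search) are assumptions made for illustration.

```python
import urllib.error
import urllib.parse
import urllib.request

# URL taken from the example above; replace host, port and application as needed.
SEARCH_URL = "http://www.mycommunity.org:8080/myapplication/Search"


def build_search_request(params, url=SEARCH_URL):
    """Build a POST request (form-encoded body is an assumption)."""
    body = urllib.parse.urlencode(params, doseq=True).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def classify_status(status):
    """Map the HTTP status codes described above to a short label."""
    if status == 200:
        return "success"          # includes searches with zero hits
    if status == 400:
        return "request error"    # fault of the request(er)
    if status == 500:
        return "server error"     # fault of the server or service
    return "unexpected"


def run_search(params, url=SEARCH_URL):
    """Send the search and return (label, response text)."""
    request = build_search_request(params, url)
    try:
        with urllib.request.urlopen(request) as response:
            return classify_status(response.status), response.read().decode("utf-8")
    except urllib.error.HTTPError as error:
        # urlopen raises HTTPError for 4xx/5xx; its body carries the error message.
        return classify_status(error.code), error.read().decode("utf-8")
```

A call such as run_search({"words": "pills thrills", "operator": "and"}) would return the label together with either the XML result or the error message.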
Query parameters for "normal" search
"Normal" search has not yet (2008-09-16) been implemented in Talia. In Hyper there are two kinds of "normal" search - Full Mode Normal Search and Research Studies Normal Search.
In Talia the request for a "normal" search is a POST request with the following parameters:
Search type parameter
search_type=normal
The value normal means that the search is a "normal" search and not a macrocontribution or media search. If one forgets this parameter, or doesn't want to use it, one can leave it out. The servlet will normally see from other parameters that a "normal" search is what was intended. To be more precise: Unless there are any mc* parameters present, the search will be assumed to be a "normal" search.
Full text search parameters
words
operator
(distance)
The words parameter contains zero or more words and/or phrases.
The phrases must be enclosed in double quotes. That's what makes the system understand they're phrases.
Example: The string
"green velvet" pills thrills "brain cells"
will be interpreted as the two words
pills
thrills
and the two phrases
green velvet
brain cells
The words/phrases may contain wildcards ? and *, with ? matching one character (any character), and * matching zero or more characters.
If there is more than one word/phrase they should be separated by spaces, so the system can interpret (segment) them correctly.
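The segmentation of a words value into words and phrases can be sketched as below. This is only an illustration of the rule described above; the function name split_words_and_phrases is hypothetical, and treating an unbalanced quote as part of an ordinary word is an assumption.

```python
import re

# Either a double-quoted phrase or a run of non-space characters.
_TOKEN = re.compile(r'"([^"]*)"|(\S+)')


def split_words_and_phrases(value):
    """Return (words, phrases) found in a words parameter value."""
    words, phrases = [], []
    for phrase, word in _TOKEN.findall(value):
        if word:
            words.append(word)
        elif phrase:
            # Empty phrases ("") are silently skipped.
            phrases.append(phrase)
    return words, phrases


words, phrases = split_words_and_phrases('"green velvet" pills thrills "brain cells"')
# words   -> ['pills', 'thrills']
# phrases -> ['green velvet', 'brain cells']
```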
The operator parameter contains a Boolean operator for combining the words/phrases, if there is more than one word/phrase.
Values: and, or (default).
There is also a third operator value, distance, that can be used for proximity search, i.e., to search for words/phrases within a certain distance of each other. The distance itself is specified in the distance parameter; its default value is 1. Note that the order of the search words/phrases is significant, which might not be ideal for a proximity search. There are currently no plans to implement proximity search in the user interface of Talia.
Metadata search parameters
Author parameter
author
String containing one or more names - first names and/or last names.
Example: Any of these values will find versions by Inga Gerike:
gerike
inga
inga gerike
gerike inga
This value
gerike williams
will find versions where Gerike or Williams, or both, are among the authors, possibly together with other authors. In other words, it works like a Boolean "or" search. (It was discussed whether it should instead find only versions where both Gerike and Williams are present as co-authors, like a Boolean "and" search, but "or" is what was decided. It's easy to change, though.)
In fact, each search word xyz is silently changed into *xyz* by the servlet. Therefore these values will also find versions by Inga Gerike:
geri
nga
r
There are historical reasons for this. But it should perhaps be changed.
Title parameter
title
Similar to author search. Each search word xyz is silently changed into *xyz*.
But in contrast to author search the title search is a Boolean "and" search - all words/phrases specified must be present for versions to be found.
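The difference between author matching ("or") and title matching ("and") can be sketched as follows. This is a simplified illustration handling plain words only (no phrases or extra wildcards); the function names are hypothetical, and case-insensitive matching is an assumption suggested by the lower-case examples above.

```python
def contains_word(word, text):
    """Wrapping xyz as *xyz* amounts to a substring test."""
    return word.lower() in text.lower()


def author_matches(query, authors):
    """Boolean "or": any search word found in any author name is enough."""
    return any(contains_word(word, name) for word in query.split() for name in authors)


def title_matches(query, title):
    """Boolean "and": every search word must be found in the title."""
    return all(contains_word(word, title) for word in query.split())
```

With these definitions, author_matches("gerike williams", ["Inga Gerike"]) is true, whereas title_matches("green thrills", "Green Velvet Pills") is false because "thrills" is missing from the title.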
Language parameter
language
Value: ISO two-letter language code.
Only one value is allowed. (But it is fairly easy to change the servlet to accept more than one value.)
(If the parameter is absent the search will be for all languages. Some contributions in Talia haven't got a language value, and searching without the language parameter is the only way to find them.)
Type parameter
type[]
This parameter makes it possible to search on one or more contribution types.
The parameter can occur several times.
Possible values are
essay
comment
transcription
edition
facsimile
path
(If the parameter is absent the search will be for all types.)
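A request body that repeats the type[] parameter could be assembled as in the sketch below. The form encoding and the function name build_normal_search_body are assumptions; the parameter names and values are the ones described above.

```python
from urllib.parse import urlencode


def build_normal_search_body(words, operator, language, types):
    """Encode a "normal" search; type[] is emitted once per requested type."""
    params = [
        ("search_type", "normal"),
        ("words", words),
        ("operator", operator),
        ("language", language),
    ]
    # A list of pairs (rather than a dict) lets the same name occur several times.
    params.extend(("type[]", value) for value in types)
    return urlencode(params)


body = build_normal_search_body('"green velvet" pills', "and", "de", ["essay", "comment"])
```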
Combining all the different parameters
In general all the parameters are combined with Boolean AND.
Ordering parameters
orderby
direction
What the result is ordered by.
Possible values for the orderby parameter: author, title, language, date, type, siglum. Default: author.
When the result is ordered by author, versions that have more than one author occur several times in the result list, once for each author.
(In Hyper the result from a full text search (or a search that combines full text search and search on metadata) is presented in a format with headers for author, title and type. The headers are clickable, and can be used to change the order. The result list after a pure metadata search is presented in a different format that has clickable headers for all six possibilities. So the result list after a pure metadata search can be ordered in more ways.)
The direction of the ordering.
Possible values for the direction parameter: ascending (default) or descending.
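The effect of ordering by author, where a version with several authors is listed once per author, can be sketched as follows. The in-memory data layout (dictionaries with an "authors" list) and the function name are hypothetical; the servlet's real data model is not described here.

```python
def order_by_author(versions, direction="ascending"):
    """Return (author, version) rows, one row per author of each version."""
    rows = [(author, version) for version in versions for author in version["authors"]]
    rows.sort(key=lambda row: row[0].lower(), reverse=(direction == "descending"))
    return rows


versions = [
    {"title": "A", "authors": ["Williams", "Gerike"]},
    {"title": "B", "authors": ["Mayer"]},
]
rows = order_by_author(versions)
# Version "A" appears twice: once under Gerike and once under Williams.
```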
Parameters for retrieving long result lists in "pages" or "chunks"
limit
How many hits to present at a time, also known as the "page" or "chunk" size.
Value: A positive integer or all (default).
page
When the result is presented in "pages"/"chunks", which "page"/"chunk" to show. Numbered from 1 and upwards. Default: 1.
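How limit and page select a portion of the result list can be sketched like this. It is a minimal illustration; the function name is hypothetical, and ignoring page when limit is all is an assumption.

```python
def page_slice(hits, limit="all", page=1):
    """Return the hits belonging to the requested page (pages start at 1)."""
    if limit == "all":
        return list(hits)  # assumption: page is ignored when everything is returned
    size = int(limit)
    if size <= 0 or page < 1:
        raise ValueError("limit must be a positive integer and page must be >= 1")
    start = (page - 1) * size
    return list(hits[start:start + size])


hits = list(range(1, 26))  # 25 hypothetical hits
last_page = page_slice(hits, limit=10, page=3)
# last_page -> [21, 22, 23, 24, 25]
```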
Query parameters for macrocontribution search
In Talia macrocontribution search is used for critical editions. Presumably it will also be used for other, future kinds of macrocontribution.
The request for a macrocontribution search (e.g., a critical edition search) is a POST request with the following parameters:
Search type parameter
search_type=mc
The value mc means that the search is a macrocontribution search and not a "normal" or media search. If one forgets this parameter, or doesn't want to use it, one can leave it out. The servlet will normally see from other parameters that a macrocontribution search is what was intended. To be more precise: As long as at least one of the mc* parameters is present, the search will be assumed to be a macrocontribution search.
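The rule for deciding between a "normal" and a macrocontribution search when search_type is left out can be sketched as below. The function name is hypothetical, and letting an explicit search_type take precedence is an assumption. Media search is not inferred in this sketch.

```python
# The mc* parameters described in this section.
MC_PARAMS = ("mc", "mc_single", "mc_from[]", "mc_to[]")


def infer_search_type(params):
    """Return the search type for a dictionary of request parameters."""
    explicit = params.get("search_type")
    if explicit:
        return explicit  # assumption: an explicit value always wins
    if any(name in params for name in MC_PARAMS):
        return "mc"
    return "normal"
```

For example, infer_search_type({"words": "pills"}) gives "normal", while adding an mc_single parameter gives "mc".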
Full text search parameters
words
operator
(distance)
These are full text search parameters. They work as in the "normal" search.
Macrocontribution search parameters
mc
mc_single
mc_from[]
mc_to[]
These are parameters to search in the structure of the macrocontribution.
mc
The first parameter mc is used for a search on the whole macrocontribution, i.e., for all contributions belonging to the macrocontribution.
The value is the URI of the macrocontribution.
There can only be a single value.
The mc parameter should not be combined with the other mc* parameters. If it is, the servlet will assume it is unnecessary, and disregard it.
mc_single
The second parameter mc_single is used to search for the contributions belonging to a particular material in the structure, e.g., a certain book, chapter, page, paragraph or zone. The search value is the URI of that material in the macrocontribution.
In Talia critical edition advanced search this parameter is used for searching for a single value from the left menu. There, the parameter will never be combined with the other mc* parameters.
In general, however, the servlet accepts the mc_single parameter being combined with the mc_from[]/mc_to[] parameters. They will be combined with a Boolean "or".
mc_from[] and mc_to[]
The last parameters mc_from[] and mc_to[] are used to search for a particular "slice" or "range" or "subsequence" within a particular material, e.g., contributions for paragraphs 22 through 33 of a certain book.
The search values are URIs. In the example above they are URIs for the paragraphs 22 and 33.
The mc_from[]/mc_to[] parameters can occur many times, so one can search for many slices simultaneously. But the parameters must occur in pairs, i.e., the two parameters must have the same number of occurrences.
The parameters have no default values. If there is a non-empty mc_from value, the corresponding mc_to value must be non-empty too, and vice versa. (The servlet does tolerate pairs where _both_ values are empty, but discards them.)
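The pairing rules for mc_from[] and mc_to[] can be sketched as a small validation step. The function name is hypothetical, and raising ValueError for an invalid request is an assumption; how the servlet actually reports such errors is not specified here.

```python
def pair_slices(mc_from, mc_to):
    """Pair up mc_from[]/mc_to[] values, dropping pairs where both are empty."""
    if len(mc_from) != len(mc_to):
        raise ValueError("mc_from[] and mc_to[] must occur the same number of times")
    pairs = []
    for start, end in zip(mc_from, mc_to):
        if not start and not end:
            continue  # tolerated: an empty pair is thrown away
        if not start or not end:
            raise ValueError("mc_from and mc_to must both be non-empty")
        pairs.append((start, end))
    return pairs
```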
Note that only "low level" slices can be searched. Here "low level" means what the user can select in the "from" and "to" listboxes in the Talia critical edition advanced search form - typically paragraphs or pages. So, typically, one cannot do a slice search on chapters.
(Note that the servlet can search for a slice that crosses book boundaries (e.g., paragraph 22 in book 3 through paragraph 11 in book 5). But the Talia critical edition advanced search interface doesn't allow this kind of search.)
(Note that URI values are not directly suitable for slice search. Therefore the search servlet will first translate the URIs to values that _are_ suitable - the so-called "search keys". These "search keys" are stored in the eXist database data, and contributions in a macrocontribution have neatly increasing values. The servlet retrieves "search keys" by doing one or more preliminary searches in the eXist database. Afterwards it will do the real search with these "search keys".)
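The two-step slice search described above (URIs translated to "search keys", then a range search on those keys) can be sketched in memory as follows. Everything here is hypothetical: the dictionary standing in for the preliminary eXist lookups, the data layout, the example URIs and the inclusive range are assumptions made for illustration only.

```python
def contributions_in_slice(contributions, search_keys, from_uri, to_uri):
    """Return contributions whose search key lies between the two URIs' keys."""
    # Step 1: translate the URIs into search keys (stands in for the preliminary searches).
    low, high = search_keys[from_uri], search_keys[to_uri]
    # Step 2: the real search is a range test on the increasing search keys.
    return [c for c in contributions if low <= c["search_key"] <= high]


# Hypothetical data: three paragraphs with increasing search keys.
search_keys = {"urn:example:p21": 210, "urn:example:p22": 220, "urn:example:p33": 330}
contributions = [
    {"id": "c1", "search_key": 210},
    {"id": "c2", "search_key": 220},
    {"id": "c3", "search_key": 330},
]
found = contributions_in_slice(contributions, search_keys, "urn:example:p22", "urn:example:p33")
```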
In Talia critical edition advanced search the mc_from[]/mc_to[] parameters will never be combined with the other mc* parameters.
In general, however, the servlet accepts the mc_from[]/mc_to[] parameters being combined with the mc_single parameter. They will be combined with a Boolean "or".
Version search parameters
preferred
version_type
version_layer
These are parameters necessary to retrieve only the relevant versions of the contributions.
For Nietzsche use
preferred=true
Presumably for Wittgenstein one will use e.g.
version_type=diplomatic
Combining all the different parameters
In general all the parameters are combined with Boolean AND.
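A macrocontribution request that combines full text search, slices and the preferred parameter might be encoded as in this sketch. The form encoding, the function name and the example URIs are assumptions; only the parameter names come from the description above.

```python
from urllib.parse import urlencode


def build_mc_search_body(words, slices, preferred=True):
    """Encode a macrocontribution search with zero or more (from, to) slices."""
    params = [("search_type", "mc"), ("words", words)]
    for from_uri, to_uri in slices:
        # mc_from[] and mc_to[] must occur in pairs.
        params.append(("mc_from[]", from_uri))
        params.append(("mc_to[]", to_uri))
    if preferred:
        params.append(("preferred", "true"))
    return urlencode(params)


mc_body = build_mc_search_body(
    "pills",
    [("http://example.org/b3/p22", "http://example.org/b3/p33"),
     ("http://example.org/b5/p1", "http://example.org/b5/p11")],
)
```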
Ordering parameters
orderby
What the result is ordered by.
Value: mc (default), i.e., macrocontribution order.
If any of the orderby values for "normal" search is used, the result might not make sense.
direction
The direction of the ordering.
Values: ascending (default) or descending (presumably never used in practice).
Query parameters for media search
The request for a media search is a POST request with the following parameters:
search_type=media
The value media means that the search is a media search and not a macrocontribution or "normal" search. If one forgets this parameter, the servlet can sometimes see from other parameters that a media search is what was intended, but not always.
Full text search parameters
Full text search can be done on the abstract and title fields.
The following parameter is used to search on abstract:
words[]
The following parameter is used to search on title:
title_words[]
These are parameters that can occur many times. The values are combined with a Boolean AND.
Each value in turn can contain many words (or phrases). These values are combined with a Boolean OR.
Here is a more thorough explanation for the words[] parameter. The title_words[] parameter works in exactly the same way and is not explained separately:
Each value in the words[] parameter contains zero or more words and/or phrases.
The phrases must be enclosed in double quotes. That's what makes the system understand they're phrases.
Example: The string
"green velvet" pills thrills "brain cells"
will be interpreted as the two words
pills
thrills
and the two phrases
green velvet
brain cells
The words/phrases may contain wildcards ? and *, with ? matching one character (any character), and * matching zero or more characters.
If there is more than one word/phrase they should be separated by spaces, so the system can interpret (segment) them correctly.
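The wildcard rule described above (? for exactly one character, * for zero or more characters) can be sketched as a translation into a regular expression. The function names are hypothetical, and case-insensitive matching against a whole word is an assumption.

```python
import re


def wildcard_to_regex(pattern):
    """Translate ? and * wildcards into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")      # exactly one character, any character
        elif ch == "*":
            parts.append(".*")     # zero or more characters
        else:
            parts.append(re.escape(ch))  # everything else is literal
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def wildcard_matches(pattern, word):
    """True if the whole word matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(word) is not None
```

For instance, wildcard_matches("th?ills", "thrills") and wildcard_matches("pil*", "pills") are both true, while wildcard_matches("p?lls", "plls") is false because ? requires one character.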
The words/phrases are combined with Boolean OR, as said before.
E.g., if there are two title_words[] values, each with two words like this
change rearrange
pills thrills
the interpretation will be
(change OR rearrange) AND (pills OR thrills)
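The AND-of-ORs interpretation above can be sketched as follows. This is a simplified illustration with plain whole words only (no phrases or wildcards); the function name is hypothetical, and skipping empty values is an assumption.

```python
def title_words_match(values, title):
    """Each value is an OR group of words; all groups must match (AND)."""
    title_words = set(title.lower().split())
    for value in values:
        terms = value.lower().split()
        if not terms:
            continue  # assumption: an empty value places no constraint
        if not any(term in title_words for term in terms):
            return False
    return True


values = ["change rearrange", "pills thrills"]
# (change OR rearrange) AND (pills OR thrills)
```

With these values, a title such as "Rearrange the pills" matches, while "Change everything" does not, since neither pills nor thrills occurs in it.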
Keyword search parameters
keyword[]
This parameter can occur many times. The values are combined with a Boolean AND.
Each value must be the exact value of a keyword, not some substring.
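The keyword rule (all requested keywords must be present, each as an exact value) can be sketched as a subset test. The function name is hypothetical, and treating "exact" as case-sensitive is an assumption.

```python
def keywords_match(requested, item_keywords):
    """True if every requested keyword is exactly one of the item's keywords."""
    return set(requested) <= set(item_keywords)
```

So keywords_match(["opera"], ["opera", "interview"]) is true, but keywords_match(["oper"], ["opera"]) is false because substrings do not count.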
Combining all the different parameters
In general all the parameters are combined with Boolean AND.
Ordering parameters
orderby
What the result is ordered by.
Possible values for the orderby parameter: author, title. Default: title. (Also other values are possible, but will not be used in practice.)
(RAI media contributions might have author elements with the whole name stored in the firstname element - starting with first name. Ordering on author will therefore order these contributions on first name.)
direction
The direction of the ordering.
Values: ascending (default) or descending (presumably never used in practice).
