= Methods used by Webapi =

== getEdition ==
Metadata about the collections: author, title, details, link to the bibliographic description (link to a PDF), etc. One block for each indexed _txt package (Mondi, Marmi).

== getIntro ==
Static HTML content. Description of the site, authors, technologies, links to sponsors and to technical documents (bibliography, etc.).

== getHelp ==
Static HTML content.

== getMenu (contexts) ==
Generates the left-hand menu. It accepts a list of contexts as its parameter. Passing no context is equivalent to "all contexts". If a subset of contexts is passed, the menu is generated accordingly: the links to the contents are limited to the selected contexts only.

Returns:
- the list of all the contexts that can be selected to modify the menu itself
- links to the indexes to be offered
- a link to the search box
- in general: links to contents (indexes, sheets, static contents, anything)

Some indexes are generated automatically from information extracted from the text: the text encoding marks the different literary genres through the ana attribute, giving it values that belong to a controlled vocabulary (e.g. "genre_anectode", "genre_lyric", etc.). In exactly the same way, the ana attribute hosts descriptions of rewritings ("quotation_plagiarism", "quotation_reference", etc.), of orality phenomena ("syntexp_proverbial", "syntexp_idiomatic", etc.) and of verbal images ("verbfig_iconological", "verbfig_ecphrasis").

These indexes are generated in several steps:
- the list of terms actually used is obtained starting from a root, within the selected contexts. For example, asking for genre_* returns all the genres present in marmi. This way, if marmi contains no anecdotes, the menu will not display the link.
- from the list of terms actually present in the package, the links are built using the parameters: method=getIndex, contexts=marmi, type=genre_anectode, which opens an index of the requested type for the selected context only.

Some indexes are created in the same way, but without requesting the list of terms present, for example "desc_character" and "desc_place". The generated links have the same format: method=getIndex, contexts=marmi, type=desc_place

A different kind of index is the "byFeature" one, which requires an alphabetically ordered first level. All the name indexes (characters, places, sources, other names), the indexes of works (artworks and literary works), the keyword index and the index of iconclass references are of this kind. Depending on the index, the search has to look in different packages (txt or img) and inside different attributes. The generated links use the parameters:
- method=getIndexByFeature, feature=character, contexts=mondi%2Cmarmi (names index > characters; for places/sources/other names, change feature)
- method=getIndexByFeature, feature=iconclass, contexts=mondi%2Cmarmi (index of iconclass references)
- method=getIndexByFeature, feature=keyitem, contexts=mondi%2Cmarmi (index of key items)
- method=getIndexByFeature, feature=bibl, type=text, contexts=mondi%2Cmarmi (index of literary works)
- method=getIndexByFeature, feature=bibl, type=artwork, contexts=mondi%2Cmarmi (index of artworks)

Other links are generated statically:
- general index: method=getIndex, contexts=mondi%2Cmarmi
- image index: method=getIndex, contexts=mondi%2Cmarmi, type=images

== getIndex (type, contexts, level, resource, feature, filter) ==
- type: (optional) string that contains the type of index. It is used for every index that doesn't require an alphabetical first level
- contexts: (mandatory) comma-separated string of canonical names of contexts.
Used to limit the index to the given contexts
- level: (optional, default=1) level of the index we are requesting
- resource: (optional) base64-encoded string representing the resource under which we are building the index. If no resource is specified, the index is built on the entire context
- feature: (optional) string that contains the type of feature we are requesting. It is used for every index that requires an alphabetical first level
- filter: (optional) used in combination with the feature parameter; specifies the letter or word to use as filter

Generates links to transcription pages. There are two kinds of index: tree-shaped (clicking a chapter shows its child sections, clicking a section shows the child pages of that section, and so on) and flattened (clicking a chapter shows all the pages under that chapter, across all its sections).

The first level of every index lists the contexts that match the desired criteria (the anecdote genre, for example). This is obtained by passing a list of contexts with more than one element. A link is then created for each matching context, in the format: method=getIndex, context=$A together with all the parameters already specified in the request (type) that define the search criteria.

The following levels (except the last one) are built in the same way, passing a context, a resource identifier and a level. If no $level is passed, it is set to 1. If no $resource is passed, the whole context is searched.
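As an illustration of the parameters and defaults listed above, here is a minimal Python sketch that assembles a getIndex link. The function name `build_index_link` and the choice to join the parameters as a URL query string are assumptions made for this example; the notes only list the parameter names and show the URL-encoded comma (%2C) in the contexts value.

```python
from urllib.parse import urlencode


def build_index_link(contexts, index_type=None, level=1, resource=None):
    """Assemble the parameters of a getIndex link (hypothetical helper).

    contexts   -- list of canonical context names (mandatory)
    index_type -- value of the "type" parameter, omitted when None
    level      -- omitted when it equals the default (1)
    resource   -- omitted when None, meaning "the whole context"
    """
    if not contexts:
        raise ValueError("contexts is mandatory")
    params = {"method": "getIndex", "contexts": ",".join(contexts)}
    if index_type is not None:
        params["type"] = index_type
    if level != 1:
        params["level"] = level
    if resource is not None:
        params["resource"] = resource
    # urlencode escapes the comma between contexts as %2C
    return urlencode(params)


print(build_index_link(["mondi", "marmi"]))
print(build_index_link(["marmi"], index_type="genre_anectode"))
```

Leaving out level and resource when they carry their default meaning keeps the first-level links as short as the ones listed for the menu.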
The procedure is:
- list all the child divisions of the specified resource at the requested level (level = distance from the root)
- for each division found, let $res_id be its xml:id attribute, $level the current level and $C the context; a link is generated: method=getIndex, context=$C, level=($level+1), resource=$res_id
- if no divisions are found, switch method and look for the pages

** This way, on the first call only the context is passed and the level 1 divisions are listed; that is, given how the encoding is defined, the DIVs that split the text into parts are returned. If, for example, the DIV that encloses part 1 has xml:id="xdv0001", the link to its subdivisions will be: method=getIndex, context=$C, level=2, resource=xdv0001 Repeating the same procedure with these input data returns the level 2 divisions, children of the node with xml:id=xdv0001. Since xdv0001 is at level 1, we are looking exactly for its direct children, that is (given how the encoding is defined) the DIVs that split part 1 into chapters.

** Descending iteratively, we reach a point where a division has no more subdivisions. Given how the encoding is defined, we start by listing the parts, then their divisions into chapters, then the divisions into sections and finally the pages that are children of a section. This last step typically happens with a link like: method=getIndex, context=$C, level=4, resource=xdv1234 where xdv1234 is the xml:id of a division that encloses a section of the original text. Since there are no DIV elements:
- list the PB nodes that match the criteria (type) and are children of the node xdv1234
- for each PB node found, let $res_id be its xml:id attribute; a link is generated: method=getTranscription, context=$C, resource=$res_id (adding the value of type, if present)

Flat indexes work like this last level: we simply look for every PB node under the given resource id, at any depth.
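The descent described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the sample XML, the lowercase `div` and `pb` element names, the function names and the `flat` flag are all invented for the example, the type criteria are ignored, and the resource id is written in clear text (as in the xdv0001 example) rather than base64-encoded.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the predefined XML namespace
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Invented sample: one part, one chapter, two pages
SAMPLE = """<text>
  <div xml:id="xdv0001" type="part" n="1">
    <div xml:id="xdv0002" type="chapter" n="1">
      <pb xml:id="xpb0001" n="1r"/>
      <pb xml:id="xpb0002" n="1v"/>
    </div>
  </div>
</text>"""


def find_by_id(root, res_id):
    """Return the first element whose xml:id equals res_id, or None."""
    for element in root.iter():
        if element.get(XML_ID) == res_id:
            return element
    return None


def index_links(root, context, level=1, resource=None, flat=False):
    """Build the links for one index level (hypothetical helper)."""
    node = root if resource is None else find_by_id(root, resource)
    if node is None:
        return []
    if not flat:
        # Tree index: list the direct child divisions first
        divisions = [child for child in node if child.tag == "div"]
        if divisions:
            return [
                f"method=getIndex, context={context}, "
                f"level={level + 1}, resource={d.get(XML_ID)}"
                for d in divisions
            ]
    # No child divisions (or flat index): look for pages at any depth
    return [
        f"method=getTranscription, context={context}, "
        f"resource={pb.get(XML_ID)}"
        for pb in node.iter("pb")
    ]


root = ET.fromstring(SAMPLE)
for link in index_links(root, "marmi"):
    print(link)
```

Calling `index_links(root, "marmi", 2, "xdv0001", flat=True)` skips the division step and goes straight to the pages, which is how the flat variant differs from the tree variant in this sketch.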
Note that when a type is specified and we are looking for the PB nodes, we may need to return a PB node that is not directly under the segment identified by the conditions. For example, let's look for genre_anectode items and consider an XML segment like this: .. .. .. .. ..

The PB element sits at the top of a text page, so a page is the segment between two PBs, and the number of that page is in the upper one. The first index call returns the contexts in which at least one tag with that ana value can be found. Going down the hierarchy we reach the fourth level, where there are no more DIVs to look for, so we switch to looking for PB items. The system returns the PB elements contained in an element identified by the conditions, but it must also return the PB that opens the page on which that element begins, since that page contains a portion of text identified by the conditions.

== getIndexByFeature (contexts, level, resource, feature, filter) ==
- contexts: (mandatory) comma-separated string of canonical names of contexts. Used to limit the index to the given contexts
- level: (optional, default=1) level of the index we are requesting
- resource: (optional) base64-encoded string representing the resource under which we are building the index. If no resource is specified, the index is built on the entire context
- feature: (mandatory) string that contains the type of feature we are requesting. It is used for every index that requires an alphabetical first level
- filter: (optional) used in combination with the feature parameter; specifies the letter or word to use as filter

The first level of this kind of index needs at least a feature specified.
It can be one of:
- keyitems: looks for SEG tags with "key_item" in their ANA attribute, in the IMG type packages
- iconclass: looks for DCTL:ICONTERM tags in the IMG type packages
- bibl (type=artwork/text): looks for bibliographic citations in TXT type packages
- character, place, as_source, other_name: refer to NAME tags whose ANA attribute has the value "func_character", "func_place" or "as_source", or has no value at all (other_name), all in TXT type packages

The first two look into _img packages and therefore create links to that kind of resource.

The first level of byFeature indexes gives a list of letters. The system identifies the items with the given feature, takes their first letter, filters out duplicates and returns this list. The links look like: method=getIndexByFeature, context=$C, feature=$FE, filter=$FI where $FI is a letter (a, b, .. z) and $FE is one of the admitted features.

These links lead to the second level of these indexes: a list of words or sentences that start with that letter and satisfy the feature conditions. This second level is obtained in the same way: the system identifies the items with that feature whose names start with that letter, extracts their names and returns them. The links look like: method=getIndexByFeature, context=$C, feature=$FE, filter=$FI but this time $FI is a word or sentence, like "Academici+Fiorentini" (URL-encoded).

The third level of these indexes shows the contexts in which the system can find tags that satisfy the conditions. This is obtained by delegating the job to getIndex(), passing the feature and filter parameters. It works as previously described, just using different conditions to extract the needed data.

The keyitems and iconclass indexes are special cases. They behave like the other byFeature indexes until the context level is reached.
When the user wants to descend into that branch of the tree, the system just collects the xml:id of the image sheets encoded in the _img packages and uses them to create links in the format: method=getImageInfo, context=$C, resource=$res_id, type=$T, filter=$FI where $res_id looks like "xml://afd/marmi_img/p034pt003pg065" (base64-encoded), and type and filter are used to automatically highlight the part of the content we are looking for. Type gets a value based on the feature in use, and filter contains the filter used to create this part of the index.

== getTranscription (resource, context, type, filter) ==
- context: (mandatory) the context of the transcription we want to see; it is a single item
- resource: (optional) base64-encoded string representing part of the URI of the transcription
- type: (optional) used to automatically highlight part of the transcription when we come from special indexes (typed or featured)
- filter: (optional) used with type, to highlight a particular word and not just a type of content

Extracts a text block from the XML stream, starting from the PB tag whose xml:id equals the given resource and stopping at the next PB tag. The type and filter parameters are used to automatically highlight the word or sentence the user looked for in the indexes. If the user is coming from a particular index (say the "genre_anectode" one), type reflects this and the transcription has the SEG tags with ANA attribute = "genre_anectode" highlighted.

This method involves other operations as well:
- create a link to the facsimile of the current page: a URI to the image (if there is one) can be found in the CORRESP attribute of the PB tag with the given xml:id.
This URI typically looks like "img://afd-mondi_068r.jpg", and is used as $res_id in the link: method=getFacsimile, context=$C, resource=$res_id
- create links to the next and previous pages: the system gets the parent nodes of the given PB item and looks for siblings. Their xml:id values are used as resource ids to create links that look like: method=getTranscription, context=$C, resource=$res_id
- create a link to the related imageInfo sheet of each figure contained in the page. For each figure, the system gets its xml:id and extracts the URI of the linked data, which typically looks like "xml://afd/marmi_img/p034pt003pg065". The link contains: method=getImageInfo, context=$C, resource=$res_id
- create a link to the related quotation sheet for the quotations found in the page. For each SEG element whose ANA attribute contains "quotation_*" (for example "quotation_reference", "quotation_rewriting", ...), the system gets its xml:id and extracts the URI of the linked data, which typically looks like "xml://afd/marmi_cit/p001q001", inside a _cit package. The link contains: method=getQuotation, context=$C, resource=$res_id

== getFacsimile (resource) ==
- resource (mandatory): base64-encoded full URI of the resource (collection/context/xml:id)

Extracts the image with the given URI (resource, e.g. "img://afd-mondi_068r.jpg") from the relations database and builds a URL to retrieve it. This method adds two more kinds of link to the returned HTML portion:
- a link to the transcription, if there is one. This is obtained by looking for the first PB element with a CORRESP attribute value equal to the given resource. The xml:id attribute of the PB tag is used as resource in the link: method=getTranscription, context=$C, resource=$res_id
- links to the previous and next facsimile images. These are built using the same procedure used to build the next and previous links in a transcription page.
First we get the PB's xml:id, then we use it to initialize the algorithm that looks for siblings.

== getQuotation (resource) ==
- resource (mandatory): base64-encoded full URI of the resource (collection/context/xml:id)

Extracts a portion of XML (typically an entire DIV tag) from the _cit (citations) package. The DIV is identified by the given resource (for example "xml://afd/marmi_cit/p001q001"). When the quotation sheet has an image, the system has to retrieve that image's URL, as described before for other images. The sheet may contain a back link to the transcription we are coming from, and a link to the corresponding facsimile image.

== getImageInfo (resource) ==
- resource (mandatory): base64-encoded full URI of the resource (collection/context/xml:id)

Extracts a portion of XML (typically an entire DIV tag) from the _img (images) package. The DIV is identified by the given resource (for example "xml://afd/marmi_img/p006pt001pg058").

The sheet contains a back link to its transcription, if there is one. This can be tricky, since some imageInfo sheets don't have any: some of them come from a different context that doesn't have a full transcription (txt) package. They are used, for example, as reuses: the same image used (before or after) in other books. The sheet may contain a link to its facsimile image, found in the CORRESP attribute of the DIV element. The sheet also contains a list of reuses, that is, a list of links to other imageInfo sheets.

The system has to extract the image's real URL from the relations db and use it to display the image. Connected to the image there may be some polygons (defined by means of points), attached to key items found in the transcription segment of the image info sheet. These key items describe elements found in the image, like a person, a ship or the light coming from the sky. They are used to connect the meanings of transcription and image.
If there are polygons associated with the current image, the system builds a proper XML segment to be used as IMT (image mapper tool) input.

The system extracts the iconclass names found in the image sheet XML, encoded as DCTL:ICONTERM items. The iconclass database contains normalized names for those items.

= Additional operations =

== Titles ==
Almost every method described needs to build a string to use as the title of the resource it returns. For the index methods, the system has to build a title for each link element it shows. The procedure iterates over the XML structure: starting from the element we are displaying in the index (always a PB tag), we gather its TYPE attribute (which contains the type of the division, for example "part", "chapter", "section", ..) and its N attribute (which contains the number of that element in the context; for example TYPE="chapter" and N="2" means that this division contains the second chapter). Then we move to its parent, gathering its TYPE and N attributes. Having all of them, we can build the full path to that element from the encoded information: "Chapter 2, Section 2.1, Pag. 23" or "Part 1, Section 1.5, p. 44v" are both built from data extracted from the actual tags.

For the methods that show data from IMG packages, the title is extracted from the rend attribute of the identified container element. It contains more or less all the information found in the titles above, but in a different format, and it isn't always accurate.

For the getTranscription method, the iterative procedure described above is used, with the identified PB as the starting point.

== Resource URI ==
As of now, the value of resource may or may not be a real URI. In some cases it is a full path to the real resource, in others it isn't. Indeed, the same xml:id can be found in different contexts, so there can be situations where the system must use the context variable to work out which entity is needed.
It could be useful to refactor the code so that it uses full, real URIs all the time, in most cases just by imploding the context and resource variables.
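A minimal Python sketch of what "imploding" context and resource into one full URI might look like is shown below. The function names, the separate `collection` argument, the default `xml` scheme and the use of standard (not URL-safe) base64 are all assumptions made for the example; the notes only state that resources are base64-encoded and show URIs shaped like "xml://afd/marmi_img/p034pt003pg065".

```python
import base64


def build_resource_uri(collection, context, res_id, scheme="xml"):
    """Join collection, context and xml:id into a single URI (hypothetical)."""
    return f"{scheme}://{collection}/{context}/{res_id}"


def encode_resource(uri):
    """Base64-encode a resource URI for use as the resource parameter."""
    return base64.b64encode(uri.encode("utf-8")).decode("ascii")


def decode_resource(encoded):
    """Inverse of encode_resource."""
    return base64.b64decode(encoded).decode("utf-8")


uri = build_resource_uri("afd", "marmi_img", "p034pt003pg065")
token = encode_resource(uri)
print(uri)
print(decode_resource(token) == uri)
```

Because the context is part of the URI, two elements that share an xml:id in different contexts get distinct resource values, which is the ambiguity this section describes.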