Talia Meeting Four
Data Objects
- API for the source class to access multiple data objects
- Database schema to store the metadata on the data objects (and maybe the data itself)
- Implementing an API or strategy to integrate externally stored data objects
- We'll need the basic source data itself
- The basic class will need to have the APIs to do some operations: search (full text?), checksumming
- Design the REST interface to access the data itself
- Each data object should have a name
- Need to have a datatype for everything that is text data
- Define a way to cope with the encoding problems (data can go beyond UTF-8)
- Need to have a way to handle HTML content (with its own linked objects)
- Need to have a way to handle XML content (with its own linked objects)
- References in the data objects to anything that is local to the same source should be preserved
- Need to have a datatype for everything that is a PDF
- Need to have a datatype for everything that is an image
- Need to have a datatype for everything that is a video
- Define a separate data object to define the regions inside an image or any other data types for which this makes sense
- Define a separate data object to define the timeframes inside a video or any other data types for which this makes sense
RDF Store
- Implement deletion of triples on the storage using ActiveRDF (expand and use the redland backend). Currently it deletes a triple only if you give the complete triple definition, but sometimes you want to remove all the triples that have one specific element, e.g. all the relations for a certain author
- Configure redland or the RDF stores from Ruby/Rails; look into how to install the packages (look at how deprec is built?)
- Making a namespace for Talia
- Make the cardinality of relations work (e.g. Person.friend=afriend vs Person.friend<<afriend). N.B. ActiveRDF treats all relations as many-to-many
- How to cope with different character sets?
- Make use of a named RDF graph for each external user or use a centralized annotation service to keep things simple
- Develop a way to handle ordered sets of parts for the path sources
- Develop a way to represent people in the RDF store (people meaning the authors, users, reviewers, ...); using FOAF is hard since there are problems referring to the users
- Keep a history of the RDF triples that were added, to allow rebuilding the store
- Add a Relation that refers to microstructures within a Source (Sub-Source)
- Add support for MARC21 format metadata (MarcOnt??)
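The deletion item in the list above asks for removing every triple that matches a partial pattern. A minimal sketch of that idea in plain Ruby follows. It assumes a simple in-memory store; `TripleStore` and `delete_matching` are hypothetical names and are not part of the ActiveRDF or Redland API.

```ruby
# Hypothetical in-memory triple store; NOT the ActiveRDF or Redland API.
Triple = Struct.new(:subject, :predicate, :object)

class TripleStore
  attr_reader :triples

  def initialize
    @triples = []
  end

  def add(subject, predicate, object)
    @triples << Triple.new(subject, predicate, object)
    self
  end

  # Deletes every triple matching the pattern; nil acts as a wildcard,
  # so calling this with no arguments empties the store.
  # Returns the number of triples removed.
  def delete_matching(subject: nil, predicate: nil, object: nil)
    before = @triples.size
    @triples.reject! do |t|
      (subject.nil? || t.subject == subject) &&
        (predicate.nil? || t.predicate == predicate) &&
        (object.nil? || t.object == object)
    end
    before - @triples.size
  end
end

store = TripleStore.new
store.add("book:1", "dc:creator", "person:nietzsche")
store.add("book:2", "dc:creator", "person:nietzsche")
store.add("book:2", "dc:title", "Some title")

# Remove all relations that point at one author, whatever the subject.
removed = store.delete_matching(object: "person:nietzsche")
puts "removed #{removed}, remaining #{store.triples.size}"
```

A real backend would presumably translate the same wildcard pattern into a query against the store and delete each result, rather than filtering an array.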
Workflow
- Build a basic workflow engine/class (Maybe ActsAsStateMachine?) (Model + Implement), maybe an observer on the source class.
- Workflow hooks in the Source class
- Have triggers/events for the workflow
- Build a default workflow
- Submission procedure: HTML submissions with images etc.
- Submissions: HTML submissions need to be parsed for references etc.
- Submission: Essay as a set of sequential images
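The workflow engine with hooks and events listed above could look roughly like the following sketch. It is plain Ruby and does not use ActsAsStateMachine; the class name, the states (`draft`, `submitted`, `published`) and the events are all assumptions made up for illustration.

```ruby
# Hypothetical workflow sketch; state and event names are illustrative only.
class Workflow
  # event => { from_state => to_state }
  TRANSITIONS = {
    submit:  { draft: :submitted },
    approve: { submitted: :published },
    reject:  { submitted: :draft }
  }.freeze

  attr_reader :state

  def initialize(state = :draft)
    @state = state
    @hooks = Hash.new { |hash, key| hash[key] = [] }
  end

  # Registers a block that runs after the given event fires.
  def on(event, &block)
    @hooks[event] << block
    self
  end

  # Fires an event; raises if the event is not allowed in the current state.
  def fire(event)
    target = TRANSITIONS.fetch(event, {})[@state]
    raise ArgumentError, "cannot #{event} from #{@state}" if target.nil?

    previous = @state
    @state = target
    @hooks[event].each { |hook| hook.call(previous, target) }
    @state
  end
end

log = []
workflow = Workflow.new
workflow.on(:submit) { |from, to| log << "#{from} -> #{to}" }
workflow.fire(:submit)
workflow.fire(:approve)
puts workflow.state
puts log.inspect
```

Keeping the transition table as data makes it easy to swap in a different default workflow later; the hooks are where an observer on the Source class could attach.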
REST/Remote interface
- Handle the different character encodings
- Put character encoding into HTTP response header and also into the XML
- Complete REST interface for sources
- Make REST controller work with Talia Source class
- Make REST work with namespaces (for foaf:friend etc., e.g. foaf-friend or foaf/friend or whatever)
- Support multiple values for a predicate (-> person/friend returns many friends)
- Make sensible XML for the source itself
- Add a format for RDF
- REST index/directory pages that give a listing of elements
- "Paginated" responses for really long result lists
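Two of the items above, namespaced predicates in URLs and paginated result lists, can be sketched independently of Rails. The example below assumes the `prefix-name` URL form (e.g. `foaf-friend`); the prefix table and the helper names `predicate_uri` and `paginate` are hypothetical.

```ruby
# Hypothetical helpers; the prefix table and the "prefix-name" URL form
# are assumptions, not a decided Talia convention.
NAMESPACES = {
  "foaf" => "http://xmlns.com/foaf/0.1/",
  "dc"   => "http://purl.org/dc/elements/1.1/"
}.freeze

# Turns a URL segment such as "foaf-friend" into a full predicate URI.
def predicate_uri(segment)
  prefix, name = segment.split("-", 2)
  base = NAMESPACES[prefix]
  if base.nil? || name.nil? || name.empty?
    raise ArgumentError, "unknown namespace in #{segment.inspect}"
  end

  base + name
end

# Returns one page of results plus the numbers a client needs to ask for more.
def paginate(items, page: 1, per_page: 20)
  raise ArgumentError, "page and per_page must be positive" if page < 1 || per_page < 1

  {
    items: items.slice((page - 1) * per_page, per_page) || [],
    page: page,
    total_pages: (items.size.to_f / per_page).ceil
  }
end

puts predicate_uri("foaf-friend")
result = paginate((1..45).to_a, page: 3, per_page: 20)
puts result[:items].inspect
puts result[:total_pages]
```

Splitting on the first hyphen only keeps local names that contain hyphens intact. The `foaf/friend` form would work the same way with the router supplying prefix and name as separate path segments.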
I18N
- Integrate I18N framework (e.g. Globalize)
- Check performance issues?
- Integrate hooks for "Translator interface"
Auth
- OpenID integration
- Make a local store for user information and roles
- Database schema for local store
- List the permissions that we need
- Research authorization at model/controller level (Maybe ModelSecurity?)
- Dependency on the OpenID plugin in Talia
- Implement User Roles/Permission (For remote hosts: Accept/Reject/Ask, with Ask as default)
- Implement policy management for remote nodes
- Implement "Limbo" predicates (triples) that are in a state where they are not yet approved (e.g. relations set by a foreign host)
- Implement connection of User account <-> "Person Source"
- Make it possible to store all "Person Sources" on a central "directory node" - This should be a configurable option in the "normal" nodes, and all operations using "Persons" should look up the people on that central node -> Needs to be modeled and implemented
- Set up OpenID server (at first, with data from HyperNietzsche?)
- Make user information available to model objects (?) - (The session/user information will have to be available if we want to do authorization at model level)
- Authorization in the REST interface
- Implement authentication of Nodes (possibly HTTP Auth) - If a node logs on for the first time, it presents some form of authentication, and is automatically added to an internal list. The local administrator can then allow operations for this node.
- Implement OpenID for REST
- Research SSL connections
- Provenance information for triples in the store
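The Accept/Reject/Ask policy for remote hosts and the "Limbo" triples from the list above fit together naturally: anything from a host whose policy is Ask waits for approval. A minimal sketch, with hypothetical class and method names and plain arrays standing in for the store and the limbo queue:

```ruby
# Hypothetical policy table for remote nodes; names are illustrative only.
class NodePolicy
  DECISIONS = %i[accept reject ask].freeze

  def initialize
    # Unknown nodes fall back to :ask, the default named in the list above.
    @decisions = Hash.new(:ask)
  end

  def set(node, decision)
    unless DECISIONS.include?(decision)
      raise ArgumentError, "unknown decision #{decision.inspect}"
    end

    @decisions[node] = decision
  end

  def decision_for(node)
    @decisions[node]
  end
end

# Sorts an incoming triple into the store, the limbo queue, or nowhere,
# and returns the decision that was applied.
def receive(policy, node, triple, store:, limbo:)
  decision = policy.decision_for(node)
  case decision
  when :accept then store << triple
  when :ask    then limbo << triple
  end
  decision
end

policy = NodePolicy.new
policy.set("node-a.example.org", :accept)
policy.set("node-b.example.org", :reject)

store = []
limbo = []
triple = ["person:1", "foaf:knows", "person:2"]
%w[node-a.example.org node-b.example.org node-c.example.org].each do |node|
  receive(policy, node, triple, store: store, limbo: limbo)
end
puts "stored #{store.size}, limbo #{limbo.size}"
```

Here `node-c.example.org` has no explicit policy, so its triple lands in limbo until a local administrator decides.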
Testing
- Set up DB+RDF store with real data
- Set up test system with a real large data set (approx. 100,000 Sources, 5 million triples?)
- Performance Testing
- Research caching
Migration
- Request documentation/help from Hyper 4 - Documented XML format? Session with Danilo going through the database (!) (List the things we need from the Hyper0 group)
- Import the sample data provided for the partners
- Build migration module for Hyper
- Build migration module for {TEI/whatever the partners use}
Integrity
- Use Checksums from the data class
- Compute Checksum for RDF metadata of the Source - Think about character sets!
- Compute Checksum for Source itself (Data, Metadata, URL) and sign
- Implement key handling for signing
- Presentation for Rails to Italy (collaborative?)
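For the metadata checksum item above, the character-set warning matters because the same triples must hash identically whatever encoding or order they arrive in. The sketch below assumes SHA-256 (the notes do not name an algorithm) and a hypothetical helper `metadata_checksum`; it converts each term to UTF-8 and sorts the triples before hashing.

```ruby
require "digest/sha2"

# Hypothetical helper: an order-independent checksum over a set of triples.
# Each term is converted to UTF-8 first so that the same metadata hashes the
# same way regardless of the encoding it arrived in.
# Limitation: terms containing tabs or newlines would need escaping.
def metadata_checksum(triples)
  lines = triples.map do |triple|
    triple.map { |term| term.encode("UTF-8") }.join("\t")
  end
  Digest::SHA256.hexdigest(lines.sort.join("\n"))
end

a = [["source:1", "dc:title", "Also sprach Zarathustra"],
     ["source:1", "dc:creator", "Friedrich Nietzsche"]]
b = a.reverse

utf8   = "G\u00F6tzen-D\u00E4mmerung"
latin1 = utf8.encode("ISO-8859-1")

# Same triples in a different order give the same checksum.
puts metadata_checksum(a) == metadata_checksum(b)
# Same title in a different encoding gives the same checksum.
puts metadata_checksum([["source:2", "dc:title", latin1]]) ==
     metadata_checksum([["source:2", "dc:title", utf8]])
```

Signing would then operate on this digest together with the data checksum and the URL, which is outside the scope of the sketch.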
Notification
- Integrate Email (ActionMailer?)
- Implement notification framework (for use with different methods)
- Notification by email
- Notification by RSS/Feeds
Internal
- Research if transaction with RDF/DB will work
- Make a possibility to "observe" Source objects, e.g. for sending email on changes
- Find which Dublin Core elements should be supported by the Talia core system (includes a licensing attribute)
- Add licensing data
- Implement Keywords as WordNet URIs (to have access to vocabulary information)
- Make Microstructure descriptions for documents - Addressing microstructures (as a special Source or in a special format)
- Parts of images
- Parts of ASCII
- Parts of PDF
- Parts of XML/HTML (XPointer)
- Parts of videos
- Automated RDoc generation on the server (Also for ActiveRDF)
- Automatic "build" of the whole system
- Nice display of test results on CruiseControl
- RCover on CruiseControl?
- Rails to Italy CC
- Rails to Italy KN Speak (Check podcast)
Data Export
- Export RDF information
- Export data for Juxta
- Export modules for "snippets" (or microformats)
- vCard
- BibTeX
- citation formats
- embedded RDF
Search
- Talk with Oysten about search engines
- Talk with users about search requirements
- Search with boolean operators
- Search on multiple nodes
- Implement Metadata/Keyword search
- Implement Search API - OAI-PMH - Library catalogue interface
User interface
- Widget for looking up Sources (on remote Nodes)
- Administration panel for roles, user preferences, permissions
- Faceted search interface
- Translator panel
- UI for "in-place-translations"
- Make Site "Search-Engine friendly"
- Make a logo
- Make a map like http://www.musicovery.com/
Usability & C
- ask the users which character sets are *really* used
- build a way to add annotations to the store
- Hallway usability testing / Both local and on groupware (with drawings)
- Invite users (one from each group) for usability tests (even on mockups)
- Meet "customers" - meetings with people (L&L to meetings)
- Developer Blog for the users?