[[PageOutline]]

If you want to know how to import the data for the Discovery Sites, see DiscoverySiteImport.

== Talia Data Import ==

Talia imports data from XML files. There are several ways to get your data into the system: you can use Talia's simple native XML format, or an existing reader for one of the other supported formats. You can also write your own XML reader to handle other XML import formats if you need to.

== Basic Import Tutorial ==

You can easily import data in the Talia format. To try it out, first [TaliaInstallation install and configure] Talia. Then clone the demo data from GitHub:

{{{
#!sh
git clone git://github.com/net7/talia_demo_data.git
}}}

If you don't have git, you can check the [http://github.com/net7/talia_demo_data/downloads downloads page] and download a zip or tarball of the demo data.

Once you have the demo data, you can import it from the command line:

{{{
#!sh
rake talia_core:xml_import xml=talia_demo_data/lucca/demo.xml
}}}

You should see a progress bar. Once the import is finished, the data should be available on your installation's home page.

== Advanced importing ==

All imports from the command line are done using the 'talia_core:xml_import' task. This task accepts a number of options, which are listed in the [http://net7.github.com/talia_core/classes/TaliaUtil/ImportJobHelper.html rdoc documentation].

=== Using importers for other formats ===

If you have XML data in other formats, you will need an 'importer' class for that format. Importers are Ruby classes that describe how the XML data structure should be converted to Talia data.

To use an existing importer class, you can pass the importer on the command line.
Talia will load that class and use it to import the data from the file that you've given:

{{{
#!sh
rake talia_core:xml_import xml=talia_demo_data/lucca/my_custom_data.xml importer=MyImporterClass
}}}

=== Import sources ===

The import accepts both file names and web URLs for the data.

=== Handling of data files ===

The import data may contain references to data files. These may be file names, URLs or paths. If paths are found, they are interpreted as web paths on the server that the import XML came from, or as file names if the import file came from the file system.

Talia will try to automatically detect the MIME type of a data file, using either the file extension or the MIME type supplied by the web server. Depending on the MIME type, Talia may use different kinds of data records or even a custom import routine. For example, if Talia is configured to use the IIP server, image files will automatically be converted to pyramid files for IIP. If not, they will just be imported as plain 'ImageData' records.

If you need to configure the mapping between the MIME types and the import classes/actions, this can be done in a Rails initializer using the [http://net7.github.com/talia_core/classes/TaliaCore/DataTypes/MimeMapping.html MimeMapping class].

=== RDF Importers ===

Talia now provides a mechanism to import RDF data using [http://rdf.rubyforge.org/ RDF.rb]. At the moment you can use `TaliaCore::ActiveSourceParts::Rdf::NtriplesReader` and `TaliaCore::ActiveSourceParts::Rdf::RdfxmlReader` to import the ntriples or rdf/xml format, respectively.

You need to install at least the RDF.rb gem. On its own, it will ''only'' give you ntriples support, but it is the base for everything else:

{{{
#!sh
gem install rdf
}}}

'''Note/Dependency:''' The rdf/xml reader uses the [http://librdf.org/raptor/ raptor] library to parse the format. This ''will'' work fine with JRuby, but you need to have the libraptor library and the rdf-raptor gem installed.
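
To try the ntriples reader without installing raptor, a tiny test file is enough. The following is only a sketch: the file name `sample.nt` and the `example.org` URIs are made up for illustration, and the closing comment ''assumes'' that the RDF readers are selected with the same `importer` option as other importer classes, which this page does not confirm.

```sh
#!/bin/sh
# Write a minimal N-Triples file: one statement per line, each made of
# a subject, a predicate, an object and a final dot.
# The file name and all example.org URIs are hypothetical.
cat > sample.nt <<'EOF'
<http://example.org/books/1> <http://purl.org/dc/elements/1.1/title> "Demo book" .
<http://example.org/books/1> <http://purl.org/dc/elements/1.1/creator> <http://example.org/people/1> .
EOF

# Count the lines that end in " ." as a quick sanity check.
echo "statements: $(grep -c ' \.$' sample.nt)"

# ASSUMPTION (not confirmed by this page): the reader class is passed
# through the same importer option as any other importer class, e.g.
#   rake talia_core:xml_import xml=sample.nt importer=TaliaCore::ActiveSourceParts::Rdf::NtriplesReader
```

If the import does not accept the reader class this way, check the rdoc documentation of the import task for the supported options.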
==== Libraptor from source ====

If you are on MacOS, or if your Linux/Unix distribution doesn't come with an installable libraptor (or redland, or raptor) package, you can [http://librdf.org/raptor/ download the source] and then do the usual

{{{
#!sh
./configure
make
make install
}}}

'''MacOS Notes'''
 * The libraptor version from !MacPorts will ''not'' work out of the box unless you explicitly tell the system to load libraries from `/opt/local/libs`. It is better to install the library to the default location using the procedure above.
 * JRuby must be able to open the library. The fallback solution of using the command line tool will not work with the current rdf-raptor gem; this may be fixed in the future.

==== Libraptor from your distribution ====

Many Linux distributions come with an installable libraptor package; just use that one. The package might also be called "raptor", or be included in a "redland" package.

==== Raptor gem ====

Just do

{{{
#!sh
gem install rdf-raptor
}}}

after you have installed the libraptor library.
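
After installing, a quick check can save some guessing. The sketch below is not part of Talia, and the helper names `check_cmd` and `check_gem` are made up for this example. It reports whether raptor's `rapper` command line tool is on the path and whether the two gems are installed. Keep in mind that finding `rapper` only suggests that raptor is installed; it does not prove that JRuby can open the library itself.

```sh
#!/bin/sh
# Hypothetical helper: report whether a command can be found on the PATH.
check_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
  fi
}

# Hypothetical helper: report whether a gem with exactly this name is
# installed. "gem list -i" takes a regular expression, so the name is
# anchored. A missing gem command is also reported as "missing".
check_gem() {
  if gem list -i "^$1\$" >/dev/null 2>&1; then
    echo "gem $1: installed"
  else
    echo "gem $1: missing"
  fi
}

check_cmd rapper      # command line tool that is built along with raptor
check_gem rdf         # base gem, ntriples support only
check_gem rdf-raptor  # needed for the rdf/xml reader
```

Run the check with the same Ruby installation that runs Talia, since each installation keeps its own set of gems.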