Melinda: key extraction tool

Helps you link data by identifying keys in RDF datasets.


RDF keys extraction

The Melinda key extraction tool computes keys that identify subjects in RDF graphs. The resulting keys (sets of discriminant properties) are themselves RDF statements, so you can use them in a machine-readable way, which supports linked data.
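To make the notion of a key concrete, here is a minimal sketch in plain Python. It only illustrates the idea that a set of properties is a key when no two subjects share the same values for it. It is not the algorithm used by the tool, and the function name is_key and the ex: sample data are hypothetical.

```python
from collections import defaultdict


def is_key(triples, properties):
    """Return True if no two subjects share the same values for `properties`.

    triples: iterable of (subject, predicate, object) tuples.
    properties: the candidate key, a collection of predicates.
    """
    properties = frozenset(properties)
    # Collect, for each subject, the values it has for each candidate property.
    values = defaultdict(lambda: defaultdict(set))
    for subject, predicate, obj in triples:
        if predicate in properties:
            values[subject][predicate].add(obj)

    seen = set()
    for by_property in values.values():
        # Simplification: subjects lacking one of the properties are skipped.
        if set(by_property) != properties:
            continue
        signature = tuple(
            sorted((p, tuple(sorted(v))) for p, v in by_property.items())
        )
        if signature in seen:
            return False  # two subjects have identical values: not a key
        seen.add(signature)
    return True


# Hypothetical sample data, for illustration only.
triples = [
    ("ex:alice", "ex:name", "Alice"), ("ex:alice", "ex:city", "Paris"),
    ("ex:bob", "ex:name", "Bob"), ("ex:bob", "ex:city", "Paris"),
]
print(is_key(triples, {"ex:name"}))  # True
print(is_key(triples, {"ex:city"}))  # False
```

In this sample, ex:name distinguishes the two subjects while ex:city does not, because both subjects live in the same city.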

The Melinda key extraction tool can be used in three ways. First, you can download the tool and run it like any other command-line application to compute keys for your local datasets. Second, you can use the web-based API directly in your browser. Third, because the API is RESTful, you can use it without the interface, for instance with a command-line tool such as curl, with urllib2 in Python, or with anything else capable of HTTP requests.

There is also no format restriction: the keys can be retrieved in any RDF format you want, and the same goes for the dataset used, as long as the format is supported by the API. Supported formats are: turtle, n3, ntriples, xml, json (-ld).

Details on how to use the API appear further down this page. You can also go straight to the documentation section, or start using the API right away.

Examples

1. Browser-based REST API

Right now you are looking at the web interface of the API. It lets you easily input an RDF dataset from which you want to compute keys. There are two ways of doing so (for both, you have to go to the API / compute section):

  1. You enter your dataset's URI into the Quick input HTML form and click 'POST'
  2. You choose to input JSON manually using the Manual input tab of the HTML form. A scaffold to help you write proper JSON is available above the form.

Note that the first way uses the default key computation parameters. The second way lets you set your own computation parameters for more specific results. Information about all the parameters is available here.

You can also browse the existing key extraction instances by going to the browse section.

2. Non browser-based REST API

The API can be used with any tool or programming language that supports the HTTP protocol. Key extraction resources are at /keyextraction/. You can send POST requests there to compute keys, for instance using curl:

curl -X POST -H "Content-Type: application/json" \
-d '{"dataset": "http://lod.nal.usda.gov/nalt/10486.rdf"}' \
http://rdfpkeys.inrialpes.fr/keyextraction/
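The same request can be sketched in Python with the standard library (urllib.request in Python 3, the successor of urllib2). The function names below are hypothetical, and the sketch assumes the response body is a JSON object, which the page states is the default serialization.

```python
import json
import urllib.request

API_ROOT = "http://rdfpkeys.inrialpes.fr/keyextraction/"


def build_post_request(dataset_uri, api_root=API_ROOT):
    """Build (without sending) the POST request that starts a key extraction."""
    body = json.dumps({"dataset": dataset_uri}).encode("utf-8")
    return urllib.request.Request(
        api_root,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def start_key_extraction(dataset_uri, api_root=API_ROOT):
    """Send the request and return the decoded JSON response (assumed JSON)."""
    request = build_post_request(dataset_uri, api_root)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (performs a real HTTP request):
# instance = start_key_extraction("http://lod.nal.usda.gov/nalt/10486.rdf")
```

Keeping request construction separate from sending makes it possible to inspect the request without touching the network.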

This returns a key extraction instance whose "keys_uri" field indicates where the keys are (or will be) located. You can access your key extraction instance with a GET request on /keyextraction/{ID} :

curl http://rdfpkeys.inrialpes.fr/keyextraction/42/

In your raw POST request you can send the same data as in the manual input HTML form; see the scaffold mentioned above for more information.

You can also get the keys themselves with a GET request on /keyextraction/{ID}/keys/ :

curl http://rdfpkeys.inrialpes.fr/keyextraction/42/keys/

By default, everything is serialized as JSON.
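The two GET requests above can be sketched in Python as follows. The helper names are hypothetical, and the only response field assumed is "keys_uri", which is the one named on this page; the rest of the instance structure is not documented here, so the sketch does not rely on it.

```python
import json
import urllib.request

API_ROOT = "http://rdfpkeys.inrialpes.fr/keyextraction/"


def instance_url(instance_id, api_root=API_ROOT):
    """URL of a key extraction instance, e.g. .../keyextraction/42/."""
    return f"{api_root}{instance_id}/"


def keys_url(instance_id, api_root=API_ROOT):
    """URL of the keys of an instance, e.g. .../keyextraction/42/keys/."""
    return f"{instance_url(instance_id, api_root)}keys/"


def get_json(url):
    """Perform a GET request and decode the body as JSON (the default format)."""
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def extract_keys_uri(instance):
    """Read the "keys_uri" field of a decoded instance, or None if absent."""
    return instance.get("keys_uri")


# Usage (performs real HTTP requests):
# instance = get_json(instance_url(42))
# keys = get_json(keys_url(42))
```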

3. Command-line tool

The Java command-line tool is available for download here. Download and unzip PseudoKeysExtractor-v*.*.zip, then, in a terminal inside the folder, run:

java -jar pseudo-keys.jar -i /path/to/some/rdf/file

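or specify a Java virtual machine memory allocation parameter (recommended for huge RDF files). JVM options must come before -jar, otherwise they are passed to the application instead of the virtual machine:

java -Xmx4096m -jar pseudo-keys.jar -i path/to/some/rdf/file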

For more information on the available options, read the help by typing:

java -jar pseudo-keys.jar -h
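If you want to drive the command-line tool from a script, a minimal Python wrapper might look like the sketch below. The function names are hypothetical, only the -i option shown above is used, and the sketch assumes the tool writes its result to standard output, which this page does not confirm. The heap option is placed before -jar, since that is where the JVM reads its own options.

```python
import subprocess


def build_command(rdf_path, jar="pseudo-keys.jar", max_heap=None):
    """Build the argument list for running the extractor on one RDF file."""
    command = ["java"]
    if max_heap:
        # JVM options such as -Xmx must precede -jar to reach the JVM.
        command.append(f"-Xmx{max_heap}")
    command += ["-jar", jar, "-i", rdf_path]
    return command


def run_extractor(rdf_path, jar="pseudo-keys.jar", max_heap=None):
    """Run the tool and return what it printed (assumes output on stdout)."""
    result = subprocess.run(
        build_command(rdf_path, jar, max_heap),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout


# Usage (requires Java and the downloaded jar in the current folder):
# print(run_extractor("/path/to/some/rdf/file", max_heap="4096m"))
```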

Going further

This application is fully open source, meaning that you can copy, modify, and deploy your own version of the RDF key extraction web API, or help us improve this one. If you are interested in doing so, please go to the code section for more information. We hope you enjoy the tool!