We will introduce Linked Data by working through a concrete example.

Let’s suppose you are a company and you keep your customer list in an Excel file (or a database table):

[Image: the customer list]

The file contains four columns (Customer Name, Contract, Birth-date, and Country). This list is probably stored at a specific path on a computer or on a shared network path (e.g. \\organization\contracts\customers\customerlist.xls).

Let’s now see how this list would be presented as Linked Data.

1. Set a URI for referring to the customers
Linked Data starts by defining a URI for the object you want to describe. So in our case:

http://mycompany/customer/

The actual customers (the rows in the Excel list) could be accessed as follows:

http://mycompany/customer/<id>

where <id> is the corresponding string or number used to identify the customer.

So for instance,

http://mycompany/customer/c1 is about Aracely Creasman
http://mycompany/customer/c2 is about Nettie Sermons, etc.

As simple as that!
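
As a minimal sketch, the URI scheme above could be produced by a small Python helper. The function name `customer_uri` and the constant `CUSTOMER_BASE` are illustrative assumptions; only the base URI comes from the example.

```python
from urllib.parse import quote

# Base URI taken from the example above.
CUSTOMER_BASE = "http://mycompany/customer/"


def customer_uri(customer_id):
    """Return the URI of one customer.

    The id is percent-encoded so that spaces or slashes in an id
    cannot produce a malformed or ambiguous URI.
    """
    return CUSTOMER_BASE + quote(str(customer_id), safe="")


print(customer_uri("c1"))
print(customer_uri("c2"))
```

Whatever scheme is chosen, the important property is that the same customer always gets the same URI.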

2. Make statements about each customer
Each row in our initial Excel file represents a customer. As a second step, we take the column values of each row / customer and express them as statements.

For instance for the 2nd row / customer:

Nettie Sermons was born on 2012-5-2

Nettie Sermons has a contract of 3.720.642 EUR

Nettie Sermons is from Mexico

Etc.

Now, let’s go a step further and describe the first customer in the list in the following format:

<http://mycompany/customer/c1> <http://mycompany/customer/Name> "Aracely Creasman" .
<http://mycompany/customer/c1> <http://mycompany/customer/contract> "3363587" .
<http://mycompany/customer/c1> <http://mycompany/customer/birthdate> "2012-9-27" .
<http://mycompany/customer/c1> <http://mycompany/customer/country> "Germany" .

For the second customer:

<http://mycompany/customer/c2> <http://mycompany/customer/Name> "Nettie Sermons" .
<http://mycompany/customer/c2> <http://mycompany/customer/contract> "3720642" .
<http://mycompany/customer/c2> <http://mycompany/customer/birthdate> "2012-5-2" .
<http://mycompany/customer/c2> <http://mycompany/customer/country> "Mexico" .

and so on….

So very roughly, we make statements in the following format:

(Customer ID) – (Property of customer) – (Property value) .

Or, more generally:

(subject) – (predicate) – (object)

These statements are called triples.
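
The step from spreadsheet rows to triples can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `rows` dictionary stands in for the Excel file, the helper names are made up for this sketch, and only the second customer is included.

```python
# Base URI used for both customers and properties in the example above.
BASE = "http://mycompany/customer/"

# One row of the customer list, keyed by customer id (stand-in for the Excel file).
rows = {
    "c2": {
        "Name": "Nettie Sermons",
        "contract": "3720642",
        "birthdate": "2012-5-2",
        "country": "Mexico",
    },
}


def escape_literal(value):
    """Escape backslashes and double quotes so the value is a valid quoted literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def row_to_triples(customer_id, row):
    """Turn one row into (subject) (predicate) (object) statements, one per column."""
    subject = f"<{BASE}{customer_id}>"
    return [
        f'{subject} <{BASE}{column}> "{escape_literal(value)}" .'
        for column, value in row.items()
    ]


for customer_id, row in rows.items():
    for line in row_to_triples(customer_id, row):
        print(line)
```

Each column of a row yields exactly one triple, so a row with four columns produces four statements.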

We can see that http://mycompany/customer/ is repeated many times. To avoid this, we can rewrite our statements using a prefix as follows:

@prefix customer: <http://mycompany/customer/> .
customer:c1 customer:Name "Aracely Creasman" .
customer:c1 customer:contract "3363587" .
customer:c1 customer:birthdate "2012-9-27" .
customer:c1 customer:country "Germany" .
customer:c2 customer:Name "Nettie Sermons" .
customer:c2 customer:contract "3720642" .
customer:c2 customer:birthdate "2012-5-2" .
customer:c2 customer:country "Mexico" .

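The overall model that we use to represent our data is called RDF. The listings with a full URI in every position follow the N-Triples syntax, while the prefixed version follows Turtle, a compact syntax that is a subset of N3.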
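
A prefix is essentially a string substitution: the part before the colon is replaced by the namespace URI. The following Python sketch illustrates the idea; the names `PREFIXES`, `expand` and `compact` are assumptions made for this example and do not belong to any RDF library.

```python
# Prefix table mirroring the @prefix declaration above.
PREFIXES = {"customer": "http://mycompany/customer/"}


def expand(name, prefixes=PREFIXES):
    """Expand a prefixed name such as 'customer:c1' into a full URI."""
    prefix, _, local = name.partition(":")
    return prefixes[prefix] + local


def compact(uri, prefixes=PREFIXES):
    """Shorten a full URI to a prefixed name when a known namespace matches."""
    for prefix, namespace in prefixes.items():
        if uri.startswith(namespace):
            return f"{prefix}:{uri[len(namespace):]}"
    return uri  # no matching namespace: leave the URI unchanged


print(expand("customer:c1"))
print(compact("http://mycompany/customer/Name"))
```

Both forms denote the same resource; the prefixed form is only a shorter way of writing it.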

3. Store these statements in Triple Stores and make Queries
We can now take all these statements and store them as plain text (e.g. in Notepad). Statements written with full URIs are conventionally saved with a .nt file extension (N-Triples); the prefixed form is usually saved as .ttl (Turtle) or .n3. This plain text can be imported by special databases that understand RDF and are called triple stores. Examples of triple stores are Virtuoso and Sesame. Both also have fairly flexible open licenses and are easy to get started with.

Once the statements are in the triple store, you can query them with a language called SPARQL. This is the equivalent of the SQL language of relational databases.

A SPARQL query works like pattern matching: you write a triple with variables in some positions, and the store returns every statement that fits. For more information and examples about SPARQL, you can look here.

For example, get me all statements that refer to the first customer:

PREFIX customer: <http://mycompany/customer/>
SELECT ?p ?o WHERE { customer:c1 ?p ?o }
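
To illustrate what pattern matching means here, the following Python sketch matches a triple pattern against a small in-memory list of statements. It is not how a triple store is implemented; the `match` function and the use of `None` as a variable are assumptions made for this illustration, and the pattern is applied to the second customer.

```python
# A handful of statements from the example, as (subject, predicate, object) tuples.
triples = [
    ("customer:c1", "customer:country", "Germany"),
    ("customer:c2", "customer:Name", "Nettie Sermons"),
    ("customer:c2", "customer:contract", "3720642"),
    ("customer:c2", "customer:birthdate", "2012-5-2"),
    ("customer:c2", "customer:country", "Mexico"),
]


def match(pattern, data):
    """Return every triple that fits the pattern.

    A None in the pattern plays the role of a SPARQL variable (?p, ?o):
    it matches any value in that position.
    """
    return [
        triple
        for triple in data
        if all(want is None or want == have for want, have in zip(pattern, triple))
    ]


# Counterpart of: SELECT ?p ?o WHERE { customer:c2 ?p ?o }
for _, predicate, obj in match(("customer:c2", None, None), triples):
    print(predicate, obj)
```

Fixing a different position changes the question: the pattern `(None, "customer:country", None)` asks for the country of every customer.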

Triple stores can serve their data through a SPARQL endpoint. A SPARQL endpoint is simply a point of access to the data, either through a user interface or over the HTTP protocol.
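
Over HTTP, the SPARQL protocol lets a client send the query text in a `query` parameter of a GET request. The sketch below only builds such a request URL and does not send it; the endpoint address `http://mycompany/sparql` is a hypothetical placeholder, since the actual address depends on how the triple store is installed.

```python
from urllib.parse import urlencode

# Hypothetical endpoint address; a real triple store defines its own.
ENDPOINT = "http://mycompany/sparql"

QUERY = """PREFIX customer: <http://mycompany/customer/>
SELECT ?p ?o WHERE { customer:c1 ?p ?o }"""


def build_query_url(endpoint, query):
    """Return the GET URL that carries the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


url = build_query_url(ENDPOINT, QUERY)
print(url)
```

Any HTTP client can then fetch this URL, which is what makes an endpoint usable from scripts and applications as well as from a browser.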

4. Use Vocabularies to enhance your data interoperability
In the above example, the names of the columns of the customer list (Customer Name, Contract, Birth-date, Country) were most probably chosen by the employees of the company in an arbitrary manner. However, in order to be interoperable with other data documents created within or outside the enterprise, these names should be taken from popular or standardized vocabularies. For instance, to describe a person you will most probably want to use the FOAF vocabulary (http://xmlns.com/foaf/spec/), which contains properties related to a Person. More vocabularies can be searched at http://linda.epu.ntua.gr/vocabularies/all/. By using vocabularies (FOAF, RDF and dbpedia-owl), our example is rewritten as follows:

@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dbpedia-owl: <http://dbpedia.org/ontology/> .
@prefix customer: <http://mycompany/customer/> .
customer:c1 rdf:type foaf:Person .
customer:c1 foaf:name "Aracely Creasman" .
customer:c1 dbpedia-owl:birthDate "2012-9-27" .
customer:c1 customer:contract "3363587" .
customer:c1 dbpedia-owl:country "Germany" .
customer:c2 rdf:type foaf:Person .
customer:c2 foaf:name "Nettie Sermons" .
customer:c2 dbpedia-owl:birthDate "2012-5-2" .
customer:c2 customer:contract "3720642" .
customer:c2 dbpedia-owl:country "Mexico" .

Now our customer list is more interoperable with other data documents. For instance, if many enterprises and data providers use the same property (foaf:name), then a) everybody understands that we are talking about the same thing, and b) querying across different sources becomes easy. By querying “get me all foaf:names” you can get the names of people contained in different databases. This wouldn’t be possible if someone had named this property “mycustomername”, “cname”, or something else arbitrary.
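
The benefit of a shared property can be shown with a short Python sketch. The two extra sources, their URIs and the names in them are invented purely for illustration; only `foaf:name` and the customer c2 come from the example above.

```python
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"

# Our own data, using the shared vocabulary.
our_data = [("http://mycompany/customer/c2", FOAF_NAME, "Nettie Sermons")]

# Hypothetical second source that also uses foaf:name.
partner_data = [("http://partner.example/client/42", FOAF_NAME, "Jane Doe")]

# Hypothetical third source that invented its own property name.
legacy_data = [("http://legacy.example/cust/7", "http://legacy.example/cname", "John Doe")]


def all_names(*sources):
    """Collect the object of every foaf:name statement across all sources."""
    return [
        obj
        for source in sources
        for _, predicate, obj in source
        if predicate == FOAF_NAME
    ]


print(all_names(our_data, partner_data, legacy_data))
# prints ['Nettie Sermons', 'Jane Doe']
```

The name stored under the arbitrary `cname` property is silently missed, which is exactly the interoperability problem that shared vocabularies avoid.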

5. Link to external resources – Data Interlinking
In our example, instead of pointing to a literal (“Germany”, “Mexico”), which is basically just text, we can link to another resource (e.g. a resource of the class “Country” in the World Factbook). In this way our statements would be:

customer:c1 dbpedia-owl:country <http://wifo5-04.informatik.uni-mannheim.de/factbook/resource/Germany> .
customer:c2 dbpedia-owl:country <http://wifo5-04.informatik.uni-mannheim.de/factbook/resource/Mexico> .

By doing this, we have connected the country of our customer to the country concept maintained by the World Factbook provider. As a result, we can now make queries that join these different databases, for instance “give me the GDP of our customers’ countries”.
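
The join that interlinking makes possible can be sketched in Python as follows. Everything on the Factbook side is an assumption for illustration: the GDP property URI is hypothetical and the GDP values are placeholders, not real figures. The point is only that the country URI in our data is the same URI that the external dataset uses as a subject.

```python
FACTBOOK = "http://wifo5-04.informatik.uni-mannheim.de/factbook/resource/"
COUNTRY = "http://dbpedia.org/ontology/country"
GDP = "http://factbook.example/gdp"  # hypothetical property URI

# Our data: the customer links to a country resource instead of a literal.
our_triples = [("http://mycompany/customer/c2", COUNTRY, FACTBOOK + "Mexico")]

# Stand-in for the external dataset; the values are placeholders, not real data.
factbook_triples = [
    (FACTBOOK + "Mexico", GDP, "PLACEHOLDER_GDP_MEXICO"),
    (FACTBOOK + "Greece", GDP, "PLACEHOLDER_GDP_GREECE"),
]


def customer_gdp(customers, factbook):
    """Join the two datasets on the shared country URI."""
    gdp_by_country = {s: o for s, p, o in factbook if p == GDP}
    return {
        s: gdp_by_country[o]
        for s, p, o in customers
        if p == COUNTRY and o in gdp_by_country
    }


print(customer_gdp(our_triples, factbook_triples))
```

With a plain literal such as "Mexico" there would be nothing to join on except a string that other providers may spell differently.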

6. The power of Linked Data

Linked Data is not the ultimate data model to rule them all, and nothing prevents you from mapping your data to more than one data structure in real time (e.g. relational and linked data). Depending on the actual application, other data models / structures may be more suitable and perform much better. Nevertheless, Linked Data is the recommended approach to publish and share data on the web (private or public). The power of Linked Data lies in the following:

  1. Use of URIs to access data: Your data can be immediately accessible through a persistent URL in a machine-processable manner. In terms of sharing your data, exposing it at persistent URIs (http://www.mycompany.com/branches/1) is much more efficient than keeping it in a data silo (e.g. an isolated Excel file or even a private, commercial database server).
  2. Schema-defying model: Your Linked Data does not follow a specific schema. There are no database tables. There is just a big bag of statements. At any time you can add more statements ad hoc, or let another data provider complement your data by adding more statements. For instance, you can simply give the triple store another triple, customer:c1 foaf:gender "male" ., to specify the gender of a specific customer. In a traditional database that would require a database schema change.
  3. Use of Vocabularies: Use of popular or standardized vocabularies increases your data interoperability and allows querying across many data repositories.
  4. Interlinking of datasets: The major power of Linked Data comes with the ability to link your data with external resources. By linking your data, you can enrich it ad hoc with any other valuable information on the shared web. In this sense, the Internet becomes a huge, distributed database of entities. This vision has been realized within the Linked Open Data Cloud.
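
The schema-defying point from the list above can be illustrated with a tiny Python sketch, in which a plain set of tuples stands in for the triple store (an assumption for illustration, not a real store API). Adding a statement with a property that was never used before needs no change to any table definition.

```python
# A plain set of (subject, predicate, object) tuples stands in for the triple store.
store = {
    ("customer:c1", "customer:country", "Germany"),
    ("customer:c2", "customer:country", "Mexico"),
}

# Add a statement with a brand-new property; nothing has to be declared first.
store.add(("customer:c1", "foaf:gender", "male"))

# The set of properties in use simply grows with the data.
predicates = sorted({predicate for _, predicate, _ in store})
print(predicates)
# prints ['customer:country', 'foaf:gender']
```

In a relational table the same change would mean adding a gender column for every customer, even those for whom the value is unknown.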

The LinDA project aims to help SMEs and enterprises harness this power of Linked Data and to flatten the learning curve it requires. You can find more information about the benefits of Linked Data for SMEs here.