Query Designer and SPARQL Editor

Query Designer

Although advanced GUI-based tools are important for businesses today, data engineers and data experts typically prefer intelligent SQL editors, with features such as keyword, variable, table and property suggestion, auto-completion and syntax highlighting, over GUI query builders. For such users, a good text-based editor on a database they are familiar with is far more efficient than any graphical interface. The Query Designer addresses the need for a GUI to create queries over specific classes (analogous to tables) of an endpoint (analogous to a database), incorporating advanced filters and linking capabilities. Together with the SPARQL intelli-sense editor, these tools aim to cover end-user needs in exploring linked datasets and constructing powerful linked data queries in an easy way.

The interface of the Query Designer includes multiple elements. A datasource selector specifies the endpoint from which elements such as classes and properties are fetched. Note that a single query may contain elements from multiple endpoints to allow interlinking. Instant search over classes and relationships uses techniques such as API pagination and web workers to improve the overall UX (in the showcase below, 370,000+ elements are searched instantly). The ontology preview shows classes and subclasses ordered by their respective size for SPARQL 1.1 endpoints. Instances of different classes with chosen properties can be filtered, grouped or ordered by those properties. A property picker fetches the properties of the parent instance and lets users search for properties and add them to the result, while connections between different instances allow interlinking classes from one or more datasources. Other elements not presented here include property filters and advanced query options.
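To make the filter and ordering options more concrete, the following is a minimal sketch of how a selection of one class, some chosen properties, a filter and an ordering could be turned into SPARQL text. It is not the Query Designer's actual code generator: the function name `build_instance_query`, its parameters and the `?p0`, `?p1` variable naming are assumptions made for illustration, and the DBpedia URIs are only sample input.

```python
def build_instance_query(class_uri, properties, filter_expr=None,
                         order_by=None, descending=False, limit=100):
    """Build a SELECT query for instances of one class (hypothetical helper).

    properties  -- list of property URIs chosen for the result
    filter_expr -- optional SPARQL expression over ?p0, ?p1, ...
    order_by    -- optional index into `properties` to order by
    """
    # One result variable per chosen property: ?p0, ?p1, ...
    variables = [f"?p{i}" for i in range(len(properties))]
    lines = [
        f"SELECT ?instance {' '.join(variables)}".rstrip(),
        "WHERE {",
        f"  ?instance a <{class_uri}> .",
    ]
    for var, prop in zip(variables, properties):
        lines.append(f"  ?instance <{prop}> {var} .")
    if filter_expr:
        lines.append(f"  FILTER ({filter_expr})")
    lines.append("}")
    if order_by is not None:
        var = variables[order_by]
        lines.append(f"ORDER BY DESC({var})" if descending else f"ORDER BY {var}")
    lines.append(f"LIMIT {limit}")
    return "\n".join(lines)


# Sample input: cities with more than one million inhabitants, largest first.
print(build_instance_query(
    "http://dbpedia.org/ontology/City",
    ["http://dbpedia.org/ontology/populationTotal"],
    filter_expr="?p0 > 1000000",
    order_by=0,
    descending=True,
    limit=10,
))
```

The sketch does no escaping or validation of the URIs and the filter expression, which a real generator would need.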

Benefits

By adopting the approach of the SQL wizards found in popular database management systems, the Query Designer significantly lowers the learning curve of linked data querying. With simple drag-and-drop functionality, a user can perform a simple query without any previous knowledge of either linked data or SPARQL.

The Query Designer takes a drastically different approach to query construction from existing visual query builders, which prompt users to place nodes and links in order to construct a query. In that sense, existing visual query builders are better seen as a visual aid for experts constructing a query than as a query tool for users unfamiliar with linked data. The basic usage workflow of the Query Designer, on the other hand, directly targets non-expert users.

The tool also minimizes the need to browse or look up the available classes and properties of an endpoint or triple store, and it can significantly reduce the time and manual input required to write a SPARQL query. In addition, the out-of-the-box ability to seamlessly interlink multiple SPARQL endpoints promotes the interlinking power of linked data and offers a substantial advantage over working with other data models in isolated data silos. This feature is completely transparent to the user: under the hood, the Query Designer takes advantage of the SPARQL 1.1 Federated Query syntax (the SERVICE keyword) and auto-generates all the necessary SPARQL code with the appropriate optimizations.
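As an illustration of federated query generation, the sketch below wraps the triple patterns that belong to each endpoint in a SPARQL 1.1 `SERVICE` block; a variable shared between blocks (here `?city`) is what joins the two sources. The function `build_federated_query` is a hypothetical helper, the `example.org` endpoint and vocabulary are placeholders, and none of the optimizations mentioned above are attempted.

```python
def build_federated_query(patterns_by_endpoint, select_vars, limit=100):
    """Build a federated SELECT query (hypothetical helper).

    patterns_by_endpoint -- dict mapping an endpoint URL to a list of
                            triple patterns (without the trailing dot)
    select_vars          -- variables to project, e.g. ["?city", "?name"]
    """
    lines = [f"SELECT {' '.join(select_vars)}", "WHERE {"]
    for endpoint, patterns in patterns_by_endpoint.items():
        # Each endpoint gets its own SERVICE block.
        lines.append(f"  SERVICE <{endpoint}> {{")
        lines.extend(f"    {pattern} ." for pattern in patterns)
        lines.append("  }")
    lines.append("}")
    lines.append(f"LIMIT {limit}")
    return "\n".join(lines)


# Placeholder input: ?city appears in both blocks and links the endpoints.
print(build_federated_query(
    {
        "https://dbpedia.org/sparql": [
            "?city a <http://dbpedia.org/ontology/City>",
            "?city <http://www.w3.org/2000/01/rdf-schema#label> ?name",
        ],
        "http://example.org/sparql": [
            "?report <http://example.org/vocab/aboutCity> ?city",
        ],
    },
    ["?city", "?name", "?report"],
    limit=20,
))
```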

Performance considerations

Real-time, dynamic exploration of public SPARQL endpoints on the web relies on the ability of the underlying tools to interact efficiently with, and extract useful information from, datasets containing hundreds of millions of RDF triples. The straightforward approach of constructing greedy queries, executing them and waiting for the results before presenting anything to the user is not realistic for a web-based linked data query tool, due to multiple limitations.

Identifying all categories of objects and object-to-object relationships in real-world data sources cannot rely entirely on data source metadata (such as an ontology published alongside the actual data), because such metadata is often out of date or missing altogether. As a result, the queries used to identify classes and properties must be based on the actual data, while still returning all, or at least the most probably relevant, entities within a tolerable waiting time, which for information retrieval from the web is approximately 2 seconds [12].
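One way to discover classes from the data itself while respecting a waiting-time budget is to fetch them page by page and stop when the budget is spent. The sketch below assumes this paging strategy; it is not taken from the tool. `run_query` is a hypothetical callable that sends a query to the endpoint and returns a list of class URIs, and the default 2-second budget simply mirrors the figure quoted above.

```python
import time


def class_page_query(limit, offset):
    """Query for one page of the classes actually used in the data."""
    return (
        "SELECT DISTINCT ?class WHERE { ?s a ?class } "
        f"LIMIT {limit} OFFSET {offset}"
    )


def discover_classes(run_query, page_size=1000, budget_seconds=2.0,
                     clock=time.monotonic):
    """Collect classes page by page until done or the time budget is spent.

    Returns (classes, complete); `complete` is False when the budget ran
    out before the last page was reached.
    """
    started = clock()
    classes = []
    offset = 0
    while clock() - started < budget_seconds:
        page = run_query(class_page_query(page_size, offset))
        classes.extend(page)
        if len(page) < page_size:
            # A short (or empty) page means there is nothing left to fetch.
            return classes, True
        offset += page_size
    return classes, False
```

The budget is only checked between pages, so a single slow page can still exceed it; a request timeout would be needed to bound that. Note also that LIMIT/OFFSET paging without ORDER BY does not guarantee a stable order across requests, so a real implementation may need to de-duplicate the pages it receives.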

Even if rich ontology metadata allowed the instant or almost-instant retrieval of classes, properties and relationships, filtering and pagination techniques would still be required in the front-end, as even modern browsers' performance drops dramatically for documents containing more than a few thousand DOM elements. While entities are loading, the tool remains usable and allows users to work with the information already fetched from the endpoint. Additional fallbacks are implemented for endpoints with limited capabilities, such as SPARQL 1.0 endpoints (which lack the aggregates introduced in SPARQL 1.1) and endpoints that do not handle keywords like DISTINCT.
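The front-end side of this can be sketched as a function that filters the entities fetched so far and returns only one page of matches to render, so the number of DOM elements stays small no matter how many entities have been loaded. The tool itself runs in the browser; Python is used here only to show the logic, and the name `visible_slice` and its defaults are assumptions.

```python
def visible_slice(entities, term, page=0, page_size=50):
    """Return (rows_to_render, total_matches) for a search term.

    entities -- entity URIs or labels fetched so far (may still be growing)
    term     -- text typed by the user; matched case-insensitively
    """
    needle = term.casefold()
    matches = [entity for entity in entities if needle in entity.casefold()]
    start = page * page_size
    # Only this slice is handed to the renderer.
    return matches[start:start + page_size], len(matches)


# Placeholder data: 1,000 loaded entities, at most 20 rendered at a time.
loaded = [f"http://example.org/Class{i}" for i in range(1000)]
rows, total = visible_slice(loaded, "class9", page=0, page_size=20)
print(len(rows), total)
```

Because the function works on whatever has been fetched so far, it can be called again each time a new batch of entities arrives.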

Limitations

The Query Designer currently offers graphical interfaces for a subset of SPARQL 1.1 features. Features still missing and currently being implemented include HAVING and IF.

SPARQL intelli-sense editor

The SPARQL editor provides the functionality of a text-based query wizard over linked data. More specifically, it provides code style formatting and intelligent code completion that suggests SPARQL syntax, namespaces, available endpoints, classes and properties. On top of the Ace editor, a SPARQL parsing automaton was implemented to offer syntactic analysis of queries written in the language. Apart from language keywords, the online editor provides suggestions for prefixes, classes and properties based on an open Vocabulary and Metadata Repository [1], as well as on explored entities from the queried endpoint whenever possible. The editor also incorporates large parts of, and examples from, the W3C SPARQL 1.1 recommendation to further help users compose syntactically and logically correct queries.
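The suggestion step can be illustrated with a simple prefix lookup over a sorted list that merges language keywords, vocabulary terms and entities explored on the endpoint. This is a sketch of the general idea only, not the parsing automaton described above: the helper names `build_index` and `suggest` and the sample terms are assumptions, and matching is case-sensitive for brevity.

```python
import bisect


def build_index(keywords, vocabulary_terms, endpoint_terms):
    """Merge all suggestion sources into one sorted, de-duplicated list."""
    return sorted(set(keywords) | set(vocabulary_terms) | set(endpoint_terms))


def suggest(index, typed, limit=10):
    """Return up to `limit` entries of the sorted `index` starting with `typed`."""
    # Binary search finds the first candidate; matches are contiguous after it.
    start = bisect.bisect_left(index, typed)
    results = []
    for position in range(start, len(index)):
        term = index[position]
        if not term.startswith(typed) or len(results) == limit:
            break
        results.append(term)
    return results


# Sample sources; real input would come from the repository and the endpoint.
index = build_index(
    ["SELECT", "SERVICE", "FILTER"],
    ["foaf:name", "foaf:knows"],
    ["dbo:City"],
)
print(suggest(index, "SE"))
print(suggest(index, "foaf:"))
```

A context-aware editor would additionally use the parser state to decide which of the sources is relevant at the cursor position.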


Benefits

Intelligent code completion allows code authors to type faster and to get feedback and suggestions from the editor, while limiting the number of trivial mistakes such as mismatched operators and typing errors. In comparison to existing editors, the SPARQL intelli-sense editor provides smart and reliable suggestions at any level and in any expression of a SPARQL query, including prefixes, classes and properties. Auto-completion in the SPARQL intelli-sense editor uses the LinDA vocabulary and metadata repository to suggest prefixes, classes and properties, as well as the contents of the queried datasource.