Geospatial Extensions for RDF and SPARQL

The data model stRDF and the query language stSPARQL are extensions of the standards RDF and SPARQL 1.1 respectively for representing and querying geospatial data that changes over time.

stRDF and stSPARQL use the widely adopted OGC standards Well-Known Text (WKT) and Geography Markup Language (GML) to represent geospatial data. We achieve this by introducing the datatypes strdf:WKT and strdf:GML for the representation of geometries encoded in WKT and GML respectively. We also define the datatype strdf:geometry as the union of the datatypes strdf:WKT and strdf:GML.

The temporal dimension of stRDF and stSPARQL assumes a discrete time line and uses the value space of the XML Schema datatype xsd:dateTime to model time. Two kinds of time primitives are supported: time instants and time periods. Time instants are represented by literals of the datatype xsd:dateTime. Time periods are represented by literals of the new datatype strdf:period that we introduce in stRDF. Values of the datatype strdf:period can be used as the object of a triple to represent user-defined time. In addition, they can be used to represent the valid time of a temporal triple. A temporal triple (quad) is an expression of the form s p o t. where s p o. is an RDF triple and t is a time instant or a time period called the valid time of the triple. An stRDF graph is a set of triples and temporal triples. We also assume the existence of the temporal constants NOW and UC, inspired by the literature on temporal databases. NOW represents the current time and can appear as the starting or the ending point of a period. UC means “Until Changed” and is used for valid times of a triple that persist until they are explicitly terminated by an update.
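
The data model described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the namespace IRI, the lexical form of strdf:period literals (a bracketed, comma-separated closed-open pair), the example.org resources and the helper names typed_literal, period and Statement are not taken from the text above and are chosen for the example.

```python
from typing import NamedTuple, Optional

# Assumption: namespace IRI behind the strdf: prefix.
STRDF = "http://strdf.di.uoa.gr/ontology#"


def typed_literal(lexical: str, datatype: str) -> str:
    """Render a typed literal (e.g. strdf:WKT) in N-Triples style."""
    return f'"{lexical}"^^<{STRDF}{datatype}>'


def period(start: str, end: str) -> str:
    """Render a closed-open period literal (assumed lexical form).

    `start` and `end` are xsd:dateTime strings or the constants NOW / UC.
    """
    return typed_literal(f"[{start}, {end})", "period")


class Statement(NamedTuple):
    """A plain RDF triple (t is None) or a temporal triple (t is set)."""
    s: str
    p: str
    o: str
    t: Optional[str] = None  # valid time: an instant or a period literal

    def serialize(self) -> str:
        terms = [self.s, self.p, self.o] + ([self.t] if self.t else [])
        return " ".join(terms) + " ."


# An stRDF graph is a set of triples and temporal triples.
graph = {
    Statement(
        "<http://example.org/area1>",
        "<http://example.org/hasGeometry>",
        typed_literal("POLYGON((0 0, 0 1, 1 1, 1 0, 0 0))", "WKT"),
    ),
    Statement(
        "<http://example.org/area1>",
        "<http://example.org/hasLandCover>",
        "<http://example.org/Forest>",
        period("2006-08-25T11:00:00", "UC"),  # holds until changed
    ),
}

for statement in graph:
    print(statement.serialize())
```

The first statement carries a user-visible geometry as an ordinary object literal; the second is a temporal triple whose fourth component is its valid time.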

The query language stSPARQL extends SPARQL 1.1 with functions that take spatial or temporal terms as arguments and can be used in the SELECT, FILTER, and HAVING clauses of a SPARQL 1.1 query. For querying stRDF data we use functions from the “OpenGIS Simple Feature Access - Part 2: SQL Option” standard (OGC-SFA), which defines relational schemata that support the storage, retrieval, query and update of sets of simple features using SQL. stSPARQL brings the machinery of OGC-SFA to SPARQL 1.1 by defining a URI for each SQL function of the standard, so that the function can be called in SPARQL queries. Similarly, we have defined a Boolean SPARQL extension function for each topological relation of three families: the OGC-SFA relations (topological relations for simple features), the Egenhofer relations and the RCC-8 relations. In this way stSPARQL supports multiple families of topological relations that our users might be familiar with. Using these functions, stSPARQL can express spatial selections and spatial joins. Because the extension functions can also be used in the SELECT clause, new spatial literals can be generated on the fly at query time from pre-existing spatial literals. Update operations are also supported in stSPARQL, conforming to the W3C standard SPARQL 1.1 Update.
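
The difference between a spatial selection and a spatial join can be illustrated by the shape of the query text. The sketch below builds both kinds of query as strings. It assumes a namespace IRI for the strdf: prefix, a hypothetical ex:hasGeometry property, and that the name passed as `function` (for example contains) is one of the Boolean extension functions; consult the function reference for the names that are actually defined.

```python
# Assumption: namespace IRI behind the strdf: prefix.
STRDF = "http://strdf.di.uoa.gr/ontology#"


def spatial_selection(function: str, wkt: str) -> str:
    """Query for features whose geometry relates to a constant geometry."""
    return f"""PREFIX strdf: <{STRDF}>
PREFIX ex: <http://example.org/>
SELECT ?feature
WHERE {{
  ?feature ex:hasGeometry ?geo .
  FILTER(strdf:{function}("{wkt}"^^strdf:WKT, ?geo))
}}"""


def spatial_join(function: str) -> str:
    """Query for pairs of features whose geometries relate to each other."""
    return f"""PREFIX strdf: <{STRDF}>
PREFIX ex: <http://example.org/>
SELECT ?a ?b
WHERE {{
  ?a ex:hasGeometry ?geoA .
  ?b ex:hasGeometry ?geoB .
  FILTER(strdf:{function}(?geoA, ?geoB))
}}"""


print(spatial_selection("contains", "POLYGON((0 0, 0 10, 10 10, 10 0, 0 0))"))
print(spatial_join("contains"))
```

In the selection, one argument of the Boolean function is a constant spatial literal; in the join, both arguments are variables bound by different triple patterns.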

stSPARQL also supports valid time, as follows. First, temporal triple patterns are introduced as the most basic way of querying temporal triples. A temporal triple pattern is an expression of the form s p o t., where s p o. is a triple pattern and t is a time period or a variable. Second, temporal extension functions are defined in order to express temporal relations between expressions that evaluate to values of the datatypes xsd:dateTime and strdf:period. The first set of such functions consists of 13 Boolean functions that correspond to the 13 binary relations of Allen's Interval Algebra. stSPARQL offers nine further functions that are “syntactic sugar”, i.e., they encode frequently used disjunctions of these relations. stSPARQL also defines functions that relate an instant to a period, functions that construct new (closed-open) periods from existing ones, and temporal aggregates.
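
To make the 13 relations of Allen's Interval Algebra concrete, the following minimal sketch classifies a pair of closed-open periods and builds a new period from two existing ones. The names Period, allen_relation and intersection are illustrative only and are not the names of the stSPARQL extension functions.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Period:
    """Closed-open period [start, end) on a discrete time line."""
    start: datetime
    end: datetime

    def __post_init__(self):
        if not self.start < self.end:
            raise ValueError("a period needs start < end")


def allen_relation(a: Period, b: Period) -> str:
    """Return the single Allen relation that holds from a to b."""
    if a.end < b.start:
        return "before"
    if b.end < a.start:
        return "after"
    if a.end == b.start:
        return "meets"
    if b.end == a.start:
        return "met_by"
    # From here on, each period starts before the other one ends.
    if a.start == b.start:
        if a.end == b.end:
            return "equals"
        return "starts" if a.end < b.end else "started_by"
    if a.end == b.end:
        return "finishes" if a.start > b.start else "finished_by"
    if a.start < b.start:
        return "overlaps" if a.end < b.end else "contains"
    return "during" if a.end < b.end else "overlapped_by"


def intersection(a: Period, b: Period) -> Optional[Period]:
    """Construct the common closed-open period of a and b, if any."""
    start, end = max(a.start, b.start), min(a.end, b.end)
    return Period(start, end) if start < end else None


morning = Period(datetime(2012, 1, 1, 8), datetime(2012, 1, 1, 12))
midday = Period(datetime(2012, 1, 1, 11), datetime(2012, 1, 1, 14))
print(allen_relation(morning, midday))
print(intersection(morning, midday))
```

Exactly one of the 13 relations holds for any pair of periods, which is why the classifier can return a single name. A "syntactic sugar" function in the sense above would test membership of that name in a fixed set of relations.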

A complete reference of the spatial and temporal extension functions of stSPARQL is available at http://www.strabon.di.uoa.gr/stSPARQL.

stSPARQL and the recent OGC standard GeoSPARQL were developed independently at about the same time, and have arrived at very similar representational and querying constructs. Both approaches represent geometries as literals of an appropriate datatype, which may be encoded in various formats such as GML and WKT. Both approaches map spatial predicates and functions that support spatial analysis to SPARQL extension functions. GeoSPARQL goes beyond stSPARQL in that it allows binary topological relations to be used as RDF properties, anticipating their possible use by spatial reasoners (this is the topological extension and the related query rewrite extension of GeoSPARQL). In our group, such geospatial reasoning functionality has been studied in the more general context of “incomplete information in RDF”. Since stSPARQL has been defined as an extension of SPARQL 1.1, it differs from GeoSPARQL as follows. First, it offers geospatial aggregate functions and update statements, which GeoSPARQL does not consider at all. Second, GeoSPARQL imposes an RDFS ontology for the representation of features and geometries. In contrast, stRDF only requires that a specific literal datatype be used and leaves the responsibility of developing any ontology to the users. Finally, stSPARQL offers the capability to query the valid time dimension of triples as well as a wide set of temporal operations, while GeoSPARQL does not deal with time at all.

See the papers “Strabon: A Semantic Geospatial DBMS” and “Representing and Querying the valid time of triples for Linked Geospatial Data” for more details.