SciQL and Data Vault


Scientific applications such as those dealing with satellite images are still poorly served by contemporary relational database systems. At best, the system provides a bridge to an external library through user-defined functions, explicit import/export facilities, or linked-in Java/C# interpreters. Within the TELEIOS project, we have addressed this problem by developing SciQL, an SQL-based query language for science applications with arrays as first-class citizens. It provides a seamless symbiosis of array, set, and sequence interpretations by clearly separating the mathematical object from its underlying storage representation.

The language extends SQL's value-based grouping with structural grouping, i.e., fixed-size and unbounded groups defined by explicit relationships between the index attributes of array elements. This generalizes window-based query processing.
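The difference between value-based and structural grouping can be illustrated outside the database. The following is a minimal Python sketch, not SciQL syntax and not the SciQL implementation: the 4x4 array, the helper names (`value_group_sum`, `tile_sum`, `tiled_sums`) and the choice of square tiles are all assumptions made for illustration. It covers only fixed-size groups; unbounded groups are not shown.

```python
from collections import defaultdict

# Hypothetical 4x4 array held relationally: one (x, y, value) row per cell.
N = 4
cells = [(x, y, x * N + y) for x in range(N) for y in range(N)]


def value_group_sum(rows):
    """Value-based grouping, as in SQL GROUP BY x: rows sharing a value form a group."""
    groups = defaultdict(int)
    for x, _y, v in rows:
        groups[x] += v
    return dict(groups)


def tile_sum(rows, ax, ay, size):
    """One structural group: the size-by-size tile anchored at index (ax, ay)."""
    return sum(v for x, y, v in rows
               if ax <= x < ax + size and ay <= y < ay + size)


def tiled_sums(rows, n, size, step):
    """Sum every tile whose anchor lies on a grid with the given step.

    step == size yields disjoint tiles; step == 1 yields a sliding window.
    """
    anchors = range(0, n - size + 1, step)
    return {(ax, ay): tile_sum(rows, ax, ay, size)
            for ax in anchors for ay in anchors}


print(value_group_sum(cells))                      # {0: 6, 1: 22, 2: 38, 3: 54}
print(tiled_sums(cells, N, size=2, step=2))        # {(0, 0): 10, (0, 2): 18, (2, 0): 42, (2, 2): 50}
print(len(tiled_sums(cells, N, size=2, step=1)))   # 9 overlapping 2x2 windows
```

In the value-based case, group membership depends on equal attribute values. In the structural case, it depends only on index positions relative to an anchor, which is why a sliding window appears as the special case with a step of one.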

The SciQL architecture benefits from a column-store system (MonetDB) with an adaptive storage scheme, which includes keeping multiple representations of the same data to reduce impedance mismatch.

See the paper "SciQL, A Query Language for Science Applications" and download the demo for more details.

Data Vault

The TELEIOS partner CWI has developed the data vault, a new concept that provides a true symbiosis between a DBMS and existing (remote) file-based repositories. The data vault keeps the data in its original format and place, while at the same time enabling transparent (meta)data access and analysis through a query language. Without pressure to change their file-based archives, scientists can now benefit from extended functionality and extensibility. High-level declarative query languages facilitate experimentation with novel science algorithms. Scientists can combine their familiar external analysis tools with efficient in-database processing for the complex operations at which databases traditionally excel. Transparent, just-in-time loading of data reduces the start-up cost associated with adopting a database solution for existing file repositories.
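The just-in-time loading idea can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not the MonetDB data vault implementation: the `DataVault` class, its `catalog`, `cache` and `loads` attributes, and the whitespace-separated text file format are all hypothetical. Files stay where they are, only metadata is collected up front, and a file's contents are read the first time a query touches it.

```python
import os
import tempfile


class DataVault:
    """Toy vault: catalogs files in place and loads contents only on first use."""

    def __init__(self, root):
        self.root = root
        self.cache = {}   # file name -> parsed values, filled lazily
        self.loads = 0    # how many files have actually been read
        # Metadata only: file sizes are recorded without opening any file.
        self.catalog = {name: os.path.getsize(os.path.join(root, name))
                        for name in sorted(os.listdir(root))}

    def values(self, name):
        """Return the parsed contents of a file, reading it at most once."""
        if name not in self.cache:
            with open(os.path.join(self.root, name)) as f:
                self.cache[name] = [float(tok) for tok in f.read().split()]
            self.loads += 1
        return self.cache[name]

    def mean(self, name):
        """A stand-in for an in-database analysis over one file."""
        vals = self.values(name)
        return sum(vals) / len(vals)


def demo():
    """Build a small repository in a temporary directory and query it."""
    with tempfile.TemporaryDirectory() as root:
        for name, text in [("a.txt", "1 2 3"), ("b.txt", "10 20")]:
            with open(os.path.join(root, name), "w") as f:
                f.write(text)
        vault = DataVault(root)
        print(sorted(vault.catalog), vault.loads)   # metadata known, nothing loaded
        print(vault.mean("a.txt"), vault.loads)     # first access triggers one load
        print(vault.mean("a.txt"), vault.loads)     # second access served from cache


demo()
```

Because `b.txt` is never queried, it is never read. That is the sense in which lazy loading lowers the start-up cost: no bulk import of the repository is needed before the first query.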

The data vault is developed in the context of MonetDB and its scientific array query language, SciQL.

See the paper "Data Vaults: A Symbiosis between Database Technology and Scientific File Repositories" for more details.