StarBench

On Benchmarking RDF-star Triplestores with StarBench

Abstract

RDF-star has rapidly gained popularity as a way to annotate RDF statements while avoiding the disadvantages of reification. Hence, a number of triplestores supporting this new standard have become available. Yet, it is difficult to assess the performance of each of these systems and the degree to which they support RDF-star and the corresponding SPARQL-star query language. Therefore, in this paper, we propose StarBench, a benchmark for testing SPARQL-star support and runtime performance. We ran StarBench on a number of state-of-the-art triplestores with RDF-star and SPARQL-star support and share our findings. Based on these findings, we highlight existing challenges and research opportunities.

STARBENCH QUERIES

The following tables describe each original query taken from the REF benchmark and, for each query category, the variations generated from those queries together with the modification strategies used to generate them. The tables also list the essential SPARQL and SPARQL-star features of each query.

The tables are organized by query category.
To download the queries, visit our GitHub.
The expected results of each query can be found in the downloads section.
Query | Description | SPARQL Features | SPARQL-star Features
P1 | Replace the IRI predicate of the asserted triple with a variable | Count | 1 embedded graph pattern in subject position
P2 | Replace the IRI object of the asserted triple with a variable | Count | 1 embedded graph pattern in subject position
P3 | Replace the embedded graph pattern with a variable | Count | 1 non-embedded graph pattern
P4 | Replace all the elements of the query with variables | Count | 1 embedded graph pattern in subject position
P5 | Replace the object of the embedded triple with an IRI resource | Count | 1 embedded graph pattern in subject position
P6 | Replace the IRI object of the asserted triple with a variable | - | 1 embedded graph pattern in subject position
P7 | Add Distinct in the Select clause | Distinct | 1 embedded graph pattern in subject position
P8 | Add Distinct in the Count clause | Count, Distinct | 1 embedded graph pattern in subject position
P9 | Replace return by count in all variations of the A2 query | Count | 1 embedded graph pattern in subject position
P10 | Replace the IRI of the subject in the embedded graph pattern with a variable | Count | 1 embedded graph pattern in subject position
P11 | Replace the IRIs of the subject and object of the embedded graph pattern with variables | Count | 1 embedded graph pattern in subject position
P12 | Replace the IRI of the predicate in the asserted triple with a variable | Count | 1 embedded graph pattern in subject position
P13 | Replace the object of the embedded graph pattern with an IRI | Count | 1 embedded graph pattern in subject position
P14 | Replace the object of the asserted triple with an IRI | Count | 1 embedded graph pattern in object position
P15 | Replace return by count in all variations of the A3 query | Count | 1 embedded graph pattern in subject position; 1 non-embedded graph pattern
P16 | Replace return by count in all variations of the B2 query | Count | 2 embedded graph patterns in subject position
P17 | Remove one of the embedded triples | Count | 1 embedded graph pattern in subject position
P18 | Remove one of the embedded graph patterns; replace the IRI object of the asserted pattern with a variable | Count | 1 embedded graph pattern in subject position
P19 | Replace the IRI object of the asserted pattern with a variable | Count | 2 embedded graph patterns in subject position
P20 | Replace one of the embedded triples with a non-embedded triple | Count | 1 embedded graph pattern in subject position; 1 non-embedded graph pattern
P21 | Replace return by count in all variations of the B3 query | Count | 3 embedded graph patterns
P22 | Replace an embedded graph pattern in the subject position with a variable | - | 2 embedded graph patterns in subject position; 1 non-embedded graph pattern
P23 | Remove filter condition | Count | 2 embedded graph patterns in subject position

Query | Description | SPARQL Features | SPARQL-star Features
S1 | Replace return by count in all variations of the A4 query and filter on the object of the embedded pattern | Count | 1 embedded graph pattern in subject position; 1 non-embedded graph pattern
S2 | Count on the object of the embedded graph pattern | Count | 1 embedded graph pattern in subject position; 1 non-embedded graph pattern
S3 | Replace return by count in all variations of the F1 query | Count, 1 Filter | 1 embedded graph pattern in subject position
S4 | Apply filter on the object of the embedded triple | Count, 1 Filter | 1 embedded graph pattern in subject position
S5 | Replace the IRI subject of the embedded pattern with a variable | Count, 1 Filter | 1 embedded graph pattern in subject position
S6 | Apply filter on the object of the embedded triple; replace the IRI subject of the embedded pattern with a variable | Count, 1 Filter | 1 embedded graph pattern in subject position
S7 | Replace return by count in all variations of the F2 query | Count, 1 Filter | 1 embedded graph pattern in subject position; 1 non-embedded graph pattern
S8 | Apply filter on the object of the embedded triple | Count, 1 Filter | 1 embedded graph pattern in subject position; 1 non-embedded graph pattern
S9 | Apply filter on the subject of the embedded triple | Count, 1 Filter | 1 embedded graph pattern in subject position; 1 non-embedded graph pattern
S10 | Replace return by count in all variations of the F3 query | Count, 1 Filter | 2 embedded graph patterns in subject position
S11 | Replace the IRI subjects of the embedded triples with variables | Count, Filter | 2 embedded graph patterns in subject position
S12 | Replace the IRI subject of one of the embedded triples with a variable | Count, 1 Filter | 2 embedded graph patterns in subject position
S13 | Apply filter on the object of the embedded triple | Count, 1 Filter | 2 embedded graph patterns in subject position
S14 | Apply filter on the subject and object of the embedded triple; replace the IRI subjects of the embedded triples with variables | Count, 1 Filter | 2 embedded graph patterns in subject position
S15 | Replace return by count in all variations of the F4 query | Count, 2 Filters | 3 embedded graph patterns in subject position
S16 | Remove one Filter | Count, 1 Filter | 3 embedded graph patterns in subject position
S17 | Replace the IRI object of one embedded pattern with a variable | Count, 2 Filters | 3 embedded graph patterns in subject position
S18 | Remove one of the filters; replace the IRI object of one embedded pattern with a variable | Count, 1 Filter | 3 embedded graph patterns in subject position
S19 | Replace return by count in all variations of the F5 query | Count, 1 Filter | 2 embedded graph patterns in subject position
S20 | Replace the IRI subject of one embedded pattern with a variable | Count, 1 Filter | 2 embedded graph patterns in subject position
S21 | Replace the IRI subjects of the two embedded patterns with variables | Count, 1 Filter | 2 embedded graph patterns in subject position
S22 | Replace the IRI subjects of the two embedded patterns with variables; apply filter on the subjects of the embedded triples | Count, 1 Filter | 2 embedded graph patterns in subject position

Query | Description | SPARQL Features | SPARQL-star Features
C1 | Replace the subject of the nested graph pattern with a nested graph pattern | Count | 2 embedded graph patterns in subject position
C2 | Replace the object of the non-embedded triple with an embedded triple | Count | 1 embedded graph pattern in object position
C3 | Group by an element of the embedded pattern | Count, Group By | 1 embedded graph pattern in subject position; 1 non-embedded graph pattern
C4 | Remove the non-embedded graph pattern | Count, Group By | 1 embedded graph pattern in subject position
C5 | Remove the non-embedded graph pattern | Count, Group By, Distinct | 1 embedded graph pattern in subject position; 1 non-embedded graph pattern
C6 | Add Distinct in the Count clause | Count | 1 embedded graph pattern in subject position; 1 non-embedded graph pattern
C7 | Replace one of the embedded graph patterns with a non-embedded graph pattern; add a union clause between the two graph patterns | Count, Union | 1 embedded graph pattern in subject position; 1 non-embedded graph pattern
C8 | Add a union clause between the two embedded graph patterns | Count, Union | 2 embedded graph patterns in subject position
C9 | Add one embedded graph pattern in an optional clause | Count, Optional | 3 embedded graph patterns
C10 | Add one embedded graph pattern in a union clause | Count, Union | 3 embedded graph patterns
C11 | Remove one embedded graph pattern in the subject position; add one embedded graph pattern in the union clause | Count, Union | 2 embedded graph patterns

Query | Description | SPARQL Features | SPARQL-star Features
A1 | Original query | - | 1 embedded graph pattern in subject position
A2 | Original query | - | 1 embedded graph pattern in subject position
A3 | Original query | Count, Group By | 1 embedded graph pattern in subject position
A4 | Original query | Filter | 1 embedded and 1 non-embedded graph patterns
B2 | Original query | - | 2 embedded graph patterns in subject position
B3 | Original query | - | 3 embedded graph patterns
F1 | Original query | Filter | 1 embedded graph pattern in subject position
F2 | Original query | 1 Filter | 1 embedded graph pattern in subject position; 1 non-embedded graph pattern
F3 | Original query | 1 Filter | 2 embedded graph patterns in subject position
F4 | Original query | 2 Filters | 3 embedded graph patterns in subject position
F5 | Original query | 1 Filter | 2 embedded graph patterns in subject position
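
To make the feature names in the tables above concrete, the following sketch stores two illustrative SPARQL-star queries in shell variables. These are not StarBench queries: the ex: prefix, the IRIs, and the variable names query_subject and query_mixed are assumptions chosen for illustration only. The first query counts matches of one embedded graph pattern in subject position; the second combines an embedded graph pattern in subject position with one non-embedded graph pattern.

```shell
#!/usr/bin/env bash
# Illustrative only: hypothetical IRIs under http://example.org/, not StarBench data.

# Count + 1 embedded graph pattern in subject position.
query_subject=$(cat <<'EOF'
PREFIX ex: <http://example.org/>
SELECT (COUNT(*) AS ?count) WHERE {
  << ex:alice ex:knows ?person >> ex:certainty ?value .
}
EOF
)

# Count + 1 embedded graph pattern in subject position + 1 non-embedded graph pattern.
query_mixed=$(cat <<'EOF'
PREFIX ex: <http://example.org/>
SELECT (COUNT(*) AS ?count) WHERE {
  << ?person ex:worksFor ex:acme >> ex:since ?year .
  ?person ex:name ?name .
}
EOF
)

printf '%s\n\n%s\n' "$query_subject" "$query_mixed"
```

The quoted heredoc delimiter ('EOF') keeps the shell from expanding the ?variables or the angle brackets, so the query text reaches the endpoint unchanged.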

GETTING STARTED WITH STARBENCH

For a more complete guide, visit our GitHub. The following guide is based on setting up and running the benchmark over Apache Jena. Other systems will have slightly different commands to load the data and run the queries. Our GitHub repository contains scripts to run the benchmark over the four systems included in the paper (Jena, Oxigraph, Stardog, and GraphDB).

To run StarBench over your own system, you need to load the dataset into it and run the queries against it, using that system's own commands.
To run StarBench over Jena, follow the steps below.

Step 1: Download and start the Fuseki server
Download the latest version of Apache Jena and Fuseki and unpack the binaries to your server.
Use the TDB Loader to load the dataset to a TDB database as follows:
        ${jena-home}/bin/tdb2.tdbloader --loc ${db-dir} ${data}.ttls
      
Where ${jena-home} is the directory of the Jena installation, ${db-dir} is the directory of the database files, and ${data}.ttls is the dataset TTLS file.
Afterwards, start the Fuseki server using the following command:
        java -jar ${fuseki-home}/fuseki-server.jar --loc=${db-dir} /rdfstar
      
Where ${fuseki-home} is the directory of the Fuseki installation.
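
Before sending benchmark queries, it can help to confirm that the endpoint started above is answering. The following is a minimal sketch and is not part of the StarBench scripts: wait_for_endpoint is a hypothetical helper that polls a SPARQL endpoint with a trivial ASK query, retrying once per second up to a given number of attempts.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the StarBench scripts): poll a SPARQL
# endpoint with a trivial ASK query until it answers or attempts run out.
# $1 = endpoint URL, $2 = maximum number of attempts (default 30).
wait_for_endpoint() {
  local endpoint=$1 attempts=${2:-30} i=0
  while [ "$i" -lt "$attempts" ]; do
    # --get sends the URL-encoded query as a GET parameter (SPARQL protocol);
    # --fail makes curl return non-zero on HTTP error responses.
    if curl --silent --fail --get --data-urlencode 'query=ASK {}' \
         "$endpoint" > /dev/null; then
      return 0
    fi
    sleep 1
    i=$((i + 1))
  done
  return 1
}

# Usage (assumes the server runs locally on port 3030, as in the commands above):
#   wait_for_endpoint http://localhost:3030/rdfstar/query 60 && echo "ready"
```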

To ease loading the data and running the server, you can use the provided server-side script for Jena as follows:
        bash scripts/server/Jena ${jena-home} ${fuseki-home} ${db-dir} ${data}
      
Similar scripts have been provided for the other engines included in our experiments.

Step 2: Run the StarBench queries from the client machine
On the client machine: Download the queries. To execute a query over the Fuseki endpoint, run the following command:
        curl --silent -X POST -H "Accept: application/sparql-results+json" -H "Content-Type: application/sparql-query" \
            --data "${query}" http://${ip}:3030/rdfstar/query
      
Where ${query} is the SPARQL-star query to execute and ${ip} is the IP address of the server.
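
To record a runtime per query, the curl call above can be wrapped in a small timing function. The sketch below is one possible way to do this and is not the script shipped in the repository: classify_exit, elapsed_ms, and run_query are hypothetical helpers, TIMEOUT_S mirrors the 30-minute threshold used in the experiments, and date +%s%N assumes GNU date (nanosecond timestamps are not available in every date implementation).

```shell
#!/usr/bin/env bash
# Sketch under stated assumptions; helper names are hypothetical.
TIMEOUT_S=$((30 * 60))   # 30-minute timeout, as in the experiments

# Map a curl exit code to a coarse status (curl exits with 28 on timeout).
classify_exit() {
  case $1 in
    0)  echo ok ;;
    28) echo timeout ;;
    *)  echo error ;;
  esac
}

# Print the elapsed milliseconds between two nanosecond timestamps.
elapsed_ms() {
  echo $(( ($2 - $1) / 1000000 ))
}

# Run the query stored in file $1 against endpoint $2.
# Prints "<file> <status> <milliseconds>".
run_query() {
  local file=$1 endpoint=$2 start end rc
  start=$(date +%s%N)    # GNU date assumed
  curl --silent --fail --max-time "$TIMEOUT_S" -X POST \
    -H "Accept: application/sparql-results+json" \
    -H "Content-Type: application/sparql-query" \
    --data-binary "@${file}" "$endpoint" > /dev/null
  rc=$?
  end=$(date +%s%N)
  echo "$file $(classify_exit "$rc") $(elapsed_ms "$start" "$end")"
}

# Usage (file name and address are placeholders):
#   run_query queries/P1.rq "http://${ip}:3030/rdfstar/query"
```

Reading the query from a file with --data-binary avoids shell quoting problems with the << and >> delimiters of embedded triples.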

To ease running the queries over the server loaded above, you can use the provided client-side script for Jena as follows:
        bash scripts/client/Jena ${queries} ${ip}
      
Where ${queries} is the directory containing the queries.
Similar scripts have been provided for the other engines included in our experiments.

EXPERIMENTS

We tested four different systems that support SPARQL-star over StarBench: Jena, Oxigraph, Stardog, and GraphDB. AnzoGraph and BlazeGraph were not able to load the datasets and have therefore been omitted from our experiments: AnzoGraph threw a "too many properties" error, and BlazeGraph a parsing error.

We ran our experiments on a server with 256GB RAM, 16 cores (AMD 7281), 240GB SSD, and 8TB of HDD. A timeout threshold of 30 minutes was established.
The results for each query category are shown below.

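[Figure: results for queries P1-P12]

[Figure: results for queries P13-P23]

[Figure: results for queries S1-S11]

[Figure: results for queries S12-S22]

[Figure: results for queries C1-C11]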

DOWNLOADS