RDF-star has rapidly gained popularity as a way to annotate RDF statements while avoiding the disadvantages of reification. As a result, a number of triplestores supporting this new standard have become available. Yet it is difficult to assess the performance of these systems and the degree to which they support RDF-star and the corresponding SPARQL-star query language. In this paper, we therefore propose StarBench, a benchmark for testing SPARQL-star support and runtime performance. We ran StarBench on a number of state-of-the-art triplestores with RDF-star and SPARQL-star support and share our findings. Based on these findings, we highlight existing challenges and research opportunities.
STARBENCH QUERIES
The following tables describe each original query from the REF benchmark, organized by query category. They also list the variations generated from each query, the modification strategies used to generate them, and the essential SPARQL and SPARQL-star features of each query. To download the queries, visit our GitHub. The expected results of each query can be found in the downloads section.
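To make the terminology used in the tables concrete, the sketch below prints a small SPARQL-star query with one embedded graph pattern in subject position and a Count aggregate. It is illustrative only: the `ex:` IRIs and the function name `example_query` are hypothetical and are not taken from the StarBench queries or dataset.

```shell
#!/usr/bin/env bash
# Illustrative only: all IRIs below are hypothetical placeholders.
# Prints a SPARQL-star query on standard output.
example_query() {
  cat <<'EOF'
PREFIX ex: <http://example.org/>
SELECT (COUNT(*) AS ?count) WHERE {
  # The << ... >> term is an embedded graph pattern in subject position.
  # The outer triple is the asserted triple that annotates it.
  << ex:s ex:p ?o >> ex:certainty ?value .
}
EOF
}
```

Replacing `ex:certainty` with a variable would correspond to the kind of modification described as "Replace the IRI predicate of the asserted triple with a variable".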
| Query | Description | SPARQL Features | SPARQL-star Features |
|---|---|---|---|
| P1 | Replace the IRI predicate of the asserted triple with a variable | Count | 1 embedded graph pattern in subject position |
| P2 | Replace the IRI object of the asserted triple with a variable | Count | 1 embedded graph pattern in subject position |
| P3 | Replace the embedded graph pattern with a variable | Count | 1 non-embedded graph pattern |
| P4 | Replace all the elements of the query with variables | Count | 1 embedded graph pattern in subject position |
| P5 | Replace the object of the embedded triple with an IRI resource | Count | 1 embedded graph pattern in subject position |
| P6 | Replace the IRI object of the asserted triple with a variable | - | 1 embedded graph pattern in subject position |
| P7 | Add Distinct in the Select clause | Distinct | 1 embedded graph pattern in subject position |
| P8 | Add Distinct in the Count clause | Count, Distinct | 1 embedded graph pattern in subject position |
| P9 | Replace return by count in all variations of the A2 query | Count | 1 embedded graph pattern in subject position |
| P10 | Replace the IRI of the subject in the embedded graph pattern with a variable | Count | 1 embedded graph pattern in subject position |
| P11 | Replace the IRIs of the subject and object of the embedded graph pattern with variables | Count | 1 embedded graph pattern in subject position |
| P12 | Replace the IRI of the predicate in the asserted triple with a variable | Count | 1 embedded graph pattern in subject position |
| P13 | Replace the object of the embedded graph pattern with an IRI | Count | 1 embedded graph pattern in subject position |
| P14 | Replace the object of the asserted triple with an IRI | Count | 1 embedded graph pattern in object position |
| P15 | Replace return by count in all variations of the A3 query | Count | 1 embedded graph pattern in subject position; 1 non-embedded graph pattern |
| P16 | Replace return by count in all variations of the B2 query | Count | 2 embedded graph patterns in subject position |
| P17 | Remove one of the embedded triples | Count | 1 embedded graph pattern in subject position |
| P18 | Remove one of the embedded graph patterns; replace the IRI object of the asserted pattern with a variable | Count | 1 embedded graph pattern in subject position |
| P19 | Replace the IRI object of the asserted pattern with a variable | Count | 2 embedded graph patterns in subject position |
| P20 | Replace one of the embedded triples with a non-embedded triple | Count | 1 embedded graph pattern in subject position; 1 non-embedded graph pattern |
| P21 | Replace return by count in all variations of the B3 query | Count | 3 embedded graph patterns |
| P22 | Replace an embedded graph pattern in the subject position with a variable | - | 2 embedded graph patterns in subject position; 1 non-embedded graph pattern |
| P23 | Remove filter condition | Count | 2 embedded graph patterns in subject position |

| Query | Description | SPARQL Features | SPARQL-star Features |
|---|---|---|---|
| S1 | Replace return by count in all variations of the A4 query and filter on the object of the embedded pattern | Count | 1 embedded graph pattern in subject position; 1 non-embedded graph pattern |
| S2 | Count on the object of the embedded graph pattern | Count | 1 embedded graph pattern in subject position; 1 non-embedded graph pattern |
| S3 | Replace return by count in all variations of the F1 query | Count, 1 Filter | 1 embedded graph pattern in subject position |
| S4 | Apply filter on the object of the embedded triple | Count, 1 Filter | 1 embedded graph pattern in subject position |
| S5 | Replace the IRI subject of the embedded pattern with a variable | Count, 1 Filter | 1 embedded graph pattern in subject position |
| S6 | Apply filter on the object of the embedded triple; replace the IRI subject of the embedded pattern with a variable | Count, 1 Filter | 1 embedded graph pattern in subject position |
| S7 | Replace return by count in all variations of the F2 query | Count, 1 Filter | 1 embedded graph pattern in subject position; 1 non-embedded graph pattern |
| S8 | Apply filter on the object of the embedded triple | Count, 1 Filter | 1 embedded graph pattern in subject position; 1 non-embedded graph pattern |
| S9 | Apply filter on the subject of the embedded triple | Count, 1 Filter | 1 embedded graph pattern in subject position; 1 non-embedded graph pattern |
| S10 | Replace return by count in all variations of the F3 query | Count, 1 Filter | 2 embedded graph patterns in subject position |
| S11 | Replace the IRI subjects of the embedded triples with variables | Count, Filter | 2 embedded graph patterns in subject position |
| S12 | Replace the IRI subject of one of the embedded triples with a variable | Count, 1 Filter | 2 embedded graph patterns in subject position |
| S13 | Apply filter on the object of the embedded triple | Count, 1 Filter | 2 embedded graph patterns in subject position |
| S14 | Apply filter on the subject and object of the embedded triple; replace the IRI subjects of the embedded triples with variables | Count, 1 Filter | 2 embedded graph patterns in subject position |
| S15 | Replace return by count in all variations of the F4 query | Count, 2 Filters | 3 embedded graph patterns in subject position |
| S16 | Remove one filter | Count, 1 Filter | 3 embedded graph patterns in subject position |
| S17 | Replace the IRI object of one embedded pattern with a variable | Count, 2 Filters | 3 embedded graph patterns in subject position |
| S18 | Remove one of the filters; replace the IRI object of one embedded pattern with a variable | Count, 1 Filter | 3 embedded graph patterns in subject position |
| S19 | Replace return by count in all variations of the F5 query | Count, 1 Filter | 2 embedded graph patterns in subject position |
| S20 | Replace the IRI subject of one embedded pattern with a variable | Count, 1 Filter | 2 embedded graph patterns in subject position |
| S21 | Replace the IRI subjects of the two embedded patterns with variables | Count, 1 Filter | 2 embedded graph patterns in subject position |
| S22 | Replace the IRI subjects of the two embedded patterns with variables; apply filter on the subjects of the embedded triples | Count, 1 Filter | 2 embedded graph patterns in subject position |

| Query | Description | SPARQL Features | SPARQL-star Features |
|---|---|---|---|
| C1 | Replace the subject of the nested graph pattern with a nested graph pattern | Count | 2 embedded graph patterns in subject positions |
| C2 | Replace the object of the non-embedded triple with an embedded triple | Count | 1 embedded graph pattern in object position |
| C3 | Group by an element of the embedded pattern | Count, Group By | 1 embedded graph pattern in subject position; 1 non-embedded graph pattern |
| C4 | Remove the non-embedded graph pattern | Count, Group By | 1 embedded graph pattern in subject position |
| C5 | Remove the non-embedded graph pattern | Count, Group By, Distinct | 1 embedded graph pattern in subject position; 1 non-embedded graph pattern |
| C6 | Add Distinct in the Count clause | Count | 1 embedded graph pattern in subject position; 1 non-embedded graph pattern |
| C7 | Replace one of the embedded graph patterns with a non-embedded graph pattern; add a union clause between the two graph patterns | Count, Union | 1 embedded graph pattern in subject position; 1 non-embedded graph pattern |
| C8 | Add a union clause between the two embedded graph patterns | Count, Union | 2 embedded graph patterns in subject position |
| C9 | Add one embedded graph pattern in an optional clause | Count, Optional | 3 embedded graph patterns |
| C10 | Add one embedded graph pattern in a union clause | Count, Union | 3 embedded graph patterns |
| C11 | Remove one embedded graph pattern in the subject position; add one embedded graph pattern in the union clause | Count, Union | 2 embedded graph patterns |

| Query | Description | SPARQL Features | SPARQL-star Features |
|---|---|---|---|
| A1 | Original query | - | 1 embedded graph pattern in subject position |
| A2 | Original query | - | 1 embedded graph pattern in subject position |
| A3 | Original query | Count, Group By | 1 embedded graph pattern in subject position |
| A4 | Original query | Filter | 1 embedded and 1 non-embedded graph patterns |
| B2 | Original query | - | 2 embedded graph patterns in subject position |
| B3 | Original query | - | 3 embedded graph patterns |
| F1 | Original query | Filter | 1 embedded graph pattern in subject position |
| F2 | Original query | 1 Filter | 1 embedded graph pattern in subject position; 1 non-embedded graph pattern |
| F3 | Original query | 1 Filter | 2 embedded graph patterns in subject position |
| F4 | Original query | 2 Filters | 3 embedded graph patterns in subject position |
| F5 | Original query | 1 Filter | 2 embedded graph patterns in subject position |
GETTING STARTED WITH STARBENCH
For a more complete guide, visit our GitHub. The following guide is based on setting up and running the benchmark over Apache Jena. Other systems will have slightly different commands to load the data and run the queries. Our GitHub repository contains scripts to run the benchmark over the four systems included in the paper (Jena, Oxigraph, Stardog, and GraphDB).
To run StarBench over your own system, you need to:
To run StarBench over Jena, follow these steps.
Step 1: Download and start the Fuseki server
Download the latest version of Apache Jena and Fuseki and unpack the binaries to your server.
Use the TDB Loader to load the dataset to a TDB database as follows:
Where ${jena-home} is the directory of the Jena installation, ${db-dir} is the directory of the database files, and ${data}.ttls is the dataset TTLS file.
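As a rough sketch of the loading step described above, assuming a TDB2 database and the `tdb2.tdbloader` script shipped in the `bin` directory of the Jena distribution, the load could be wrapped as follows. The function name `load_dataset` is hypothetical, and this is not necessarily the exact command used by the authors.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around Jena's TDB2 bulk loader.
# Arguments: Jena installation directory, database directory, dataset file.
load_dataset() {
  local jena_home="$1" db_dir="$2" data="$3"
  # --loc names the directory that holds the TDB2 database files.
  "${jena_home}/bin/tdb2.tdbloader" --loc "$db_dir" "$data"
}
```

A call might look like `load_dataset "$JENA_HOME" /path/to/db dataset.ttls`, where both paths are placeholders.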
Afterwards, start the Fuseki server using the following command:
Where ${query} is the SPARQL-star query to execute and ${ip} is the IP address of the server.
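The two steps above (starting the server and sending a query) might be sketched as below. This is an assumption-laden illustration rather than the authors' commands: the dataset name `/ds` and the function names are hypothetical, and the port `3030` is Fuseki's default and would differ if the server is configured otherwise.

```shell
#!/usr/bin/env bash
# Hypothetical helper: start Fuseki over an existing TDB2 database.
# /ds is an assumed dataset name, not one prescribed by StarBench.
start_fuseki() {
  local fuseki_home="$1" db_dir="$2"
  "${fuseki_home}/fuseki-server" --tdb2 --loc="$db_dir" /ds
}

# Hypothetical helper: send one SPARQL-star query to the server at ${ip}.
# --data-urlencode makes curl issue a form-encoded POST with a "query" field.
run_query() {
  local ip="$1" query="$2"
  curl --silent --fail \
    --header 'Accept: application/sparql-results+json' \
    --data-urlencode "query=${query}" \
    "http://${ip}:3030/ds/sparql"
}
```

For example, `run_query 127.0.0.1 "$(cat some-query.rq)"` would print the JSON result set, where `some-query.rq` stands in for one of the benchmark query files.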
To ease running the queries over the server loaded above, you can use the provided client-side script for Jena as follows:
bash scripts/client/Jena ${queries} ${ip}
Where ${queries} is the directory containing the queries. Similar scripts have been provided for the other engines included in our experiments.
EXPERIMENTS
We set up six systems that support SPARQL-star to run StarBench:
Apache Jena version 4.7.0 with TDB version 2: we used the default configuration for the TDB loader and Fuseki as the server to execute the queries.
Stardog version 8.1.0: default configuration, except for the query timeout, which we changed from 3 seconds to 30 minutes (the timeout threshold for all queries in our experiments).
Oxigraph version 0.3.16: using the project's Docker image, with curl to execute the queries.
GraphDB version 10.0.2: installed as a standalone server, with curl to execute the encoded queries.
AnzoGraph version 2.5.16: using the project's Docker image, with curl to execute the queries.
BlazeGraph version 2.1.6: run as a standalone Jetty server, with curl to execute the queries.
AnzoGraph and BlazeGraph were not able to load the datasets and have therefore been omitted from our experiments, leaving four systems in the results.
AnzoGraph threw a "too many properties" error, and BlazeGraph a parsing error.
We ran our experiments on a server with 256GB RAM, 16 cores (AMD 7281), 240GB SSD, and 8TB of HDD. A timeout threshold of 30 minutes was established.
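One way to enforce such a timeout on the client side is sketched below, assuming GNU coreutils `timeout` is available. It is not the authors' measurement procedure: the function name `time_query`, the `TIMEOUT_SECONDS` variable, and the whole-second resolution are choices made for this illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command under a time limit (default 1800 s,
# i.e. 30 minutes) and print the elapsed whole seconds, TIMEOUT, or ERROR.
time_query() {
  local limit="${TIMEOUT_SECONDS:-1800}" start end status=0
  start=$(date +%s)
  # GNU timeout exits with status 124 when the limit is reached.
  timeout "$limit" "$@" > /dev/null || status=$?
  end=$(date +%s)
  if [ "$status" -eq 124 ]; then
    echo "TIMEOUT"
  elif [ "$status" -ne 0 ]; then
    echo "ERROR"
  else
    echo "$(( end - start ))"
  fi
}
```

Because `timeout` launches an external program, the wrapped command has to be an executable such as `curl`, not a shell function.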
The results for each query category are shown below.