ColChain

Collaborative Linked Data Networks

A Demonstration of ColChain

Collaborative Knowledge Chains

Abstracts

Research paper

One of the major obstacles that currently prevents the Semantic Web from exploiting its full potential is that the data it provides access to is sometimes not available or outdated. The reason is rooted deep within its architecture, which relies on data providers to keep the data available, queryable, and up-to-date at all times - an expectation that many data providers in reality cannot live up to for an extended (or infinite) period of time. Hence, decentralized architectures have recently been proposed that use replication to keep the data available in case the data provider fails. Although this increases availability, it does not help keep the data up-to-date or allow users to query and access previous versions of a dataset. In this paper, we therefore propose ColChain (COLlaborative knowledge CHAINs), a novel decentralized architecture based on blockchains that not only lowers the burden for the data providers but at the same time also allows users to propose updates to faulty or outdated data, trace updates back to their origin, and query older versions of the data. Our extensive experiments show that ColChain reaches these goals while achieving query processing performance comparable to the state of the art.

Demo paper

The current architecture of the Semantic Web fully relies on the individual data providers to maintain access to their data and to keep their data up to date. While this may seem like a practical and straightforward solution, it often results in the data being unavailable or outdated. In this paper, we present a fully functioning client along with a user-friendly interface for ColChain, a system that increases availability of knowledge graphs and enables users to update the data in a community-driven way while still allowing them to query old versions.
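
The idea of chaining updates so that they can be traced and rolled back can be illustrated with a small sketch. The code below is not ColChain's implementation; the class names (`Update`, `UpdateChain`), the use of SHA-256, and the example triples are all assumptions made for illustration. It shows a hash-linked list of updates to a set of triples, where an older version is obtained by undoing the most recent updates in reverse order.

```python
import hashlib
import json
from dataclasses import dataclass

GENESIS = "0" * 64  # placeholder hash for the start of the chain (assumption)


@dataclass(frozen=True)
class Update:
    """One hypothetical update: triples added and removed by an author."""
    author: str
    additions: frozenset
    deletions: frozenset
    prev_hash: str

    def digest(self) -> str:
        # Hash the update together with the hash of its predecessor,
        # so changing any earlier update breaks every later link.
        payload = json.dumps(
            [self.author, sorted(self.additions), sorted(self.deletions), self.prev_hash]
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class UpdateChain:
    def __init__(self):
        self.updates = []
        self.latest = set()  # triples in the newest version

    def head(self) -> str:
        return self.updates[-1].digest() if self.updates else GENESIS

    def append(self, author, additions=(), deletions=()):
        # Store only effective changes so each update can be undone exactly.
        dels = frozenset(deletions) & self.latest
        adds = frozenset(additions) - (self.latest - dels)
        update = Update(author, adds, dels, self.head())
        self.updates.append(update)
        self.latest = (self.latest - dels) | adds
        return update

    def is_valid(self) -> bool:
        # Recompute every link; a tampered update makes the chain invalid.
        expected = GENESIS
        for update in self.updates:
            if update.prev_hash != expected:
                return False
            expected = update.digest()
        return True

    def rollback(self, steps: int) -> set:
        """Return the version that existed `steps` updates ago."""
        triples = set(self.latest)
        start = max(0, len(self.updates) - steps)
        for update in reversed(self.updates[start:]):
            triples -= update.additions
            triples |= update.deletions
        return triples

    def origin(self, triple):
        """Return the author of the most recent update that added `triple`."""
        for update in reversed(self.updates):
            if triple in update.additions:
                return update.author
        return None


# Example usage with made-up triples and authors.
old = ("ex:s", "ex:p", "old value")
new = ("ex:s", "ex:p", "new value")
chain = UpdateChain()
chain.append("alice", additions=[old])
chain.append("bob", additions=[new], deletions=[old])
```

In this sketch, `chain.rollback(1)` yields the version before the second update, and `chain.origin(new)` names the author who introduced the corrected triple. Undoing updates from the newest version means that the cost of reaching an older version grows with the number of updates rolled back.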

EXPERIMENTS

We ran our experiments on a server with 128 vCPU cores at 2.5GHz, each with 64KB L1 cache, 512KB L2 cache, and 8192KB L3 cache, and 2TB of RAM. We ran 128 nodes on this server, with the resources split evenly between them.

We use the data and queries from LargeRDFBench, covering the query groups S, C, L, and CH. We report the following metrics:

Query Execution Time (QET) in milliseconds for group S over different systems (log scale)

Query Execution Time (QET) in milliseconds for group S over different community sizes (log scale)

Query Response Time (QRT) in milliseconds for all query groups over different systems (log scale)

Number of Exchanged Messages (NEM) for all query groups over different systems (log scale)

Number of Transferred Bytes (NTB) for all query groups over different systems (log scale)

Update Overhead Time (UOT) in milliseconds for varying community sizes over updates of different sizes (log scale)

Query Execution Time (QET) in milliseconds for all query groups over roll-backs of chains of different lengths (log scale)

Version Materialization Time (VMT) in milliseconds for all query groups over roll-backs of chains of different lengths (log scale)

INSTALLATION

Requirements

Java 8 or newer, Jetty server.

Installation

To install ColChain or run experiments with it, please follow the instructions on our GitHub.

DOWNLOADS

  • Download the executables or view the sources from our GitHub.
  • Download the HDT files used in our experiments.
  • Download the PPBF index files used in our experiments.
  • Download the setup files used in our experiments.
  • Download the queries used in our experiments.