Kashif Rabbani, Matteo Lissandrini, and Katja Hose.
In Companion Proceedings of the Web Conference 2022 (WWW'22 Companion), April 25-29 2022. (PDF), Publisher Link
Knowledge Graphs (KGs) are widely used to represent heterogeneous domain knowledge on the Web and within organizations. Various methods exist to manage KGs and ensure the quality of their data. Among these, the Shapes Constraint Language (SHACL) and the Shape Expressions language (ShEx) are the two state-of-the-art languages for defining validating shapes for KGs. As the usage of these constraint languages has recently increased, new needs have arisen. One such need is to enable the efficient generation of these shapes. Yet, since these languages are relatively new, there is a lack of understanding of how they are effectively employed for existing KGs. Therefore, in this work, we answer the question: how are validating shapes being generated and adopted? Our contribution is threefold. First, we conducted a community survey to analyze the needs of users (from both industry and academia) who generate validating shapes. Then, we cross-referenced our results with an extensive survey of the existing tools and their features. Finally, we investigated how existing automatic shape extraction approaches work in practice on real, large KGs. Our analysis shows the need for developing semi-automatic methods that can help users generate shapes from large KGs.
Rabbani, Kashif; Lissandrini, Matteo; and Hose, Katja. SHACL and ShEx in the Wild: A Community Survey on Validating Shapes Generation and Adoption. In Companion Proceedings of the Web Conference 2022 (WWW'22 Companion), April 25-29 2022, Virtual Event, Lyon, France. ACM.
We have used the following datasets:
You can download a copy of these datasets from our single archive.
We have published the extracted SHACL shapes of all three datasets on Zenodo.
We have also made an executable Jar file of our application available on Zenodo; it extracts SHACL shapes from RDF datasets in N-Triples (.nt) format.
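To give a rough idea of what extracting shapes from an .nt file involves, the following is a minimal sketch in Python. It is not the implementation shipped in the Jar file; it only assumes a very simple strategy: group subjects by their rdf:type, collect the predicates used by the instances of each class, and emit one SHACL node shape per class. The function names `extract_shapes` and `to_shacl` and the `urn:example:shape:` IRIs are hypothetical and chosen for illustration only.

```python
import re
from collections import defaultdict

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Matches N-Triples lines with an IRI subject and an IRI predicate.
# The object is kept as raw text (IRI, literal, or blank node).
TRIPLE = re.compile(r'^\s*<([^>]*)>\s+<([^>]*)>\s+(.*?)\s*\.\s*$')


def extract_shapes(lines):
    """Return a mapping: class IRI -> set of predicate IRIs used by its instances."""
    types = defaultdict(set)  # subject IRI -> class IRIs
    props = defaultdict(set)  # subject IRI -> predicate IRIs (excluding rdf:type)
    for line in lines:
        match = TRIPLE.match(line)
        if not match:
            continue  # skips comments, blank lines, and blank-node subjects
        subj, pred, obj = match.groups()
        if pred == RDF_TYPE and obj.startswith("<") and obj.endswith(">"):
            types[subj].add(obj[1:-1])
        else:
            props[subj].add(pred)

    shapes = defaultdict(set)
    for subj, classes in types.items():
        for cls in classes:
            shapes[cls] |= props[subj]
    return shapes


def to_shacl(shapes):
    """Serialize the mapping as SHACL node shapes in Turtle (illustrative shape IRIs)."""
    out = ["@prefix sh: <http://www.w3.org/ns/shacl#> ."]
    for i, (cls, preds) in enumerate(sorted(shapes.items())):
        out.append(f"<urn:example:shape:{i}> a sh:NodeShape ;")
        out.append(f"    sh:targetClass <{cls}>" + (" ;" if preds else " ."))
        ordered = sorted(preds)
        for j, pred in enumerate(ordered):
            end = " ." if j == len(ordered) - 1 else " ;"
            out.append(f"    sh:property [ sh:path <{pred}> ]{end}")
    return "\n".join(out)
```

A sketch like this records only which properties occur for each class. It says nothing about cardinalities, datatypes, or how often a property is actually used, and it holds all subjects in memory, which is one reason why shape extraction on real, large KGs calls for more careful methods.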