Posts tagged with ‘sparql’

Learning more about SPARQL and Jena internals

kinow @ Apr 28, 2018 18:20:28

O Corvo

Recently a pull request for Apache Jena that I started three years ago got merged. Even though it has been three years since that pull request, there are still many parts of the project code base that I am not familiar with.

And it is not only the code: there are also many concepts in SPARQL, in the other standards used by Jena, and in the internals of triple stores that I still need to learn.

The following list contains some presentations and posts that I am reading right now, while I try to improve my knowledge of SPARQL and Jena internals.

Basic workflow of a SPARQL query in Fuseki

kinow @ Oct 11, 2014 20:36:33

Before using any library or tool in a customer project, especially an Open Source one, there are many things that I like to look at before deploying it. Basically, I look at the features, documentation, community, open issues (in particular blockers and criticals), the time it takes to release fixes and new features and, obviously, the license.

At the moment I’m using Apache Jena to work with ontologies, SPARQL and data matching and enrichment for a customer.

Jena is fantastic, and similar tools include Virtuoso, Stardog, GraphDB, 4store and others. From looking at the code, its community and its documentation, Jena seems like a great choice.

I’m still investigating whether and how we are going to use inference and reasoners, looking at the issues, and learning my way through its code base. The following is my initial mapping of what happens when you submit a SPARQL query to Fuseki.

Figure: Fuseki SPARQL query workflow

My understanding is that Fuseki is just a web layer that handles a bunch of validations, logging and error handling, and relies on the ARQ module, which is what actually handles the requests. I also think a new Fuseki server is in the works in the project's Git repository, so stay tuned for an updated version of this diagram soon.
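To make the "web layer" idea concrete, here is a minimal sketch of what a client sends to that layer, using only the Python standard library. It assumes a local Fuseki server with a dataset named `ds` (both the URL and the dataset name are placeholders, not something taken from this setup), and it only builds the SPARQL Protocol request without sending it.

```python
import urllib.parse
import urllib.request

# Hypothetical endpoint: a local Fuseki server with a dataset named "ds".
FUSEKI_QUERY_ENDPOINT = "http://localhost:3030/ds/query"


def build_sparql_request(endpoint, query):
    """Build a SPARQL Protocol GET request (nothing is sent here)."""
    # The SPARQL 1.1 Protocol passes the query text in the "query" parameter.
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    # Ask for the standard JSON serialization of SELECT results.
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


request = build_sparql_request(
    FUSEKI_QUERY_ENDPOINT, "SELECT * WHERE { ?s ?p ?o } LIMIT 10"
)
print(request.full_url)

# To actually run it against a live server:
#   with urllib.request.urlopen(request) as response:
#       print(response.read().decode("utf-8"))
```

Everything after the HTTP request arrives (parsing the query, planning and executing it) is the part that, as far as I can tell, ARQ takes care of.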

Happy hacking!

Cypher, Gremlin and SPARQL: Graph dialects

kinow @ Sep 09, 2014 10:14:33

When I was younger and my older brother was living in Germany, I asked him if he had learned German. He said that he had, and explained that there are several dialects. He was quite proud because some people had told him that he was using the Bavarian dialect correctly.

Even though Cypher, Gremlin and SPARQL are all query languages, I think we can consider them dialects of a common graph language. Cypher is the query language used in Neo4j, a graph database. Gremlin is part of TinkerPop, an open source project that contains a graph server, graph algorithms and a graph language, among other sub-projects. And last but not least, SPARQL is used to query RDF data.

Let’s use the example of the Matrix movies provided by Neo4j to take a look at the three languages.

Cypher

First we create the graph.

create (matrix1:Movie {id : '603', title : 'The Matrix', year : '1999-03-31'}),
 (matrix2:Movie {id : '604', title : 'The Matrix Reloaded', year : '2003-05-07'}),
 (matrix3:Movie {id : '605', title : 'The Matrix Revolutions', year : '2003-10-27'}),

 (neo:Actor {name:'Keanu Reeves'}),
 (morpheus:Actor {name:'Laurence Fishburne'}),
 (trinity:Actor {name:'Carrie-Anne Moss'}),

 (matrix1)<-[:ACTS_IN {role : 'Neo'}]-(neo),
 (matrix2)<-[:ACTS_IN {role : 'Neo'}]-(neo),
 (matrix3)<-[:ACTS_IN {role : 'Neo'}]-(neo),
 (matrix1)<-[:ACTS_IN {role : 'Morpheus'}]-(morpheus),
 (matrix2)<-[:ACTS_IN {role : 'Morpheus'}]-(morpheus),
 (matrix3)<-[:ACTS_IN {role : 'Morpheus'}]-(morpheus),
 (matrix1)<-[:ACTS_IN {role : 'Trinity'}]-(trinity),
 (matrix2)<-[:ACTS_IN {role : 'Trinity'}]-(trinity),
 (matrix3)<-[:ACTS_IN {role : 'Trinity'}]-(trinity)

Added 6 labels, created 6 nodes, set 21 properties, created 9 relationships, returned 0 rows in 2791 ms

And execute a simple query.

MATCH (a:Actor { name:"Keanu Reeves" })
RETURN a

(9:Actor {name:"Keanu Reeves"})

Gremlin

Again, let’s start by creating our graph.

g = new TinkerGraph();
matrix1 = g.addVertex(["_id":603,"title":"The Matrix", "year": "1999-03-31"]);
matrix2 = g.addVertex(["_id":604,"title":"The Matrix Reloaded", "year": "2003-05-07"]);
matrix3 = g.addVertex(["_id":605,"title":"The Matrix Revolutions", "year": "2003-10-27"]);

neo = g.addVertex(["name": "Keanu Reeves"]);
morpheus = g.addVertex(["name": "Laurence Fishburne"]);
trinity = g.addVertex(["name": "Carrie-Anne Moss"]);

neo.addEdge("actsIn", matrix1); 
neo.addEdge("actsIn", matrix2); 
neo.addEdge("actsIn", matrix3); 
morpheus.addEdge("actsIn", matrix1); 
morpheus.addEdge("actsIn", matrix2); 
morpheus.addEdge("actsIn", matrix3); 
trinity.addEdge("actsIn", matrix1); 
trinity.addEdge("actsIn", matrix2); 
trinity.addEdge("actsIn", matrix3);

And execute a simple query.

g.V.has('name', 'Keanu Reeves').map

gremlin> g.V.has('name', 'Keanu Reeves').map
==>{name=Keanu Reeves}
gremlin>

Quite similar to the Cypher query in Neo4j.

SPARQL

Let’s load our example (thanks to Kendall G. Clark). I used Fuseki to run these queries.

@prefix :          <http://example.org/matrix/> .

 :m1 a :Movie; :title "The Matrix"; :year "1999-03-31".
 :m2 a :Movie; :title "The Matrix Reloaded"; :year "2003-05-07".
 :m3 a :Movie; :title "The Matrix Revolutions"; :year "2003-10-27".

 :neo a :Actor; :name "Keanu Reeves".
 :morpheus a :Actor; :name "Laurence Fishburne".
 :trinity a :Actor; :name "Carrie-Anne Moss".

 :neo :hasRole [:as "Neo"; :in :m1].
 :neo :hasRole [:as "Neo"; :in :m2].
 :neo :hasRole [:as "Neo"; :in :m3].
 :morpheus :hasRole [:as "Morpheus"; :in :m1].
 :morpheus :hasRole [:as "Morpheus"; :in :m2].
 :morpheus :hasRole [:as "Morpheus"; :in :m3].
 :trinity :hasRole [:as "Trinity"; :in :m1].
 :trinity :hasRole [:as "Trinity"; :in :m2].
 :trinity :hasRole [:as "Trinity"; :in :m3].

And finally the SPARQL query.

SELECT ?a WHERE {
   ?a a <http://example.org/matrix/Actor> .
   ?a <http://example.org/matrix/name> ?name .
   FILTER(?name  = "Keanu Reeves")
}

Returning the Keanu Reeves actor instance.

-----------------------------------
| a                               |
===================================
| <http://example.org/matrix/neo> |
-----------------------------------
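If the query looks a bit magical, the following toy sketch in plain Python shows roughly what its two triple patterns and the FILTER are asking for. It is only an illustration over a hand-copied subset of the data above, stored as tuples, and it is not how Jena or ARQ evaluate queries; the function name `find_actors_named` is made up for this example.

```python
# Namespace used by the example data.
NS = "http://example.org/matrix/"
# The keyword "a" in Turtle and SPARQL abbreviates rdf:type.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# A subset of the example data as (subject, predicate, object) triples.
# IRIs and literals are both plain strings here, which real RDF does not do.
TRIPLES = [
    (NS + "m1", RDF_TYPE, NS + "Movie"),
    (NS + "m1", NS + "title", "The Matrix"),
    (NS + "neo", RDF_TYPE, NS + "Actor"),
    (NS + "neo", NS + "name", "Keanu Reeves"),
    (NS + "morpheus", RDF_TYPE, NS + "Actor"),
    (NS + "morpheus", NS + "name", "Laurence Fishburne"),
    (NS + "trinity", RDF_TYPE, NS + "Actor"),
    (NS + "trinity", NS + "name", "Carrie-Anne Moss"),
]


def find_actors_named(triples, wanted_name):
    """Naive evaluation of the two triple patterns plus the FILTER."""
    # Pattern 1: ?a a :Actor
    actors = {s for (s, p, o) in triples if p == RDF_TYPE and o == NS + "Actor"}
    # Pattern 2, joined on ?a: ?a :name ?name, then FILTER(?name = wanted_name)
    return sorted(
        s for (s, p, o) in triples
        if s in actors and p == NS + "name" and o == wanted_name
    )


print(find_actors_named(TRIPLES, "Keanu Reeves"))
# prints ['http://example.org/matrix/neo']
```

The important bit is the join on `?a`: both patterns must hold for the same subject, which is why the movies never show up in the result.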

SPARQL supports inference (or, I should say, OWL, RDFS and the reasoners do), but it is easier to define the depth of a search in the graph using Neo4j. As for Gremlin, it has native support for Groovy and Java. There is a common denominator for these three languages, but what makes them really powerful are their unique features.

I hope you enjoyed it, and that this post gave you a quick overview of some of the existing graph languages. Make sure you weigh the pros and cons of each server and language, and make the best decision for your project. Take a look at other graph query languages too.

Happy hacking!


This post has been updated as suggested by @kendall (thank you!). You can check the diff at GitHub.