RDFox Blog

Reasoning with Northwind
Marcelo Barbieri

Reasoning is probably the most powerful feature and main selling point of RDF Graph Databases.

RDF Graph Databases with advanced reasoning capabilities are believed to be the future of AI, as Mike Tung put it in his Forbes article Knowledge Graphs Will Lead To Trustworthy AI:

The era of black-box AI systems is over. Next-generation systems will optimize the explainability and trustworthiness of the overall human-AI system, and knowledge graphs will serve as a key ingredient that makes these systems more explainable, inspectable, auditable and, ultimately, controllable.

What’s Semantic Reasoning?

Semantic reasoning is the ability of a system to infer new facts from existing data based on inference rules or ontologies. In simple terms, rules add new information to the existing dataset, adding context, knowledge, and valuable insights.
Oxford Semantic Technologies

The Northwind Article Series

This is the first in a series of articles on reasoning with the Northwind sample database. In this article we create inference rules to simplify and optimise queries and data management.

We are going to use RDFox, a high-performance in-memory knowledge graph and semantic reasoning engine. RDFox uses the Datalog rule language to express rules.

This is a learning-by-example article, so not much theory is covered here.

For more details on RDFox reasoning, Datalog, and rules, as well as the Northwind sample database, please refer to the links in the “References” section at the end of this article.

If you would like to execute the queries yourself, please refer to the “Setting Up The Demo Environment” section further down in this article. Otherwise, you can just browse the queries and screenshots below.

Rule Examples

Rules define conditions to be matched in the data in order to infer new triples that become available to queries. They provide a mechanism that allows tailor-made performance improvements to specific queries.

In this section we are going to introduce three practical examples (use cases) to explain how rules work.

Each use case contains an original query, a rule, and a modified version of the query that uses the rule and produces the same result.

Use case 01 — list customers who bought a product

Original query

The original SPARQL query returns the list of customers who bought product-61.

PREFIX        : <http://www.mysparql.com/resource/northwind/>
PREFIX kggraph: <http://www.mysparql.com/resource/northwind/graph/>

# Customers who bought product-61

SELECT DISTINCT # eliminates duplicates in case the same customer bought a product more than once
 ?customer
 ?companyName
 ?contactName
WHERE {
 GRAPH kggraph:dataGraph {
   ?customer a :Customer ;
       :companyName ?companyName ;
       :contactName ?contactName .
   ?order a :Order ;
       :hasCustomer ?customer .
   ?orderDetail a :OrderDetail ;
       :hasProduct :product-61 ;
       :belongsToOrder ?order .
 }
}
ORDER BY ?customer

Original query result:

RDFox web console — original query 01 result

Note: The query above completed in 12 ms, and the screenshot displays 4 of the 22 results.

Using a SPARQL property path, we can easily show the path that needs to be traversed to answer the question.

# Path: customer → order → orderDetail → product
PREFIX        : <http://www.mysparql.com/resource/northwind/>
PREFIX kggraph: <http://www.mysparql.com/resource/northwind/graph/>

SELECT DISTINCT
   ?customer
WHERE {
   GRAPH ?graph {
       ?customer ^:hasCustomer/^:belongsToOrder/:hasProduct :product-61 .
   }
}
ORDER BY ?customer

RDFox web console — instance example — customer-BLAUS bought product-61

That’s quite a long way to go to answer such a typical question. We want to create a shortcut, which will not only speed things up but also make the query more intuitive and easier to maintain. This is achieved by rule 01 below.

Rule 01 — boughtProduct

A rule that defines which product was bought by a customer.

Rule definition

PREFIX : <http://www.mysparql.com/resource/northwind/>
PREFIX kggraph: <http://www.mysparql.com/resource/northwind/graph/>

[?customer, :boughtProduct, ?product] :-
   [?customer, a, :Customer],
   [?order, a, :Order],
   [?orderDetail, a, :OrderDetail],
   [?product, a, :Product],
   [?orderDetail, :hasProduct, ?product],
   [?orderDetail, :belongsToOrder, ?order],
   [?order, :hasCustomer, ?customer] .
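
To make the effect of the rule concrete, here is a minimal Python sketch (not RDFox code) that applies the same body pattern to a handful of hand-written triples. IRIs are shortened to plain strings, and the identifiers order-1 and orderDetail-1-61 are invented for illustration; only customer-BLAUS and product-61 come from the example above.

```python
# Toy triples mirroring the Northwind shape. order-1 and orderDetail-1-61
# are hypothetical identifiers used only for this illustration.
triples = {
    ("customer-BLAUS", "a", "Customer"),
    ("order-1", "a", "Order"),
    ("order-1", "hasCustomer", "customer-BLAUS"),
    ("orderDetail-1-61", "a", "OrderDetail"),
    ("orderDetail-1-61", "hasProduct", "product-61"),
    ("orderDetail-1-61", "belongsToOrder", "order-1"),
    ("product-61", "a", "Product"),
}


def derive_bought_product(facts):
    """Return the boughtProduct triples implied by the body of rule 01."""
    derived = set()
    for detail, p1, product in facts:
        if p1 != "hasProduct":
            continue
        for detail2, p2, order in facts:
            if p2 != "belongsToOrder" or detail2 != detail:
                continue
            for order2, p3, customer in facts:
                if p3 != "hasCustomer" or order2 != order:
                    continue
                # Type checks, matching the four `a` atoms in the rule body.
                typed = (
                    (customer, "a", "Customer") in facts
                    and (order, "a", "Order") in facts
                    and (detail, "a", "OrderDetail") in facts
                    and (product, "a", "Product") in facts
                )
                if typed:
                    # The rule head: one new triple per match of the body.
                    derived.add((customer, "boughtProduct", product))
    return derived


new_facts = derive_bought_product(triples)
print(new_facts)
```

The nested loops are deliberately naive. A real engine plans and indexes these joins, but the derived triple is the same kind of shortcut edge that rule 01 materialises.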

Add rule to the data store

There are many ways of adding rules to an RDFox data store. The following example uses curl to call the RDFox REST API.

curl -X POST -G --data-urlencode "default-graph-name=http://www.mysparql.com/resource/northwind/graph/dataGraph" -H "Content-Type:" -T "rules/01-customer-bought-product.dlog" "localhost:12110/datastores/Northwind/content"

Note that the destination named graph for the rule is specified in the curl command.

For those who set up the environment with RDFox running in a Docker container, the required authentication will need to be added to the curl command: -u admin:admin
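
As a side note on the curl command above: with -G, curl appends the --data-urlencode value to the URL as a query string, which is how the target graph reaches the endpoint. The Python sketch below only builds that URL and sends nothing. The http:// scheme is an assumption added here, since the curl command gives the host without one.

```python
from urllib.parse import urlencode

# Host, path and graph IRI are copied from the curl command in this article.
# The "http://" scheme is an assumption; curl was given "localhost:12110/...".
endpoint = "http://localhost:12110/datastores/Northwind/content"
graph_iri = "http://www.mysparql.com/resource/northwind/graph/dataGraph"

# urlencode percent-encodes the IRI so it can travel inside a query string.
url = endpoint + "?" + urlencode({"default-graph-name": graph_iri})
print(url)
```

Seeing the encoded form can help when a rule ends up in an unexpected graph, because the graph name is the only thing the query string carries.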

Modified query

The original query was modified to consume the new rule we have just created. The modified query produces the same result.

PREFIX        : <http://www.mysparql.com/resource/northwind/>
PREFIX kggraph: <http://www.mysparql.com/resource/northwind/graph/>

# Customers who bought product-61

SELECT
 ?customer
 ?companyName
 ?contactName
WHERE {
 GRAPH kggraph:dataGraph {
   ?customer a :Customer ;
       :boughtProduct :product-61 ;
       :companyName ?companyName ;
       :contactName ?contactName .
 }
}
ORDER BY ?customer

Modified query result:

RDFox web console — modified query 01 result

Note: The query above completed in 8 ms, and the screenshot displays 4 of the 22 results.

Since version 5.6, it’s possible to highlight reasoning on the RDFox web console. The following shows the new derived fact, which is materialised in RDFox as a new triple in the graph.

RDFox web console — modified query 01 result — highlighted reasoning

Use case 02 — top 5 customers by product count

Original query

Lists the top 5 customers by product count.

PREFIX : <http://www.mysparql.com/resource/northwind/>
PREFIX kggraph: <http://www.mysparql.com/resource/northwind/graph/>

# Top 5 customers by product count

SELECT  
 ?customer
 ?companyName
 ?contactName
 (COUNT(?product) as ?count)
WHERE {
 GRAPH kggraph:dataGraph {
   ?orderDetail :hasProduct ?product ;
       :belongsToOrder ?order .
   ?order :hasCustomer ?customer .
   ?customer :companyName ?companyName ;
       :contactName ?contactName .
 }
}
GROUP BY ?customer ?companyName ?contactName
ORDER BY DESC(?count)
LIMIT 5

Original query result:

RDFox web console — original query 02 result

Note: The query above executed in 12 ms.

Create Rule 02 — hasProductCount

The following rule defines relations based on the result of an aggregate calculation.

Rule definition

PREFIX        : <http://www.mysparql.com/resource/northwind/>

[?customer, :hasProductCount, ?productCount] :-
AGGREGATE (
   [?customer, a, :Customer],
   [?order, a, :Order],
   [?orderDetail, a, :OrderDetail],
   [?product, a, :Product],
   [?orderDetail, :hasProduct, ?product],
   [?orderDetail, :belongsToOrder, ?order],
   [?order, :hasCustomer, ?customer]
   ON ?customer
   BIND COUNT(?product) AS ?productCount
) .
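
The grouping and counting in this rule can be pictured with a short Python sketch. It is an illustration, not RDFox code: the rows below are invented matches of the rule body, one per order line, and the sketch counts one per row, as COUNT(?product) does in the original SPARQL query. That the rule's aggregate counts the same way is an assumption based on the article's matching query results.

```python
from collections import Counter

# Hypothetical matches of the rule body: one (customer, orderDetail, product)
# row per order line. All identifiers are invented for illustration.
body_matches = [
    ("customer-A", "orderDetail-1-11", "product-11"),
    ("customer-A", "orderDetail-1-42", "product-42"),
    ("customer-A", "orderDetail-2-11", "product-11"),  # same product, later order
    ("customer-B", "orderDetail-3-61", "product-61"),
]


def has_product_count(rows):
    """Group rows by customer (ON ?customer) and count them (COUNT(?product))."""
    counts = Counter(customer for customer, _detail, _product in rows)
    return {(customer, "hasProductCount", n) for customer, n in counts.items()}


derived = has_product_count(body_matches)
for triple in sorted(derived):
    print(triple)
```

Each customer ends up with a single hasProductCount triple, which is why the modified query no longer needs GROUP BY.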

Add rule to the data store

curl -X POST -G --data-urlencode "default-graph-name=http://www.mysparql.com/resource/northwind/graph/dataGraph" -H "Content-Type:" -T "rules/02-customer-has-product-count.dlog" "localhost:12110/datastores/Northwind/content"

Modified query

PREFIX        : <http://www.mysparql.com/resource/northwind/>
PREFIX kggraph: <http://www.mysparql.com/resource/northwind/graph/>

# Top 5 customers by product count

SELECT
 ?customer
 ?companyName
 ?contactName
 ?productCount
WHERE {
 GRAPH kggraph:dataGraph {
   ?customer :hasProductCount ?productCount ;
       :companyName ?companyName ;
       :contactName ?contactName .
 }
}
ORDER BY DESC(?productCount)
LIMIT 5

Modified query result:

RDFox web console — modified query 02 result

Note: It’s very important to define the types and make rules as selective as possible to improve rule materialisation and query answering times. For example, adding the types [?customer, a, :Customer], [?order, a, :Order] and [?orderDetail, a, :OrderDetail] to the previous rule brought query execution time down from 10 ms to 3 ms. More guidelines on how to create rules can be found in The Do’s and Don’ts of Rule and Query Writing article.

The following illustration highlights the inferred facts (in cyan) as a result of rules 01 and 02.

RDFox web console — modified query 02 result — highlighted reasoning

Use case 03 — customers who never placed an order

Original query

Lists the customers who never placed an order.

PREFIX        : <http://www.mysparql.com/resource/northwind/>
PREFIX kggraph: <http://www.mysparql.com/resource/northwind/graph/>

# Customers who never placed an order

SELECT DISTINCT
 ?customer
 ?companyName
 ?postalCode
 ?city
 ?country
WHERE {
   GRAPH ?graph {
       ?customer a :Customer ;
           :customerID ?customerID ;
           :companyName ?companyName ;
           :city ?city ;
           :country ?country .
       OPTIONAL { ?customer :postalCode ?postalCode } .
       OPTIONAL {
           ?order a  :Order .
           ?customer ^:hasCustomer ?order .
       }
       FILTER (!BOUND(?order))
   }
}
ORDER BY ?customer

Original query result:

RDFox web console — original query 03 result

The SPARQL query above can be rewritten using MINUS or FILTER NOT EXISTS, producing the same result. For the differences in how these constructs are evaluated, please refer to the comments in the file queries/03-1-customers-who-never-placed-an-order-before-rule-03.sparql from the demo GitHub repo.

Create Rule 03 — CustomerWithoutOrder

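Negation as failure is a very powerful feature of rules in RDFox: a rule can fire when a pattern has no match in the data. The rule below uses it to classify every customer for which no order exists.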

Rule definition

PREFIX        : <http://www.mysparql.com/resource/northwind/>
PREFIX kggraph: <http://www.mysparql.com/resource/northwind/graph/>

[?customer, a, :CustomerWithoutOrder] :-
   [?customer, a, :Customer], # All customers
   NOT EXISTS ?order IN (
     [?order, a, :Order],
     [?order, :hasCustomer, ?customer] # Only customers who placed orders
   ) .
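
The logic of this rule can be sketched in a few lines of Python. This is an illustration only, not RDFox code: IRIs are shortened to plain strings, and order-1 is an invented identifier. A customer is classified when the search for a matching order fails.

```python
# Toy data: customer-BLAUS has an order, customer-FISSA does not.
# order-1 is a hypothetical identifier used only for this illustration.
triples = {
    ("customer-FISSA", "a", "Customer"),
    ("customer-BLAUS", "a", "Customer"),
    ("order-1", "a", "Order"),
    ("order-1", "hasCustomer", "customer-BLAUS"),
}


def customers_without_order(facts):
    """Derive CustomerWithoutOrder for customers with no typed order."""
    customers = {s for s, p, o in facts if p == "a" and o == "Customer"}
    orders = {s for s, p, o in facts if p == "a" and o == "Order"}
    # Customers for which the NOT EXISTS pattern does have a match.
    with_order = {o for s, p, o in facts if p == "hasCustomer" and s in orders}
    # Negation as failure: keep the customers where that match was not found.
    return {(c, "a", "CustomerWithoutOrder") for c in customers - with_order}


derived = customers_without_order(triples)
print(derived)
```

Note that absence is judged against the facts currently in the store, so the conclusion can change as soon as an order is added.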

Add rule to the data store

curl -X POST -G --data-urlencode "default-graph-name=http://www.mysparql.com/resource/northwind/graph/dataGraph" -H "Content-Type:" -T "rules/03-customer-without-order.dlog" "localhost:12110/datastores/Northwind/content"

Modified query

PREFIX        : <http://www.mysparql.com/resource/northwind/>
PREFIX kggraph: <http://www.mysparql.com/resource/northwind/graph/>

# Customers who never placed an order

SELECT DISTINCT
 ?customer
 ?companyName
 ?postalCode
 ?city
 ?country
WHERE {
   GRAPH ?graph {
       ?customer a :CustomerWithoutOrder ;
           :customerID ?customerID ;
           :companyName ?companyName ;
           :city ?city ;
           :country ?country .
       OPTIONAL {?customer :postalCode ?postalCode} .
   }
}
ORDER BY ?customer

Modified query result:

RDFox web console — modified query 03 result

The following illustration highlights the derived facts (in cyan) as a result of rule 03.

RDFox web console — modified query 03 result — highlighted reasoning

And, finally, the following highlights (in cyan) the derived facts as a result of all previous rules created so far.

RDFox web console — highlighted reasoning with derived facts from rules 01, 02, and 03

Let’s see what happens if a CustomerWithoutOrder places an order.

PREFIX        : <http://www.mysparql.com/resource/northwind/>
PREFIX kggraph: <http://www.mysparql.com/resource/northwind/graph/>

# Add an order to a customer :customer-FISSA

INSERT DATA {
   GRAPH kggraph:dataGraph {
       :order-99999 a :Order ;
           :hasCustomer :customer-FISSA .
       :orderDetail-99999-61 a :OrderDetail ;
           :hasProduct :product-61 ;
           :belongsToOrder :order-99999 .
   }
}

When we execute the modified query a second time, :customer-FISSA is not returned. That’s because the derived fact CustomerWithoutOrder was retracted when that customer placed an order.

RDFox web console — modified query 03 after :customer-FISSA placed an order

And what if we delete rule 03 from the Northwind data store altogether?

curl -X PATCH -G --data-urlencode "default-graph-name=http://www.mysparql.com/resource/northwind/graph/dataGraph" -H "Content-Type:" -T "rules/03-customer-without-order.dlog" "localhost:12110/datastores/Northwind/content?operation=delete-content"

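Then the modified query 03 no longer produces any results, because the CustomerWithoutOrder facts derived by the rule are retracted when the rule is deleted.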

RDFox web console — modified query 03 after deleting rule 03

How Do Rules Work In RDFox?

  • RDFox uses parallel reasoning and does incremental materialisation of inferred triples.
  • Data updates will cause new materialised triples to be derived as a logical consequence or retracted when they are no longer justified. This happens automatically when adding or deleting facts or rules, and it’s done in an incremental and very efficient fashion.
  • RDFox allows us to add inferred triples to Named Graphs other than the Default Graph.

The above are considered to be the most desirable features of an advanced reasoning engine.
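
The observable behaviour described above, where derived facts appear and disappear as facts and rules change, can be imitated with a toy Python store. This is a deliberately naive sketch under stated assumptions: ToyStore is a hypothetical class, it recomputes every derived fact from scratch after each change, and it applies each rule once to the explicit facts only. RDFox instead maintains its materialisation incrementally and in parallel, which this sketch does not attempt.

```python
def rule_without_order(facts):
    """Rule 03 as a Python function over a set of (s, p, o) triples."""
    customers = {s for s, p, o in facts if (p, o) == ("a", "Customer")}
    orders = {s for s, p, o in facts if (p, o) == ("a", "Order")}
    served = {o for s, p, o in facts if p == "hasCustomer" and s in orders}
    return {(c, "a", "CustomerWithoutOrder") for c in customers - served}


class ToyStore:
    """Hypothetical store: explicit facts plus rules, derived facts recomputed."""

    def __init__(self, rules):
        self.explicit = set()
        self.rules = list(rules)
        self.derived = set()

    def _materialise(self):
        # Naive full recomputation; this is NOT RDFox's incremental algorithm.
        self.derived = set()
        for rule in self.rules:
            self.derived |= rule(self.explicit)

    def add(self, *facts):
        self.explicit.update(facts)
        self._materialise()

    def remove_rule(self, rule):
        self.rules.remove(rule)
        self._materialise()


store = ToyStore([rule_without_order])
store.add(("customer-FISSA", "a", "Customer"))
before = set(store.derived)   # customer-FISSA is a CustomerWithoutOrder

store.add(
    ("order-99999", "a", "Order"),
    ("order-99999", "hasCustomer", "customer-FISSA"),
)
after = set(store.derived)    # the derived fact is no longer justified

store.add(("customer-PARIS", "a", "Customer"))
with_rule = set(store.derived)
store.remove_rule(rule_without_order)
without_rule = set(store.derived)  # deleting the rule removes what it derived

print(before, after, with_rule, without_rule)
```

Recomputing everything is fine for a handful of triples, but it scales poorly, which is the reason incremental maintenance matters in a production engine.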

Conclusion

We started our journey with a simple demonstration of how inference rules can enrich an existing triplestore. We are planning to extend the reasoning capabilities of the Northwind sample database by adding axioms, an ontology and additional rules to answer more complex questions. Stay tuned!

Setting Up The Demo Environment

If you choose to run the queries in this demonstration, please follow the steps below to set up the demo environment.

Clone the Northwind Repository

The following GitHub repository contains the sample data, queries and rules used in this demonstration.

IMPORTANT! If you are on MacOS, you may choose to follow the instructions in the git repo above and skip the remaining steps in this section. The repo will start a persisted instance of RDFox in a Docker container with the Northwind data store already loaded and configured.
With this option, the only difference when executing the steps in the demo is in the curl commands that add rules: you will need to append the authentication -u admin:admin before executing them.

Download RDFox

Request an RDFox license here. You will need a commercial or academic email.

Download the appropriate version of RDFox onto your machine.

Copy the license file RDFox.lic to the directory where the RDFox executable is located.

Launch RDFox

In a terminal, from the same directory above, execute ./RDFox sandbox on MacOS/Linux or RDFox.exe sandbox on Windows to launch RDFox.

MacOS Only

If you get a warning message saying that RDFox is not from an identified developer, click Cancel.

MacOS warning message

Go to System Preferences > Security and Privacy > General Tab and click on Allow Anyway, as illustrated below, then run the sandbox command again.

MacOS Security & Privacy

If you get another warning message, choose Open to start the RDFox shell.

MacOS warning message

If everything goes fine, you should get the following message in the terminal:

A new server connection was opened as role ‘guest’ and stored with name ‘sc1’.

Expose RDFox REST API

In the Shell, execute the following to expose the RDFox REST API, which includes a SPARQL over HTTP endpoint.

endpoint start

MacOS Only

If you get the following message, choose Allow.

MacOS warning message

You should get the following message: The REST endpoint was successfully started at port number/service name 12110 with XX threads.

Warning! Do not close the terminal window, as that would stop the RDFox server. Also, all of the commands that add rules in this demo must be executed in a separate terminal window.

At this point you should be able to navigate to the RDFox web console at http://localhost:12110/console/

Create the Northwind Data Store

On the Console UI, click on + Create data store and name it “Northwind”.

Cancel the Import Content popup as we need to create a graph before importing the data.

Execute the following query on the RDFox web console to create the dataGraph where we are going to store the data and rules.

PREFIX kggraph: <http://www.mysparql.com/resource/northwind/graph/>
CREATE GRAPH kggraph:dataGraph

Import The Northwind Sample Data

From the … menu, choose Add content.

RDFox web console — add content

Select dataGraph from the drop-down and then select the northwind.nt file under the northwind/data directory in your local branch, or download it from the GitHub repo.

RDFox web console — upload file

You should get a confirmation message saying that 30780 facts were added to the data store.

Now, go to the beginning of this article for the instructions on how to create rules and run the SPARQL queries.

Once you are done with this demonstration, you can stop the RDFox Server by executing the command quit in the original terminal window.

References:

Northwind SQL vs SPARQL

Exploring an RDF Graph Database

Datalog Basics and RDFox

The Do's and Don'ts of Rule and Query Writing

Finding Patterns with Rules

Team and Resources

The team behind Oxford Semantic Technologies started working on RDFox in 2011 at the Computer Science Department of the University of Oxford with the conviction that flexible and high-performance reasoning was a possibility for data-intensive applications without jeopardising the correctness of the results. RDFox is the first market-ready knowledge graph designed from the ground up with reasoning in mind. Oxford Semantic Technologies is a spin-out of the University of Oxford and is backed by leading investors including Samsung Venture Investment Corporation (SVIC), Oxford Sciences Enterprises (OSE) and Oxford University Innovation (OUI).