How to Debug Reasoning

This tutorial summarizes the most common questions about reasoning in Stardog. Note that Stardog includes both logical and statistical reasoning capabilities. This tutorial focuses only on logical reasoning.

Page Contents

Background
A Motivating Example
The Usual Suspects

Background

Chances are pretty good you know what reasoning is, what it does, and why it’s important. But just in case, let’s take a very quick refresher on what reasoning is:

Reasoning is a declarative way to derive new nodes and edges in a graph, to specify integrity constraints on a (possibly distributed) graph, or to do both at the same time.
Reasoning can replace arbitrary amounts of complex code and queries.
Reasoning transforms a graph data structure into a knowledge graph.
Reasoning in Stardog is fully integrated into SPARQL query evaluation.

A Motivating Example

Take, for example, the following trivial graph in the Turtle format:

:Square rdfs:subClassOf :Shape .    # This says that all Squares are Shapes
:MySquare a :Square .

Any plain graph database can store these 3 nodes (:Square, :Shape, and :MySquare) and 2 edges (rdfs:subClassOf and a).

Reasoning is the software service that lets Stardog infer some new (implicit, i.e., unstated) information from the known (i.e., explicit) data. In this case, the inference is just one new edge between two existing nodes:

:MySquare a :Shape .

Stardog doesn’t store inferences by default. Rather, it infers them on the fly as needed when answering queries. That matters when the different parts of this graph are distributed over enterprise data silos and need to stay there.
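To make query-time inference concrete, here is a minimal Python sketch of the subclass inference from the example above. It is a toy model under simplifying assumptions, not Stardog's implementation; the function names superclasses and infer_types are hypothetical and exist only for illustration.

```python
# Toy model of rdfs:subClassOf inference; not Stardog's implementation.
SUBCLASS_OF = "rdfs:subClassOf"
TYPE = "a"

# The two explicit edges from the Turtle example.
explicit = {
    (":Square", SUBCLASS_OF, ":Shape"),
    (":MySquare", TYPE, ":Square"),
}


def superclasses(cls, triples):
    """Return every class reachable from cls via rdfs:subClassOf edges."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == SUBCLASS_OF and o not in found:
                found.add(o)
                stack.append(o)
    return found


def infer_types(triples):
    """Compute implicit type edges on demand, without storing them."""
    inferred = set()
    for s, p, o in triples:
        if p == TYPE:
            for parent in superclasses(o, triples):
                edge = (s, TYPE, parent)
                if edge not in triples:
                    inferred.add(edge)
    return inferred


print(infer_types(explicit))  # {(':MySquare', 'a', ':Shape')}
```

The explicit set is never modified, which mirrors the idea that inferences are computed when a query asks for them rather than written back into the graph.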

The Usual Suspects

Certain types of questions come up repeatedly in the Stardog Community forums from users who aren’t seeing expected query results. These often come down to a reasoning setting or a misunderstanding of how reasoning works. Here are a few of the most common questions we have seen.

“I’m not seeing any results!”

The simplest problem to fix, and by extension the easiest thing to check when things aren’t working, is the case where reasoning isn’t enabled at all. “But wait a minute,” I hear you ask, “If reasoning is so important, why would it ever NOT be enabled?”

One answer is that it can be expensive. But the other answer is that you should use reasoning in a way that makes sense for your use case. Stardog does not materialize (i.e., explicitly calculate and store) inferences, instead finding them as needed at query time. Therefore, if a query doesn’t need reasoning to get the required results, it makes no sense to pay the cost of computing inferences for it.

If your query results contain only explicit information, reasoning may simply be disabled. The method of enabling reasoning for your queries depends on how they’re being run:

CLI: Ensure that stardog query execute is passed either the -r or --reasoning flag.
Java: When creating a Connection object via ConnectionConfiguration, ensure that the reasoning() method is called with a true value.
HTTP: Ensure that the reasoning query parameter is present in your URL, or form body, with the value true.
Stardog Studio: Ensure that the Reasoning toggle in the Workspace is set to ON.
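
For the HTTP case, the following Python sketch builds a query URL that carries the reasoning parameter with the value true. The server address http://localhost:5820, the database name myDb, the /query path, and the helper name build_query_url are assumptions made for illustration; check them against your own deployment before relying on them. The sketch only constructs the URL and does not contact a server.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_query_url(server, database, sparql, reasoning=True):
    """Build a SPARQL query URL with an explicit reasoning parameter."""
    params = {
        "query": sparql,
        # Sent as the literal string "true" or "false".
        "reasoning": "true" if reasoning else "false",
    }
    return f"{server}/{database}/query?{urlencode(params)}"


# Hypothetical server, database, and query.
sparql = "SELECT ?s WHERE { ?s a :Shape }"
url = build_query_url("http://localhost:5820", "myDb", sparql)

# Decode the query string again to confirm what would be sent.
sent = parse_qs(urlsplit(url).query)
print(sent["reasoning"])  # ['true']
```

If the decoded parameters show reasoning as false, or the parameter is missing entirely, the query would return only explicit data.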

If this works, then congratulations! If not, read on.

“I’m not seeing the right results!”

Okay, so reasoning is enabled, but what if you’re still not seeing the results that you know you should be seeing? It could be related to reasoning level or to the schema location.

Reasoning Level

You may not see expected results because the wrong reasoning level is being used. A profile or “reasoning level” is a bundle or family of data modeling features (called, for historical reasons, “axioms”) that are often used together. Some levels are more expressive (and thus more expensive) than others, so you want to choose the cheapest one that works. Stardog supports the following reasoning levels: RDFS, QL, RL, EL, DL, SL, and NONE. If you are missing results that you know should be there, check the stardog.log file.

Often when we receive issues like this, the log file will contain lines that look like this:

Not a valid SL axiom: Range(...).

Typically this means that the reasoning level is set to SL (the default), but the user has included OWL DL axioms, which are not covered by SL. When stardog.log shows lines like this, the implication is that the axiom(s) in question will be ignored completely, which is often the reason for the “missing” results, as they depended on the axiom.
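
Checking the log for these messages can be automated. The Python sketch below scans log lines for the message quoted above and reports which axioms were ignored. The regular expression is derived only from that one quoted line, so the exact format in your stardog.log may differ; the helper name find_ignored_axioms and the sample lines are hypothetical.

```python
import re

# Pattern based on the single log line quoted in this tutorial.
# Assumption: real log lines contain the same phrase somewhere in the line.
INVALID_AXIOM = re.compile(r"Not a valid (\w+) axiom: (.+)")


def find_ignored_axioms(log_lines):
    """Return (level, axiom) pairs for every ignored-axiom message."""
    hits = []
    for line in log_lines:
        match = INVALID_AXIOM.search(line)
        if match:
            hits.append((match.group(1), match.group(2).strip()))
    return hits


# Illustrative input: one unrelated line plus the message from above.
sample = [
    "an unrelated log line",
    "Not a valid SL axiom: Range(...).",
]
print(find_ignored_axioms(sample))  # [('SL', 'Range(...).')]
```

To scan a real log, pass an open file object instead of the sample list, for example find_ignored_axioms(open(path)) where path points at your stardog.log.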

By default, Stardog uses the SL level because it’s the most expressive level that can be computed efficiently over large graphs. You can use the reasoning schema CLI command to see which axioms are included during reasoning.

The easiest solution may be to enable the database configuration option reasoning.approximate which will, when possible, split troublesome axiom(s) into two and use the axiom that fits into SL level. You can also try using Stardog Rules. Then you can look at rule gotchas to see if there are any issues with how you’re using rules. If you have a very small amount of data, you may try using the DL reasoning level.

Schema Location

Another cause of missing results is connected to where the Stardog schema is in the graph. The schema here is just the set of axioms you want to use in reasoning. But, as mentioned above, those can be distributed (physically), so Stardog will work hard to find them.

Practically, this means that Stardog needs to know which named graph(s) contain the schema. So you may need to check that the reasoning.named.graphs property in stardog.properties is set to the correct value.
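
The effect of a wrong schema location can be illustrated with a toy model. In the Python sketch below, subclass axioms are used only if they live in one of the configured schema graphs. This is a simplified illustration and not how Stardog resolves schema graphs; the graph names :schemaGraph, :dataGraph, and :otherGraph and the function instances_of are hypothetical.

```python
SUBCLASS_OF = "rdfs:subClassOf"
TYPE = "a"

# Quads: (graph, subject, predicate, object). Graph names are hypothetical.
quads = [
    (":schemaGraph", ":Square", SUBCLASS_OF, ":Shape"),
    (":dataGraph", ":MySquare", TYPE, ":Square"),
]


def instances_of(cls, quads, schema_graphs):
    """Return subjects typed as cls, using only axioms from schema_graphs."""
    # Only subclass axioms stored in the configured schema graphs are visible.
    axioms = {
        (s, o)
        for g, s, p, o in quads
        if p == SUBCLASS_OF and g in schema_graphs
    }
    # Collect cls together with all of its direct and indirect subclasses.
    accepted = {cls}
    changed = True
    while changed:
        changed = False
        for sub, sup in axioms:
            if sup in accepted and sub not in accepted:
                accepted.add(sub)
                changed = True
    return {s for g, s, p, o in quads if p == TYPE and o in accepted}


# Schema graph configured correctly: the inference is found.
print(instances_of(":Shape", quads, schema_graphs={":schemaGraph"}))  # {':MySquare'}
# Schema graph misconfigured: the axiom is never seen, so no result.
print(instances_of(":Shape", quads, schema_graphs={":otherGraph"}))  # set()
```

The data is identical in both calls; only the schema location setting changes, which is why this kind of problem looks like missing results rather than an error.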

Our documentation has a detailed discussion of other reasons you might not be seeing the results you want. It’s a good read.

“My Schema NEEDS axiom X and axiom Y!”

Maybe? But maybe not. Stardog Rules are very powerful and are only getting easier to write. Think of Stardog Rules as Datalog in the graph because Stardog Rules are (basically) Datalog in the graph.
