How to Debug Reasoning
This tutorial summarizes the most common questions about reasoning in Stardog. Note that Stardog includes both logical and statistical reasoning capabilities. This tutorial focuses only on logical reasoning.
Background
Chances are pretty good you know what reasoning is, what it does, and why it’s important. But just in case, let’s take a very quick refresher on what reasoning is:
- Reasoning is a declarative way to derive new nodes and edges in a graph, to specify integrity constraints on a (possibly distributed) graph, or to do both at the same time.
- Reasoning can replace arbitrary amounts of complex code and queries.
- Reasoning transforms a graph data structure into a knowledge graph.
- Reasoning in Stardog is fully integrated into SPARQL query evaluation.
A Motivating Example
Take, for example, the following trivial graph in the Turtle format:
:Square rdfs:subClassOf :Shape . # This says that All Squares are Shapes
:MySquare a :Square .
Any plain graph database can store these 3 nodes (:Square, :Shape, and :MySquare) and 2 edges (rdfs:subClassOf and a).
Reasoning is the software service that lets Stardog infer some new (implicit, i.e., unstated) information from the known (i.e., explicit) data. In this case, the inference is just one new edge between two existing nodes:
:MySquare a :Shape .
Stardog doesn’t store inferences by default. Rather, Stardog infers them on the fly as needed when answering queries. That matters when different parts of the graph are distributed over enterprise data silos and need to stay there.
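To make the idea concrete, here is a minimal Python sketch of the inference described above. It is a toy model and not how Stardog is implemented: the triples are held in a plain set, and the helper names `superclasses` and `types_of` are made up for this illustration. The point is that the `:MySquare a :Shape` edge is computed when the question is asked and never written back to the data.

```python
# Explicit triples from the example above, as (subject, predicate, object).
EXPLICIT = {
    (":Square", "rdfs:subClassOf", ":Shape"),
    (":MySquare", "a", ":Square"),
}


def superclasses(cls, triples):
    """Return cls plus every class reachable from it via rdfs:subClassOf."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(
            o for s, p, o in triples
            if s == current and p == "rdfs:subClassOf"
        )
    return seen


def types_of(node, triples):
    """Answer "what is node a?" with inferred types included.

    The inferred types are computed on demand; the triple set is not modified.
    """
    result = set()
    for s, p, o in triples:
        if s == node and p == "a":
            result |= superclasses(o, triples)
    return result


print(sorted(types_of(":MySquare", EXPLICIT)))  # [':Shape', ':Square']
```

After the call, `EXPLICIT` still contains only the two original triples, which mirrors the "infer at query time, don't store" behavior described above.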
The Usual Suspects
You can find repeated types of questions in the Stardog Community forums where users aren’t seeing expected query results. These often come down to a reasoning setting or misunderstanding of how it works. Here are a few of the most common questions we have seen.
“I’m not seeing any results!”
The simplest problem to fix, and by extension the easiest thing to check when things aren’t working, is the case where reasoning isn’t enabled at all. “But wait a minute,” I hear you ask, “If reasoning is so important, why would it ever NOT be enabled?”
One answer is that it can be expensive. But the other answer is that you should use reasoning in a way that makes sense for your use case. Stardog does not materialize (i.e., explicitly calculate and store) inferences, instead finding them as needed at query time. Therefore, if a query doesn’t need reasoning to get the required results, it makes no sense to make everyone else pay the cost of computing inferences.
If your query results contain only explicit information, reasoning is probably not enabled. How you enable reasoning for your queries depends on how they’re being run:
- CLI: Ensure that stardog query execute is passed either the -r or --reasoning flag.
- Java: When creating a Connection object via ConnectionConfiguration, ensure that the reasoning() method is called with a true value.
- HTTP: Ensure that the reasoning query parameter is present in your URL, or form body, with the value true.
- Stardog Studio: Ensure that the Reasoning toggle in the Workspace is set to ON.
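For the HTTP case, a small Python sketch of building the request URL may help. The only detail taken from this tutorial is the `reasoning` query parameter with the value `true`. The server address `http://localhost:5820`, the database name `myDb`, the `/{database}/query` path, and the helper name `query_url` are assumptions for illustration, so check them against your own deployment.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def query_url(server, database, sparql, reasoning=True):
    """Build a SPARQL query URL, adding reasoning=true when requested.

    The "/{database}/query" path is an assumption, not taken from this tutorial.
    """
    params = {"query": sparql}
    if reasoning:
        params["reasoning"] = "true"
    return f"{server.rstrip('/')}/{database}/query?{urlencode(params)}"


# Hypothetical server and database names.
url = query_url(
    "http://localhost:5820",
    "myDb",
    "SELECT ?type { :MySquare a ?type }",
)

# Quick self-check: the reasoning parameter really is in the URL.
print(parse_qs(urlsplit(url).query)["reasoning"])  # ['true']
```

If the `reasoning` key is missing from the parsed query string, the request runs without reasoning and only explicit results come back.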
If this works, then congratulations! If not, read on.
“I’m not seeing the right results!”
Okay, so reasoning is enabled, but what if you’re still not seeing the results that you know you should be seeing? It could be related to reasoning level or to the schema location.
Reasoning Level
You may not see expected results because the wrong reasoning level is being used. A profile or “reasoning level” is a bundle or family of data modeling features (called, for historical reasons, “axioms”) that are often used together. Some levels are more expressive (and thus more expensive) than others, so you want to choose the cheapest one that works. Stardog supports the following reasoning levels: RDFS, QL, RL, EL, DL, SL, and NONE. If you are missing results that you know should be there, check the stardog.log file.
Often when we receive issues like this, the log file will contain lines that look like this:
Not a valid SL axiom: Range(...).
Typically this means that the reasoning level is set to SL (the default), but the user has included OWL DL axioms, which are not covered by SL. When stardog.log shows lines like this, the axiom(s) in question are ignored completely, which is often the reason for the “missing” results: they depended on the ignored axiom.
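If the log is long, a short script can summarize which axiom types were rejected. The sketch below is a hedged example: the regular expression is based only on the single message shown above, so the exact wording in your log may differ, and the helper name `rejected_axioms` is made up for this illustration.

```python
import re
from collections import Counter

# Matches messages shaped like "Not a valid SL axiom: Range(...)."
# The pattern is inferred from the one example line in this tutorial.
PATTERN = re.compile(r"Not a valid (\w+) axiom: (\w+)\(")


def rejected_axioms(log_lines):
    """Count rejected axioms per (reasoning level, axiom type)."""
    counts = Counter()
    for line in log_lines:
        match = PATTERN.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


sample = [
    "Not a valid SL axiom: Range(...).",
    "some unrelated log line",
]
print(rejected_axioms(sample)[("SL", "Range")])  # 1

# To scan a real log, pass an open file instead of the sample list, e.g.:
#   with open("stardog.log") as log:   # path depends on your installation
#       print(rejected_axioms(log))
```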
By default Stardog uses the SL level because it’s the most expressive level that can be computed efficiently over large graphs. You can use the reasoning schema CLI command to see which axioms are included during reasoning.
The easiest solution may be to enable the database configuration option reasoning.approximate, which will, when possible, split troublesome axiom(s) into two and use the axiom that fits into the SL level. You can also try using Stardog Rules; if you do, look at the rule gotchas to see if there are any issues with how you’re using rules. If you have a very small amount of data, you may try using the DL reasoning level.
Schema Location
Another cause of missing results is connected to where the Stardog schema lives in the graph. The schema here is just the set of axioms you want to use in reasoning. But, as mentioned above, those can be distributed (physically), so Stardog will work hard to find them.
Practically this means that Stardog needs to know which named graph(s) contain the schema. So you may need to check that the reasoning.named.graphs property in stardog.properties is set to the correct value.
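A toy Python model shows why a wrong schema location produces "missing" results. This is only an illustration under stated assumptions, not Stardog's implementation: the named graphs `:schemaGraph`, `:dataGraph`, and `:otherGraph` are hypothetical, as is the helper `inferred_types`. Subclass axioms are used only if they sit in one of the graphs the reasoner was told to treat as schema.

```python
# Quads: (subject, predicate, object, named graph). Graph names are hypothetical.
QUADS = [
    (":Square", "rdfs:subClassOf", ":Shape", ":schemaGraph"),
    (":MySquare", "a", ":Square", ":dataGraph"),
]


def inferred_types(node, quads, schema_graphs):
    """Return the explicit and inferred types of node.

    Only rdfs:subClassOf axioms found in schema_graphs are considered.
    """
    axioms = {
        (s, o) for s, p, o, g in quads
        if p == "rdfs:subClassOf" and g in schema_graphs
    }
    types = {o for s, p, o, g in quads if s == node and p == "a"}
    changed = True
    while changed:  # apply the axioms until no new type is added
        changed = False
        for sub, sup in axioms:
            if sub in types and sup not in types:
                types.add(sup)
                changed = True
    return types


# Schema graph configured correctly: the inference is found.
print(sorted(inferred_types(":MySquare", QUADS, {":schemaGraph"})))
# [':Shape', ':Square']

# Schema graph misconfigured: the axiom is never seen, so :Shape is missing.
print(sorted(inferred_types(":MySquare", QUADS, {":otherGraph"})))
# [':Square']
```

The data is identical in both calls; only the configured schema location changes the answer.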
Our documentation has a detailed discussion of other reasons you might not be seeing the results you want. It’s a good read.
“My Schema NEEDS axiom X and axiom Y!”
Maybe? But maybe not. Stardog Rules are very powerful and are only getting easier to write. Think of Stardog Rules as Datalog in the graph because Stardog Rules are (basically) Datalog in the graph.