Java
This page discusses using Java to interact with Stardog.
Overview
Stardog’s core API, SNARL (Stardog Native API for the RDF Language), is the preferred way to interact with Stardog. Under the hood, those APIs are just using our HTTP API, and thus all of Stardog’s features are available via Java.
Documentation
See the javadocs for SNARL’s documentation. We often just refer to this as Stardog’s Java API.
API Deprecation
Methods and classes in SNARL API that are marked with the com.google.common.annotations.Beta
are subject to change or removal in any release. We are using this annotation to denote new or experimental features, the behavior or signature of which may change significantly before it’s out of “beta”.
We will otherwise attempt to keep the public APIs as stable as possible, and methods will be marked with the standard @Deprecated
annotation for at least one full revision cycle before their removal from the SNARL API. See Compatibility Policies for more information about API stability.
Anything marked @VisibleForTesting
is just that, visible as a consequence of test case requirements; don’t write any important code that depends on functions with this annotation.
Add Dependencies with Maven
We support Maven for both client and server JARs. The following table summarizes the type of dependencies that you will have to include in your project, depending on whether the project is a Stardog client, server, or both. Additionally, you can also include the Jena or RDF4J bindings if you would like to use them in your project. The Stardog dependency list below follows the Gradle convention and is of the form groupId:artifactId:VERSION.
Name | Stardog Dependency | Type |
---|---|---|
client | com.complexible.stardog:client-http:VERSION | pom |
server | com.complexible.stardog:server:VERSION | pom |
rdf4j | com.complexible.stardog.rdf4j:stardog-rdf4j:VERSION | jar |
jena | com.complexible.stardog.jena:stardog-jena:VERSION | jar |
You can see an example of their usage on Github.
If you’re using Maven as your build tool, then the client-http and server dependencies require that you specify the packaging type as POM (pom):
<dependency>
<groupId>com.complexible.stardog</groupId>
<artifactId>client-http</artifactId>
<version>$VERSION</version>
<type>pom</type>
</dependency>
Though Gradle may work without it, it is best practice to specify the dependency type there as well:
compile "com.complexible.stardog:client-http:${VERSION}@pom"
Public Maven Repo
The public Maven repository for the current Stardog release is https://maven.stardog.com. To get started, you need to add the following endpoint to your preferred build system, e.g. in your build script:
Gradle
repositories {
maven {
url "https://maven.stardog.com"
}
}
Maven
<repositories>
<repository>
<id>stardog-public</id>
<url>https://maven.stardog.com</url>
</repository>
</repositories>
Private Maven Repo
For access to nightly builds, priority bug fixes, priority feature access, hot fixes, etc., Enterprise Premium Support customers have access to their own private Maven repository that is linked to our internal development repository. You can either proxy this private repository from your preferred Maven repository manager—e.g. Artifactory or Nexus—or add the private endpoint directly to your build script.
This feature or service is available to Stardog customers. For information about licensing, please contact us.
Connecting to the Private Repo
Similar to our public Maven repo, we will provide you with a private URL and credentials to your private repo, which you will refer to in your build script like this:
Gradle
repositories {
maven {
url $yourPrivateUrl
credentials {
username $yourUsername
password $yourPassword
}
}
}
Maven
<repositories>
<repository>
<id>stardog-private</id>
<url>$yourPrivateUrl</url>
</repository>
</repositories>
Then in your ~/.m2/settings.xml
add:
<settings>
<servers>
<server>
<id>stardog-private</id>
<username>$yourUsername</username>
<password>$yourPassword</password>
</server>
</servers>
</settings>
Examples
We have many examples in our Github repo, but here are a few of the core examples to get you started:
- SNARL Overview - This examples shows how to use both the administrative and client APIs to perform some basic operations.
- RDF4J - A basic example of using Stardog via the RDF4J API.
- Jena bindings - Example of how to use the Jena integration with Stardog.
- Reasoning - A small example program illustrating how to access Stardog’s reasoning capabilities.
- SNARL and Connection Pooling - A simple example to show how to setup and use ConnectionPools with Stardog.
- SNARL and Searching - A short example illustrating the use of the full text search capabilities in Stardog via the SNARL API.
Most notably in those examples, you will see how to use not only Stardog’s native API SNARL, but also how to use both Jena and RDF4J, which are the two most common RDF-based libraries in the Java world. We offer some commentary on the interesting parts of these examples below.
If you use Spring, we have a specific library for you, which is outlined in the Spring section. If not, but you live in the Enterprise Java world, we provide Pinto, which is similar to Jackson, but for Stardog + Graph.
Finally, if you’re just getting started, here’s how to get the Stardog libraries into your local development environment so you can start building. You will also want to check out how you can extend Stardog.
Java (SNARL) API Basics
Create a Database
You can create an empty database with default configuration options using the following lines of code for a server running locally:
try (AdminConnection aAdminConnection = AdminConnectionConfiguration.toServer("http://localhost:5820").credentials("admin", "admin").connect()) {
aAdminConnection.newDatabase("testConnectionAPI").create();
}
It’s crucially important to always clean up connections to the database by calling AdminConnection#close(). Using try-with-resources where possible is a good practice.
The newDatabase function returns a DatabaseBuilder object which you can use to configure the options of the database you’d like to create. The create function takes the list of files to bulk load into the database when you create it and returns a valid ConnectionConfiguration which can be used to create new Connections to your database.
try (AdminConnection aAdminConnection = AdminConnectionConfiguration.toServer("http://localhost:5820").credentials("admin", "admin").connect()) {
aAdminConnection.newDatabase("test")
.set(SearchOptions.SEARCHABLE, true)
.create();
}
This illustrates how to create a temporary memory database named test
which supports full text search via Searching.
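The create call can also take the files to bulk load when the database is created, as mentioned above. A minimal sketch, assuming the Path-based create overload and a local data/test.ttl file (the same file used in the examples below):
try (AdminConnection aAdminConnection = AdminConnectionConfiguration.toServer("http://localhost:5820").credentials("admin", "admin").connect()) {
    // files passed to create are bulk loaded into the new database as part of its creation
    aAdminConnection.newDatabase("testBulkLoad").create(Paths.get("data/test.ttl"));
}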
Creating a Connection String
As you can see, the ConnectionConfiguration class in the com.complexible.stardog.api package is where the initial action takes place.
// to use the basic authentication
Connection aConn = ConnectionConfiguration
.to("exampleDB") // the name of the db to connect to
.server("http://localhost:5820") // the URL of the server to connect to
.credentials("admin", "admin") // credentials to use while connecting
.connect();
// to use the bearer authentication
Connection aConn = ConnectionConfiguration
.to("exampleDB") // the name of the db to connect to
.server("http://localhost:5820") // the URL of the server to connect to
.set(LoginConnectionConfiguration.IS_TOKEN, true) // set token authentication to true
.credentials(username, authToken) // credentials to use while connecting
.connect();
The to
method takes a database name as a string; and then connect
connects to the database using all specified properties on the configuration. This class and its constructor methods are used for all of Stardog’s Java APIs: SNARL native Stardog API, RDF4J, Jena, as well as HTTP. In the latter cases, you must also call server
and pass it a valid URL to the Stardog server using HTTP.
Without the call to server
, ConnectionConfiguration
will attempt to connect to an embedded version of the Stardog server running within the same JVM. This functionality has been deprecated, and the Stardog server should run externally.
Whether using SNARL, RDF4J, or Jena, most, if not all, Stardog Java code will use ConnectionConfiguration
to get a handle on a Stardog database and, after getting that handle, can use the appropriate API.
Adding Data
aConn.begin();
aConn.add()
.io()
.file(Paths.get("data/test.ttl"));
Collection<Statement> aGraph = Collections.singleton(
Values.statement(Values.iri("urn:subj"),
Values.iri("urn:pred"),
Values.iri("urn:obj")));
Resource aContext = Values.iri("urn:test:context");
aConn.add().graph(aGraph, aContext);
aConn.commit();
You must always enclose changes to a database within a transaction: begin, then commit or rollback. Changes are local until the transaction is committed or until you try to perform a query operation to inspect the state of the database within the transaction.
By default, RDF added will go into the default context unless specified otherwise. As shown, you can use Adder
directly to add statements and graphs to the database; and if you want to add data from a file or input stream, you use the io
, format
, and stream
chain of method invocations.
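For example, a minimal sketch of that io/format/stream chain, reading the same Turtle file from an InputStream (RDFFormats.TURTLE, used later on this page, is assumed to be the right format constant here):
aConn.begin();
// read Turtle data from an InputStream, giving the format explicitly
aConn.add()
     .io()
     .format(RDFFormats.TURTLE)
     .stream(new FileInputStream("data/test.ttl"));
aConn.commit();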
Removing Data
// first start a transaction
aConn.begin();
aConn.remove()
.io()
.file(Paths.get("data/remove_data.nt"));
// and commit the change
aConn.commit();
Let’s look at removing data; in the example above, you can see that file or stream-based removal is symmetric to file or stream-based addition, i.e., calling remove
in an io
chain with a file or stream
call.
Parameterized SPARQL Queries
// A SNARL connection provides parameterized queries which you can use to easily
// build and execute SPARQL queries against the database. First, let's create a
// simple query that will get all of the statements in the database.
SelectQuery aQuery = aConn.select("select * where { ?s ?p ?o }");
// But getting *all* the statements is kind of silly, so let's actually specify a limit, we only want 10 results.
aQuery.limit(10);
// We can go ahead and execute this query which will give us a result set. Once we have our result set, we can do
// something interesting with the results.
// NOTE: We use try-with-resources here to ensure that our results sets are always closed.
try(SelectQueryResult aResult = aQuery.execute()) {
System.out.println("The first ten results...");
QueryResultWriters.write(aResult, System.out, TextTableQueryResultWriter.FORMAT);
}
// Query objects are easily parameterized; so we can bind the "s" variable in the previous query with a specific value.
// Queries should be managed via the parameterized methods, rather than created by concatenating strings together,
// because that is not only more readable, it helps avoid SPARQL injection attacks.
IRI aIRI = Values.iri("http://localhost/publications/articles/Journal1/1940/Article1");
aQuery.parameter("s", aIRI);
// Now that we've bound 's' to a specific value, we're not going to pull down the entire database with our query
// so we can go ahead and remove the limit and get all the results.
aQuery.limit(SelectQuery.NO_LIMIT);
// We've made our modifications, so we can re-run the query to get a new result set and see the difference in the results.
try(SelectQueryResult aResult = aQuery.execute()) {
System.out.println("\nNow a particular slice...");
QueryResultWriters.write(aResult, System.out, TextTableQueryResultWriter.FORMAT);
}
The Java API also lets us parameterize SPARQL queries. We can make a Query
object by passing a SPARQL query in the constructor.
Next, let’s set a limit for the results: aQuery.limit(10); or if we want no limit, aQuery.limit(SelectQuery.NO_LIMIT). By default, there is no limit imposed on the query object; we’ll use whatever is specified in the query itself. You can use limit to override any limit specified in the query; however, specifying NO_LIMIT will not remove a limit specified in the query, it will only remove any limit override you’ve set, restoring the default of using whatever is in the query.
We can execute that query with execute() and iterate over the results. We can also rebind the "?s" variable easily: aQuery.parameter("s", aIRI), which will work for all instances of "?s" in any BGP in the query; you can specify null to remove the binding.
Query objects are re-usable, so you can create one from your original query string and alter bindings, limit, and offset in any way you see fit and re-execute the query to get the updated results.
We strongly recommend the use of the Java API’s parameterized queries over concatenating strings together in order to build your SPARQL query. This latter approach opens up the possibility for SPARQL injection attacks unless you are very careful in scrubbing your input.
Getter Interface
aConn.get()
.subject(aURI)
.statements()
.forEach(System.out::println);
// `Getter` objects are parameterizable just like `Query`, so you can easily modify and re-use them to change
// what slice of the database you'll retrieve.
Getter aGetter = aConn.get();
// We created a new `Getter`, if we iterated over its results now, we'd iterate over the whole database; not ideal. So
// we will bind the predicate to `rdf:type` and now if we call any of the iteration methods on the `Getter` we'd only
// pull back statements whose predicate is `rdf:type`
aGetter.predicate(RDF.TYPE);
// We can also bind the subject and get a specific type statement, in this case, we'll get all the type triples
// for *this* individual. In our example, that'll be a single triple.
aGetter.subject(aURI);
System.out.println("\nJust a single statement now...");
aGetter.statements()
.forEach(System.out::println);
// `Getter` objects are stateful, so we can remove the filter on the predicate position by setting it back to null.
aGetter.predicate(null);
// Subject is still bound to the value of `aURI`, so we can grab all the statements where `aURI` is the subject,
// effectively performing a basic describe query, and write them out as a graph.
Stream<Statement> aStatements = aGetter.statements();
System.out.println("\nFinally, the same results as earlier, but as a graph...");
RDFWriters.write(System.out, RDFFormats.TURTLE, aStatements.collect(Collectors.toList()));
The Java API also supports some sugar for the classic statement-level interactions. In the first lines of the snippet above, we ask the Stardog connection for all statements with aURI
in the subject position and iterate over them. You can also parameterize Getter
’s by binding different positions of the Getter
, which acts like a kind of RDF statement filter, and then iterate as usual.
Closing the results when you’re finished with them is important for Stardog databases to avoid memory leaks. If you need to materialize the results as a graph, you can do that by calling graph
.
The snippet doesn’t show object
or context
parameters on a Getter
, but those work, too, in the obvious way.
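For illustration, here is a hedged sketch of binding those positions; the object and context setter names are assumed to mirror the subject and predicate setters shown above:
// Sketch (setter names assumed): restrict a Getter by object value and by named graph (context).
Getter aFilteredGetter = aConn.get();
aFilteredGetter.object(Values.iri("urn:obj"));            // only statements whose object is urn:obj
aFilteredGetter.context(Values.iri("urn:test:context"));  // only statements in this named graph
aFilteredGetter.statements().forEach(System.out::println);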
Reasoning
Stardog supports query-time reasoning using a query rewriting technique. In short, when reasoning is requested, a query is automatically rewritten to n
queries, which are then executed. As we discuss below in Connection Pooling, reasoning is enabled at the Connection
layer and then any queries executed over that connection are executed with reasoning enabled; you don’t need to do anything up front when you create your database if you want to use reasoning.
ReasoningConnection aReasoningConn = ConnectionConfiguration
.to("reasoningExampleTest")
.credentials("admin", "admin")
.reasoning(true)
.connect()
.as(ReasoningConnection.class);
In this code example, you can see that it’s trivial to enable reasoning for a Connection
: simply call reasoning
with true
passed in.
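Once you have a reasoning connection, queries run over it use reasoning automatically; a minimal sketch (the query and class IRI are illustrative):
// any query executed over aReasoningConn is answered with reasoning enabled
try (SelectQueryResult aResult = aReasoningConn.select("select ?s where { ?s a <urn:SomeClass> }").execute()) {
    QueryResultWriters.write(aResult, System.out, TextTableQueryResultWriter.FORMAT);
}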
Search
Stardog’s search system can be used from Java. The fluent Java API for searching in SNARL looks a lot like the other search interfaces: We create a Searcher
instance with a fluent constructor: limit
sets a limit on the results; query
contains the search query, and threshold
sets a minimum threshold for the results.
// Let's create a Searcher that we can use to run some full text searches over the database.
// Here we will specify that we only want results over a score of `0.5`, and no more than `50` results
// for things that match the search term `mac`. Stardog's full text search is backed by [Lucene](http://lucene.apache.org)
// so you can use the full Lucene search syntax in your queries.
Searcher aSearch = aSearchConn.search()
.limit(50)
.query("mac")
.threshold(0.5);
// We can run the search and then iterate over the results
SearchResults aSearchResults = aSearch.search();
try (CloseableIterator<SearchResult> resultIt = aSearchResults.iterator()) {
System.out.println("\nAPI results: ");
while (resultIt.hasNext()) {
SearchResult aHit = resultIt.next();
System.out.println(aHit.getHit() + " with a score of: " + aHit.getScore());
}
}
// The `Searcher` can be re-used if we want to find the next set of results. We already found the
// first fifty, so lets grab the next page.
aSearch.offset(50);
aSearchResults = aSearch.search();
Then we call the search
method of our Searcher
instance and iterate over the results i.e., SearchResults
. Last, we can use offset
on an existing Searcher
to grab another page of results.
Client-Server Stardog
Stardog generally tries to be as lazy as possible; but in client-server mode, since state is maintained on the client, there are fewer chances to be lazy and more interactions with the server.
In client-server mode, everything triggers a round trip with these exceptions:
- closing a connection outside a transaction
- any parameterizations or other modifications of a Query or Getter instance
- any database state mutations in a transaction that don’t need to be immediately visible to the transaction; that is, changes are sent to the server only when they are required, on commit, or on any query or read operation that needs to have the accurate up-to-date state of the data within the transaction.
Connection Client Buffers
The client implements data buffering before sending a set of changes to the server. Increasing the client-side buffer may improve overall throughput in some workloads. The client buffer size defaults to 100,000 statements. Servers with a larger number of CPUs may benefit from a larger batch size. The size may be changed by setting ConnectionConfiguration.CLIENT_BUFFER_SIZE on the ConnectionConfiguration.
Connection aConn = ConnectionConfiguration
.to("exampleDB") // the name of the db to connect to
.server("http://localhost:5820") // the URL of the server to connect to
.set(ConnectionConfiguration.CLIENT_BUFFER_SIZE, 500_000) // set client buffer size
.credentials(username, authToken) // credentials to use while connecting
.connect();
Connection Pooling
Stardog supports connection pools for SNARL Connection
objects for efficiency and programmer sanity. Here’s how they work:
// We need a configuration object for our connections, this is all the information about
// the database that we want to connect to.
ConnectionConfiguration aConnConfig = ConnectionConfiguration
.to("testConnectionPool")
.server("http://localhost:5820")
.credentials("admin", "admin");
// We want to create a pool over these objects. See the javadoc for ConnectionPoolConfig for
// more information on the options and information on the defaults.
ConnectionPoolConfig aConfig = ConnectionPoolConfig
.using(aConnConfig) // use my connection configuration to spawn new connections
.minPool(10) // the number of objects to start my pool with
.maxPool(1000) // the maximum number of objects that can be in the pool (leased or idle)
.expiration(1, TimeUnit.HOURS) // Connections can expire after being idle for 1 hr.
.blockAtCapacity(1, TimeUnit.MINUTES); // I want obtain to block for at most 1 min while trying to obtain a connection.
// now i can create my actual connection pool
ConnectionPool aPool = aConfig.create();
// if I want a connection object...
Connection aConn = aPool.obtain();
// now I can feel free to use the connection object as usual...
// and when I'm done with it, instead of closing the connection, I want to return it to the pool instead.
aPool.release(aConn);
// I could also use a try-with-resources block, as connections obtained from the pool will auto-release when `close()` is called
try (Connection anotherConn = aPool.obtain()) {
// Do more things in here and then let java release it back to the pool
}
// and when I'm done with the pool, shut it down!
aPool.shutdown();
We first create a ConnectionConfiguration describing how to connect, in this case to the testConnectionPool database. Then we set up a ConnectionPoolConfig, using its fluent API, which establishes the parameters of the pool:
Parameter | Description |
---|---|
using | Sets which ConnectionConfiguration we want to pool; this is what is used to actually create the connections. |
minPool , maxPool | Establishes min and max pooled objects; max pooled objects includes both leased and idled objects. |
expiration | Sets the idle life of objects; in this case, the pool reclaims objects idled for 1 hour. |
blockAtCapacity | Sets the max time in minutes that we’ll block waiting for an object when there aren’t any idle ones in the pool. |
Whew! Next we can create
the pool using the ConnectionPoolConfig
.
Finally, we call obtain on the ConnectionPool
when we need a new one. And when we’re done with it, we return it to the pool so it can be re-used, by calling release
(or by closing the connection, which will also release it from the pool). When we’re done, we shutdown
the pool.
Since reasoning in Stardog is enabled per Connection
, you can create two pools: one with reasoning connections, one with non-reasoning connections; and then use the one you need to have reasoning per query; never pay for more than you need.
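For example, a sketch of that two-pool pattern, reusing the configuration style from earlier (the database name and credentials are illustrative):
// one configuration without reasoning, one with reasoning(true)
ConnectionConfiguration aPlainConfig = ConnectionConfiguration
    .to("testConnectionPool")
    .server("http://localhost:5820")
    .credentials("admin", "admin");
ConnectionConfiguration aReasoningConfig = ConnectionConfiguration
    .to("testConnectionPool")
    .server("http://localhost:5820")
    .credentials("admin", "admin")
    .reasoning(true); // connections from this pool answer queries with reasoning
ConnectionPool aPlainPool = ConnectionPoolConfig.using(aPlainConfig).create();
ConnectionPool aReasoningPool = ConnectionPoolConfig.using(aReasoningConfig).create();
// obtain() from whichever pool matches the query at hand, and release() as usual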
Using RDF4J
Stardog supports the RDF4J API; thus, for the most part, using Stardog and RDF4J is not much different from using RDF4J with other RDF databases. There are, however, at least two differences worth pointing out.
Wrapping Connections with StardogRepository
// Create a RDF4J Repository from a Stardog ConnectionConfiguration. The configuration will be used
// when creating new RepositoryConnections
Repository aRepo = new StardogRepository(ConnectionConfiguration
.to("testdb")
.server("http://localhost:5820")
.credentials("admin", "admin"));
// init the repo
aRepo.initialize();
// now you can use it like a normal RDF4J Repository
RepositoryConnection aRepoConn = aRepo.getConnection();
As you can see from the code snippet, once you’ve created a ConnectionConfiguration
with all the details for connecting to a Stardog database, you can wrap that in a StardogRepository
which is a Stardog-specific implementation of the RDF4J Repository interface. At this point, you can use the resulting Repository like any other Repository implementation. Each time you call StardogRepository.getConnection
, your original ConnectionConfiguration
will be used to spawn a new connection to the database.
Autocommit
The RDF4J API sets the autoCommit mode ON by default, which means every single statement added or deleted via the Connection without an explicit begin()/commit() will incur the cost of a transaction, which is too heavyweight for most use cases. We VERY strongly recommend using explicit begin()/commit() so statements can be grouped into transactions.
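A minimal sketch of grouping RDF4J updates into a single explicit transaction, using the aRepoConn obtained above (the statement values are illustrative):
ValueFactory aVF = aRepoConn.getValueFactory();
aRepoConn.begin();
// both additions are sent as part of one transaction rather than two auto-committed ones
aRepoConn.add(aVF.createIRI("urn:subj"), aVF.createIRI("urn:pred"), aVF.createIRI("urn:obj"));
aRepoConn.add(aVF.createIRI("urn:subj2"), aVF.createIRI("urn:pred"), aVF.createIRI("urn:obj2"));
aRepoConn.commit();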
Using Jena
Stardog supports Jena by providing custom implementations of the Model and Dataset interfaces. There are two points in the Jena example to emphasize.
Init in Jena
// obtain a Jena model for the specified stardog database connection. Just creating an in-memory
// database; this is roughly equivalent to ModelFactory.createDefaultModel.
Model aModel = SDJenaFactory.createModel(aConn);
The initialization in Jena is a bit different from either SNARL or RDF4J; you can get a Jena Model instance by passing the Connection instance returned by ConnectionConfiguration to the Stardog factory, SDJenaFactory.
Add in Jena
// start a transaction before adding the data. This is not required,
// but it is faster to group the entire add into a single transaction rather
// than rely on the auto commit of the underlying stardog connection.
aModel.begin();
// read data into the model. note, this will add statement at a time.
// Bulk loading needs to be performed directly with the BulkUpdateHandler provided
// by the underlying graph, or by reading in files in RDF/XML format, which uses the
// bulk loader natively. Alternatively, you can load data into the Stardog
// database using SNARL, or via the command line client.
aModel.getReader("N3").read(aModel, new FileInputStream("data/sp2b_10k.n3"), "");
// done!
aModel.commit();
Jena also wants to add data to a Model one statement at a time, which can be less than ideal. To work around this restriction, we recommend adding data to a Model in a single Stardog transaction, which is initiated with aModel.begin. Then, to read data into the model, we recommend using RDF/XML, since that triggers the BulkUpdateHandler in Jena, or grabbing a BulkUpdateHandler directly from the underlying Jena graph.
The other options include using the Stardog CLI client to bulk load a Stardog database or to use SNARL for loading and then switch to Jena for other operations, processing, query, etc.