Converting RDFConnection.load(String graphName, Model model) into SPARQL Update notation while specifying a named graph

I am working on a code base that uses Apache Jena (3.14.0) to save triples into either Anzo or Fuseki (for local testing).

I am trying to adapt the code to support AWS Neptune - see related question.

A fellow SO user brought to my attention that Neptune does not support the SPARQL Graph Store Protocol (GSP).

The code I'm looking at persists triples with the RDFConnection.load(String graphName, Model model) notation.

My idea was to convert it to RDFConnection.update(Update update).

In other words:

myRdfConnectionInstance.load( myGraphNameString, myJenaModel )

... would become something along the lines of:

myRdfConnectionInstance.update(
    new UpdateBuilder()
        .addInsert( myGraphNameString, myJenaModel )
        .build()
);

(myGraphNameString represents a URN)

My take was that this notation would use the SPARQL Update protocol instead of GSP, thereby allowing the triples to be persisted in Neptune.

I was encouraged by the fact that, if I omitted the named-graph parameter and simply invoked .addInsert( myJenaModel ), the request was accepted by every triple store I tried.

Unfortunately, the same call parametrized with a named graph fails not only against Neptune, but also against my local Fuseki store.

The javadoc states:

Add all the statements in the model a specified graph to the insert statement.[...]

... which was confusing in terms of English, but seemed to lean towards what I wanted.

I suspect the second part of the description:

The graph object is converted by a call to makeNode().

... is where I'm messing up.
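That suspicion looks right: when addInsert receives a plain Java String as the graph argument, makeNode() can coerce it into an RDF string literal, which matches Neptune's error exactly ("urn:..."^^xsd:string appearing where a graph IRI belongs). A minimal sketch of the workaround, assuming Jena 3.14.0 and a hypothetical graph URN, is to hand addInsert a URI Node instead of a String:

```java
import org.apache.jena.arq.querybuilder.UpdateBuilder;
import org.apache.jena.graph.NodeFactory;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.update.UpdateRequest;

public class NamedGraphInsert {
    public static void main(String[] args) {
        // Hypothetical model content, for illustration only.
        Model model = ModelFactory.createDefaultModel();
        model.add(model.createResource("https://test.com/s"),
                  model.createProperty("https://test.com/p"),
                  model.createResource("https://test.com/o"));

        // Passing the graph name as a raw String lets makeNode() turn it
        // into a string literal: GRAPH "urn:..."^^xsd:string -- invalid.
        // Wrapping it in a URI node keeps it a graph IRI: GRAPH <urn:...>.
        UpdateRequest update = new UpdateBuilder()
                .addInsert(NodeFactory.createURI("urn:example:my-graph"), model)
                .buildRequest();

        // Print the generated SPARQL Update to verify the GRAPH clause.
        System.out.println(update);
    }
}
```

If the generated update shows GRAPH &lt;urn:example:my-graph&gt;, then myRdfConnectionInstance.update(update) should go over the SPARQL Update endpoint rather than GSP.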

Unfortunately I happen to be neither too familiar with triple stores in depth, nor with Jena, so I don't know where to chase next.

Questions

  • Is RDFConnection#update the right direction to convert the write notation to SPARQL update, hence preparing for compatibility with Neptune?
  • If so, what am I missing about the parametrization of the graph name?
  • Is there any additional documentation that would be relevant, aside from the APIs quoted here?
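On the first question: RDFConnection#update does go through the SPARQL Update protocol endpoint, so it sidesteps GSP. If the UpdateBuilder route keeps failing, an alternative sketch (graph name and payload are hypothetical; in practice the N-Triples text would come from serializing the Model, e.g. with RDFDataMgr) is to build the INSERT DATA string yourself and pass it to update(String), since N-Triples syntax is valid inside an INSERT DATA block:

```java
public class RawInsertData {
    // Embed already-serialized N-Triples inside a GRAPH block of INSERT DATA.
    static String insertDataFor(String graphUri, String nTriples) {
        return "INSERT DATA { GRAPH <" + graphUri + "> {\n" + nTriples + "\n} }";
    }

    public static void main(String[] args) {
        // Hypothetical payload; real code would serialize the Jena Model.
        String nt = "<https://test.com/s> <https://test.com/p> <https://test.com/o> .";
        System.out.println(insertDataFor("urn:example:my-graph", nt));
    }
}
```

The resulting string can be sent with myRdfConnectionInstance.update(updateString), which keeps the graph name inside angle brackets and out of reach of any literal conversion.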

Some error messages

The response I get from Neptune looks like (formatting added for clarity):

Http exception response 
{
"detailedMessage":"Malformed query: Illegal subject value: 
    \"urn:[my URN]\"^^<http://www.w3.org/2001/XMLSchema#string> [line 2]",
"code":"MalformedQueryException","requestId":"[some UUID]"
}

No explicit error message from Fuseki, just an HTTP 400.

The stack trace looks like:

org.apache.jena.atlas.web.HttpException: 400 - Bad Request
    at org.apache.jena.riot.web.HttpOp.exec(HttpOp.java:1091)
    at org.apache.jena.riot.web.HttpOp.execHttpPost(HttpOp.java:721)
    at org.apache.jena.riot.web.HttpOp.execHttpPost(HttpOp.java:517)
    at org.apache.jena.riot.web.HttpOp.execHttpPost(HttpOp.java:473)
    at org.apache.jena.rdfconnection.RDFConnectionRemote.lambda$updateExec$6(RDFConnectionRemote.java:324)
    at org.apache.jena.rdfconnection.RDFConnectionRemote.exec(RDFConnectionRemote.java:668)
    at org.apache.jena.rdfconnection.RDFConnectionRemote.updateExec(RDFConnectionRemote.java:324)
    at org.apache.jena.rdfconnection.RDFConnectionRemote.update(RDFConnectionRemote.java:311)
    at org.apache.jena.rdfconnection.RDFConnection.update(RDFConnection.java:250)
    at [my code]


Solution 1:[1]

Bulk load into Neptune seems feasible as a two-step process via RDF4J. I wrote a program to generate an RDF file (N-Triples) and then used the RDF4J console to load it manually into Neptune.

Guess: if we dig a bit more into RDF4J and how Neptune accepts bulk loads from it, it might be feasible to do the entire load within the same program.
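For anyone wanting to try that programmatic route, a minimal sketch using RDF4J's SPARQLRepository (the endpoint URL and file path below are placeholders; Neptune exposes one /sparql URL for both query and update) might look roughly like this:

```java
import java.io.File;

import org.eclipse.rdf4j.repository.RepositoryConnection;
import org.eclipse.rdf4j.repository.sparql.SPARQLRepository;
import org.eclipse.rdf4j.rio.RDFFormat;

public class NeptuneBulkLoad {
    public static void main(String[] args) throws Exception {
        // Placeholder Neptune endpoint, as in the console session above.
        String endpoint =
            "https://yyy.cluster-xxx.us-east-1.neptune.amazonaws.com:8182/sparql";
        SPARQLRepository repo = new SPARQLRepository(endpoint, endpoint);
        repo.init();
        try (RepositoryConnection conn = repo.getConnection()) {
            // Upload the generated N-Triples file in a single call;
            // a null base URI is acceptable for N-Triples input.
            conn.add(new File("output/model.nt"), null, RDFFormat.NTRIPLES);
        } finally {
            repo.shutDown();
        }
    }
}
```

This mirrors what the console session does interactively (create sparql / open / load) and would let the generate-and-load steps live in one program.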

$ ./bin/console.sh
04:10:40.412 [main] DEBUG org.eclipse.rdf4j.common.platform.PlatformFactory - os.name = linux
04:10:40.416 [main] DEBUG org.eclipse.rdf4j.common.platform.PlatformFactory - Detected Posix platform
Connected to default data directory
RDF4J Console 3.6.3
Working dir: /home/bitnami/tools/eclipse-rdf4j-3.6.3
Type 'help' for help.
> create sparql
Please specify values for the following variables:
SPARQL query endpoint: https://yyy.cluster-xxx.us-east-1.neptune.amazonaws.com:8182/sparql
SPARQL update endpoint: https://yyy.cluster-xxx.us-east-1.neptune.amazonaws.com:8182/sparql
Local repository ID [endpoint@localhost]: test
Repository title [SPARQL endpoint repository @localhost]: test Graph data model PoC
Repository created
> open test

test> sparql select ?s ?p ?o where {?s ?p ?o} limit 10
Evaluating SPARQL query...
+------------------------+------------------------+------------------------+
| s                      | p                      | o                      |
+------------------------+------------------------+------------------------+
| <https://test.com/s>   | <https://test.com/p>   | <https://test.com/o>   |
+------------------------+------------------------+------------------------+
1 result(s) (671 ms)

test> clear

test> load /home/bitnami/projects/model/sparql-client/output/model.nt

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 sandeepkunkunuru