Quantcast
Channel: english – How easy it is to make people believe a lie, and how hard it is to undo that work again
Viewing all articles
Browse latest Browse all 100

A REPLACE extension for Tracker’s SPARQL’s Update

$
0
0

SPARQL Update has INSERT and DELETE. To update an existing triple in RDF you need to DELETE it first. You of course already have our INSERT-SILENT but that just ignores certain errors; it doesn’t replace triples.

A (performance) problem is that with each DELETE having to solve all possible solutions you create an extra query for each time you want to update using a ‘DELETE-WHERE INSERT’-construction.

INSERT also checks for old values. It has to do this to implement SPARQL Update where you can’t insert a triple with a different value than the old value: If the value of a triple is identical, the insert for that triple is ignored; if the triple didn’t exist yet, it’s inserted; if the values aren’t identical, error is thrown — you need to use DELETE upfront.

Both having to do the extra delete and the old-values come at a performance price.

To solve this we plan to provide Tracker specific support for REPLACE. It’ll be Tracker specific simply because this isn’t specified in SPARQL Update. That has a probable reason:

Replacing or updating doesn’t fit well in the RDF world. Updating properties that have multiple values, like nie:keyword, is ambiguous: does it need to replace the entire list of values; does it need to append to the list; does it need to update just one item in the list, and which one? This probably explains why it’s not specified in SPARQL Update.

We decided to let our REPLACE be only different than INSERT for single value properties. For multi value properties will our REPLACE behave the same as normal INSERT.

How a GraphUpdated triggered by a REPLACE behaves is still being decided. Especially the value of the object’s ID for resource objects in the ‘deletes’-array. Having to look up the old ID kinda defeats the purpose of having a REPLACE (as we’d still need to look it up, like what an INSERT does, destroying part of the performance gain).

Either way, let me show you some examples:

We start with an insert of a resource that has a single value and two times a multi value property filled in:

INSERT { <r> a nie:InformationElement ;
             nie:title 'title';
             nie:keyword 'keyw1';
             nie:keyword 'keyw2' }

A quick query to verify, and yes it’s in:

SELECT ?t ?k { <r> nie:title ?t; nie:keyword ?k }
Results:
  title, keyw1
  title, keyw2

If we repeat the query a second time then the old-values check will turn the insert into a noop:

INSERT { <r> a nie:InformationElement ;
             nie:title 'title';
             nie:keyword 'keyw1';
             nie:keyword 'keyw2' }

And a quick query to verify that, and indeed nothing has changed:

SELECT ?t ?k { <r> nie:title ?t; nie:keyword ?k }
Results:
  title, keyw1
  title, keyw2

If we’d do that last insert query but with different values, we’d get this:

INSERT { <r> a nie:InformationElement ;
             nie:title 'title new';
             nie:keyword 'keyw4';
             nie:keyword 'keyw3' }

SparqlError.Constraint: Unable to insert multiple values for subject
`r' and single valued property `dc:title' (old_value: 'title', new
 value: 'title new')

Note that for the two nie:keyword triples this would have worked, but given that each query is a transaction and because the nie:title part failed, aren’t those two written either.

Let’s now try the same with INSERT OR REPLACE (edit: changed from just REPLACE to INSERT OR REPLACE):

INSERT OR REPLACE { <r> a nie:InformationElement ;
                        nie:title 'title new';
                        nie:keyword 'keyw4';
                        nie:keyword 'keyw3' }

And a quick query now yields:

SELECT ?t ?k { <r> nie:title ?t; nie:keyword ?k }
Results:
  title new, keyw1
  title new, keyw2
  title new, keyw3
  title new, keyw4

You can see that how it behaved for nie:title was different than for nie:keyword. That’s because nie:title is a single value -and nie:keyword is a multi value property.

What if we do want to reset the multi value property and insert a complete new list? Simple, just do this as a single query (space or newline delimited) (edit: changed to INSERT OR REPLACE from just REPLACE):

DELETE { <r> nie:keyword ?k } WHERE { <r> nie:keyword ?k }
INSERT OR REPLACE { <r> a nie:InformationElement ;
                        nie:title 'title new';
                        nie:keyword 'keyw4';
                        nie:keyword 'keyw3' }

And a quick query now yields:

SELECT ?t ?k { <r> nie:title ?t; nie:keyword ?k }
Results:
  title new, keyw3
  title new, keyw4

The work on this is in progress. You can find it in the branch sparql-update. It’s working but especially the GraphUpdated stuff is unfinished.

Also note that the final syntax may change.


Viewing all articles
Browse latest Browse all 100

Trending Articles