SKOS importer for SKOS-XL: auto-publishing issue #347

Open
mgbeyer opened this issue Aug 12, 2015 · 0 comments

Suppose you have a scenario depicted by the N-Triples example below:

<http://lod.gesis.org/thesoz/concept_10099999> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2004/02/skos/core#Concept> .
<http://lod.gesis.org/thesoz/concept_10099999> <http://www.w3.org/2004/02/skos/core#inScheme> <http://lod.gesis.org/thesoz/> .
<http://lod.gesis.org/thesoz/concept_10099999> <http://www.w3.org/2008/05/skos-xl#prefLabel> <http://lod.gesis.org/thesoz/term_10099999_de> .
<http://lod.gesis.org/thesoz/concept_10099999> <http://www.w3.org/2008/05/skos-xl#prefLabel> <http://lod.gesis.org/thesoz/term_10099999_en> .
<http://lod.gesis.org/thesoz/concept_10099999> <http://www.w3.org/2008/05/skos-xl#prefLabel> <http://lod.gesis.org/thesoz/term_10099999_fr> .
<http://lod.gesis.org/thesoz/term_10099999_de> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2008/05/skos-xl#Label> .
<http://lod.gesis.org/thesoz/term_10099999_de> <http://www.w3.org/2008/05/skos-xl#literalForm> "hallo"@de .
<http://lod.gesis.org/thesoz/term_10099999_en> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2008/05/skos-xl#Label> .
<http://lod.gesis.org/thesoz/term_10099999_en> <http://www.w3.org/2008/05/skos-xl#literalForm> "hello"@en .
<http://lod.gesis.org/thesoz/term_10099999_fr> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2008/05/skos-xl#Label> .
<http://lod.gesis.org/thesoz/term_10099999_fr> <http://www.w3.org/2008/05/skos-xl#literalForm> "bla ble blu"@fr .
<http://lod.gesis.org/thesoz/concept_10099998> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2004/02/skos/core#Concept> .
<http://lod.gesis.org/thesoz/concept_10099998> <http://www.w3.org/2004/02/skos/core#inScheme> <http://lod.gesis.org/thesoz/> .
<http://lod.gesis.org/thesoz/concept_10099998> <http://www.w3.org/2008/05/skos-xl#prefLabel> <http://lod.gesis.org/thesoz/term_10099998_de> .
<http://lod.gesis.org/thesoz/concept_10099998> <http://www.w3.org/2008/05/skos-xl#prefLabel> <http://lod.gesis.org/thesoz/term_10099998_en> .
<http://lod.gesis.org/thesoz/concept_10099998> <http://www.w3.org/2008/05/skos-xl#prefLabel> <http://lod.gesis.org/thesoz/term_10099998_fr> .
<http://lod.gesis.org/thesoz/term_10099998_de> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2008/05/skos-xl#Label> .
<http://lod.gesis.org/thesoz/term_10099998_de> <http://www.w3.org/2008/05/skos-xl#literalForm> "dummy"@de .
<http://lod.gesis.org/thesoz/term_10099998_en> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2008/05/skos-xl#Label> .
<http://lod.gesis.org/thesoz/term_10099998_en> <http://www.w3.org/2008/05/skos-xl#literalForm> "dummy"@en .
<http://lod.gesis.org/thesoz/term_10099998_fr> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2008/05/skos-xl#Label> .
<http://lod.gesis.org/thesoz/term_10099998_fr> <http://www.w3.org/2008/05/skos-xl#literalForm> "bla ble blu"@fr .

So there are two different concepts, and each one references several skosxl:Label instances (each carrying a skosxl:literalForm) via skosxl:prefLabel, one per language version of a term. The corresponding RDF/XML makes the hierarchy more obvious:

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:cc="http://creativecommons.org/ns#" xmlns:dc="http://purl.org/dc/terms/" xmlns:skos="http://www.w3.org/2004/02/skos/core#" xmlns:void="http://rdfs.org/ns/void#" xmlns:skosxl="http://www.w3.org/2008/05/skos-xl#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" xmlns:owl="http://www.w3.org/2002/07/owl#" xmlns:thesoz="http://lod.gesis.org/thesoz/ext/" xmlns:prv="http://purl.org/net/provenance/ns#" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xml:base="http://lod.gesis.org/thesoz/">
   <rdf:Description rdf:about="concept_10099999">
      <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
   </rdf:Description>
   <rdf:Description rdf:about="concept_10099999">
      <skos:inScheme rdf:resource="http://lod.gesis.org/thesoz/"/>
   </rdf:Description>
   <rdf:Description rdf:about="concept_10099999">
      <skosxl:prefLabel rdf:resource="term_10099999_de"/>
      <skosxl:prefLabel rdf:resource="term_10099999_en"/>
      <skosxl:prefLabel rdf:resource="term_10099999_fr"/>
   </rdf:Description>
   <rdf:Description rdf:about="term_10099999_de">
      <rdf:type rdf:resource="http://www.w3.org/2008/05/skos-xl#Label"/>
      <skosxl:literalForm xml:lang="de">hallo</skosxl:literalForm>
   </rdf:Description>
   <rdf:Description rdf:about="term_10099999_en">
      <rdf:type rdf:resource="http://www.w3.org/2008/05/skos-xl#Label"/>
      <skosxl:literalForm xml:lang="en">hello</skosxl:literalForm>
   </rdf:Description>
   <rdf:Description rdf:about="term_10099999_fr">
      <rdf:type rdf:resource="http://www.w3.org/2008/05/skos-xl#Label"/>
      <skosxl:literalForm xml:lang="fr">bla ble blu</skosxl:literalForm>
   </rdf:Description>
   <rdf:Description rdf:about="concept_10099998">
      <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
   </rdf:Description>
   <rdf:Description rdf:about="concept_10099998">
      <skos:inScheme rdf:resource="http://lod.gesis.org/thesoz/"/>
   </rdf:Description>
   <rdf:Description rdf:about="concept_10099998">
      <skosxl:prefLabel rdf:resource="term_10099998_de"/>
      <skosxl:prefLabel rdf:resource="term_10099998_en"/>
      <skosxl:prefLabel rdf:resource="term_10099998_fr"/>
   </rdf:Description>
   <rdf:Description rdf:about="term_10099998_de">
      <rdf:type rdf:resource="http://www.w3.org/2008/05/skos-xl#Label"/>
      <skosxl:literalForm xml:lang="de">dummy</skosxl:literalForm>
   </rdf:Description>
   <rdf:Description rdf:about="term_10099998_en">
      <rdf:type rdf:resource="http://www.w3.org/2008/05/skos-xl#Label"/>
      <skosxl:literalForm xml:lang="en">dummy</skosxl:literalForm>
   </rdf:Description>
   <rdf:Description rdf:about="term_10099998_fr">
      <rdf:type rdf:resource="http://www.w3.org/2008/05/skos-xl#Label"/>
      <skosxl:literalForm xml:lang="fr">bla ble blu</skosxl:literalForm>
   </rdf:Description>
</rdf:RDF>

Note that two different literalForm triples (belonging to labels with different origins) carry the same value in the same language, which can certainly happen in real data:

<http://lod.gesis.org/thesoz/term_10099998_fr> <http://www.w3.org/2008/05/skos-xl#literalForm> "bla ble blu"@fr .
<http://lod.gesis.org/thesoz/term_10099999_fr> <http://www.w3.org/2008/05/skos-xl#literalForm> "bla ble blu"@fr .
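
To spot such cases before running the importer, a small script along these lines might help. This is only a sketch in plain Ruby, not part of the importer: the regular expression handles just the simple `<s> <p> "text"@lang .` shape used in the example, and the names `duplicate_literal_forms`, `SAMPLE` and `DUPLICATES` are made up for illustration.

```ruby
# Minimal N-Triples scan: collect skosxl:literalForm triples and group the
# label URIs by [value, language]. This is not a full N-Triples parser.
LITERAL_FORM = "http://www.w3.org/2008/05/skos-xl#literalForm"
TRIPLE = /\A<([^>]+)>\s+<([^>]+)>\s+"((?:[^"\\]|\\.)*)"@([A-Za-z0-9-]+)\s*\.\s*\z/

def duplicate_literal_forms(ntriples)
  groups = Hash.new { |hash, key| hash[key] = [] }
  ntriples.each_line do |line|
    match = TRIPLE.match(line.strip)
    next unless match && match[2] == LITERAL_FORM
    groups[[match[3], match[4]]] << match[1]
  end
  # Keep only value/language pairs used by more than one label resource.
  groups.select { |_key, subjects| subjects.uniq.size > 1 }
end

SAMPLE = <<~NT
  <http://lod.gesis.org/thesoz/term_10099999_en> <http://www.w3.org/2008/05/skos-xl#literalForm> "hello"@en .
  <http://lod.gesis.org/thesoz/term_10099998_fr> <http://www.w3.org/2008/05/skos-xl#literalForm> "bla ble blu"@fr .
  <http://lod.gesis.org/thesoz/term_10099999_fr> <http://www.w3.org/2008/05/skos-xl#literalForm> "bla ble blu"@fr .
NT

DUPLICATES = duplicate_literal_forms(SAMPLE)
DUPLICATES.each do |(value, lang), subjects|
  puts "#{value.inspect}@#{lang} is used by #{subjects.size} labels"
end
# prints: "bla ble blu"@fr is used by 2 labels
```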

The problem: the importer won't auto-publish the duplicate and reports something like "Publishing failed, subject xyz invalid, value has already been taken". This is enforced deliberately by a uniqueness validation in the corresponding model (see app\models\label\skosxl\validations.rb, around line 13):

validates :value, uniqueness: { scope: [:language, :rev] }, if: :validatable_for_publishing?
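
To illustrate what that rule amounts to, here is a rough plain-Ruby imitation of a uniqueness check on value scoped to language and rev. It is a sketch under assumptions, not the actual model code: the `Label` struct, the `value_taken?` helper and the rev value of 1 are hypothetical, and the `if: :validatable_for_publishing?` condition is left out.

```ruby
# Hypothetical stand-in for a label record; the real model has more fields.
Label = Struct.new(:origin, :value, :language, :rev)

# Imitates `validates :value, uniqueness: { scope: [:language, :rev] }`:
# a candidate counts as "taken" if an existing record has the same value,
# language and rev. The origin is not part of the comparison.
def value_taken?(existing, candidate)
  existing.any? do |label|
    label.value == candidate.value &&
      label.language == candidate.language &&
      label.rev == candidate.rev
  end
end

PUBLISHED = [Label.new("term_10099999_fr", "bla ble blu", "fr", 1)]
CANDIDATE = Label.new("term_10099998_fr", "bla ble blu", "fr", 1)

puts value_taken?(PUBLISHED, CANDIDATE)
# prints: true
```

Because the origin is not part of the scope, the second French label is rejected even though it is a separate resource.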

I don't quite understand why this validation is deliberately applied here. I'm no expert on the SKOS-XL format, so maybe I'm just missing something. I'm aware that the W3C says "No two concepts in the same concept scheme may have the same value for skos:prefLabel in a given language". But here we have two entirely separate, non-core (SKOS-XL) label definitions with different origins, which just happen to contain the same value in the same language. So our two concepts do NOT reference the same label instance; each references its own, DIFFERENT label definition (afaik different labels can have the same literal form). Is this in any way a bad thing, or not conformant with the W3C format? The import itself works just fine. So why are the duplicates excluded from auto-publishing?
