SQL2RDFincremental materialization uses a stream of source data changes (DML) and transforms them to the corresponding changes (inserts, deletes) in the triple store ensuring concurrency between RDBMS and triplestore. SQL2RDFincremental materialization is based on the same R2RML mapping used for virtualization (Ontop), and (Capsenta), and materialization (In4mium, and R2RML Parser) thus does not incur an additional configuration maintenance problem.

Mapping an existing RDBMS to RDF has become an increasingly popular way of accessing data when the system-of-record is not, or cannot be, in a RDF database. Despite its attractiveness, virtualization is not always possible for various reasons such as performance, the need for full SPARQL 1.1 support, the need to reason over the virtualized as well as other materialized data, etc. However materialization of an existing RDBMS to RDF is also not an ideal alternative. Reasons include the inevitable lack of concurrency between the system-of-record and the RDF.

Thus incremental materialization provides an alternative between virtualization and materialization, offering:

  • Easy ETL of source RDB as no additional configuration required other than the same R2RML required for virtualization or bulk materialization.
  • Improved concurrency of data compared with materialization.
  • Significantly less computational overhead than a full materialization of the source data.
  • Improved query performance compared with virtualization especially when reasoning is required.
  • Compatibility with R2RML thus reducing configuration maintenance.
  • Supports insert, update, and delete changes on the source RDBMS.
  • Source transactions can be batched, and the changes to the triplestore are part of a single transaction that can be rolled back in the event of a problem.
  • Supports change logging so that committed changes can be rolled back.

Contact  inova8 if you would like to try SQL2RDF

When a subject-matter-expert (SME) describes their data domain they use English language. In response we give them visually beautiful, but often undecipherable, diagrams. Here we propose an alternative: a ‘word-model’ that describes the model in structured English without loss of accuracy or completeness.

Creating an ontological model invariably involves a subject-matter-expert (SME) who has an in-depth knowledge of the domain to be modelled working together with a data modeler or ontologist who will translate the domain into an ontological model.

When a subject-matter-expert (SME) is in discussion with a data modeler, they will be describing their information using English language:

“I have customers who are assigned to regions, and they create orders with a salesperson who prepares the order-details and assigns a shipper that operates in that region”

In response a data modeler will offer the subject-matter-expert (SME) an E-R diagram or an ontology graph!

Figure 1: Geek Model Diagrams

No wonder we are regarded as geeks! Although visually appealing, they are difficult for an untrained user to verify the model that the diagram is describing.

Instead we could document the model in a more easily consumed way with a ‘word-model’. This captures the same details of the model that are present in the various diagrams, but uses structured English and careful formatting as shown below.

aCustomer

  • places anOrder
    • made by anEmployee
      • who reports to anotherEmployee
      • and who operates in aTerritory
    • and contains anOrderDetail
      • refers to aProduct
        • which is supplied by aSupplier
        • and which categorized as aCategory
      • and is shipped by aShipper
      • and is assigned to aRegion
        • which has aTerritory
      • and belongs to aRegion

The basic principle is that we start at any entity, in this case ‘Customer’ and traverse the edges of the graph that describe that entity. At each new node we indent. Each line is a combination of the predicate and range class. No rules regarding order in which we visit the nodes and edges. No rules regarding depth. The only rule should be that, if we want a complete word-model, the description traverses all edges in one direction or the other.

These word-models are useful in the model design phase, as they are easy to document using any editor. Since the word-model is in ‘plain English’, it is easy for a SME to verify the accuracy of the model, and mark up with comments and edits. However word-models are also easy to generate from an RDF/OWL model.

Enhancements to Word-Model

We can refine the contents of the word-model as we develop the model with the SME. We can also enhance the readability by decorating the word-model text with the following fonts:

Word-Model Legend
Italics indicate an instance of a class, a thing
Bold indicates a class
Underline is a predicate or property that relates instances of classes
BoldItalic is used for cardinality expressions

Level 1a: Add the categorization of entities

Rather than using an example of an entity, we can qualify with the class to which the sample entity belongs.

aCustomer, a Customer

  • places anOrder, an Order
    • made by anEmployee, an Employee
      • who reports to anotherEmployee, an Employee
      • and who operates in aTerritory, a Territory
    • and contains anOrderDetail, an OrderDetail
      • and refers to aProduct, a Product
        • which is supplied by aSupplier, a Supplier
        • and which is categorized as aCategory, a Category
      • and is shipped by aShipper, a Shipper
      • and is assigned to aRegion, a Region
        • which has aTerritory, a Territory
      • and belongs to aRegion, a Region

Level 1b: Add cardinality of predicates

We can also add the cardinality as a modifier between the predicate and the entity:

aCustomer

  • who places zero, one or more orders
    • each made by one anEmployee
      • who reports to zero or one anotherEmployee
      • and who operates in one or more aTerritorys
    • and each contains one or more anOrderDetails
      • which refers to one aProduct
        • which is supplied by one aSupplier
        • and which is categorized as one aCategory
      • and is shipped by one aShipper
      • and is assigned to one aRegion
        • which has one or more aTerritorys
      • and belongs to one aRegion

Level 2: Add categorization and cardinality

Of course we can combine these extensions into a single word-modol as shown below.

aCustomer, a Customer

  • who places zero, one or more orders, each an Order
    • each made by one anEmployee, an Employee
      • who reports to zero or one anotherEmployee, an Employee
      • and who operates in one or more aTerritorys, each a Territory
    • and each contains one or more anOrderDetails, each an OrderDetail
      • which refers to one aProduct, a Product
        • which is supplied by one aSupplier, a Supplier
        • and which is categorized as one aCategory, a Category
      • and is shipped by one aShipper, a Shipper
      • and is assigned to one aRegion, a Region
        • which has one or more aTerritorys, each a Territory
      • and belongs to one aRegion, a Region

Despite the completeness of what is being described by this word-model, it is still easy to read by SMEs.

Auto-generation of Word-Model

Once the word-model has been formally documented in RDF/OWL, we can still use a word-model to document the RDF/OWL by auto-generating a word-model from the underlying RDF/OWL ontology as shown below.

Figure 2: Word-Model

This was generated using a SPIN magic-property as follows:

select ?modelDefinition
{
   (model:Customer 4 false) :modelDefinition ?modelDefinition .
}

This auto-generation can go further by including the datatype properties associated with each entity as shown below:

Figure 3: Word-Model including datatype properties

This was generated using a SPIN magic-property as follows:

select ?modelDefinition
{
   (model:Customer 4 true) :modelDefinition ?modelDefinition .
}

Appendix

SPIN Magic Properties:

The following SPIN properties were defined for auto generation of the word-model in HTML format:

classProperties

SELECT ?classProperties
WHERE {
{SELECT  ?arg1   (    IF(?arg2CONCAT("(",?properties,")"),"") as ?classProperties) WHERE
    {
        SELECT ?arg1 ((GROUP_CONCAT(CONCAT("<i>", ?dataPropertyLabel, "</i>"); SEPARATOR=', ')) AS ?properties)
        WHERE {
           # BIND (model:Product AS ?arg1) .
            ?arg1 a ?class .
            FILTER (?class IN (rdfs:Class, owl:Class)) .
            ?arg1 rdfs:label ?classLabel .
            ?dataProperty a owl:DatatypeProperty .
            ?dataProperty rdfs:domain ?arg1 .
            ?dataProperty rdfs:label ?dataPropertyLabel .
        }
        GROUP BY ?arg1
    }
}
}

classDefinition

SELECT ?classDefinition ?priorPath
WHERE {
    {
        SELECT ?arg1 ?arg2  ((GROUP_CONCAT( ?definition; SEPARATOR='<br/>and that ')) AS ?classDefinition)  ((GROUP_CONCAT( ?pastPath; SEPARATOR='\t')) AS ?priorPath)
        WHERE {
           ?arg1 a ?class . FILTER( ?class in (rdfs:Class, owl:Class ))
            ?arg1 rdfs:label ?classLabel .
            ?objectProperty a owl:ObjectProperty .
            {
                        ?objectProperty rdfs:domain ?arg1 .
                        ?objectProperty rdfs:label ?objectPropertyLabel .
                        ?objectProperty rdfs:range ?nextClass .
                        ?nextClass rdfs:label ?nextClassLabel .   BIND(?objectProperty as ?property)
            }UNION{
                        ?objectProperty  owl:inverseOf ?inverseObjectProperty .
                        ?objectProperty rdfs:domain  ?nextClass.
                        ?inverseObjectProperty rdfs:label ?objectPropertyLabel .
                        ?objectProperty rdfs:range ?arg1 .
                        ?nextClass rdfs:label ?nextClassLabel .    BIND(?inverseObjectProperty as ?property)
            }UNION{
                        ?inverseObjectProperty  owl:inverseOf ?objectProperty .
                        ?objectProperty rdfs:domain  ?nextClass.
                        ?inverseObjectProperty rdfs:label ?objectPropertyLabel .
                        ?objectProperty rdfs:range  ?arg1 .
                        ?nextClass rdfs:label ?nextClassLabel .    BIND(?inverseObjectProperty as ?property)
            }
#Stop from going too deep
            BIND(?arg2 -1 as ?span) FILTER(?span>0). 
            ?nextClass a ?nextClassClass. FILTER( ?nextClassClass in (rdfs:Class, owl:Class ))
#,  odata4sparql:Operation))  .
#Do not process an already processed arc (objectProperty)         
            BIND(CONCAT(?arg4,"\t",?objectPropertyLabel) as ?forwardPath) FILTER( !CONTAINS(?arg4, ?objectPropertyLabel ))  
            (?nextClass ?span ?arg3 ?forwardPath ) :classDefinition (?nextClassDefinition  ?nextPath).           
#Do not include if arc (objectProperty) appears already   
            FILTER(  !CONTAINS(?nextPath, ?objectPropertyLabel )) BIND(CONCAT( ?objectPropertyLabel, IF(?nextPath="","",CONCAT("\t",?nextPath))) as ?pastPath)
                                    (?nextClass ?arg3) :classProperties ?nextClassProperties .
            BIND (CONCAT("<u>",?objectPropertyLabel , "</u> <b>", ?nextClassLabel, "</b>",  ?nextClassProperties, IF ((?nextClassDefinition!=""), CONCAT("<br/><blockquote>that ",  ?nextClassDefinition, "</blockquote>"), "")  ) as ?definition)
        }
        GROUP BY ?arg1 ?arg2
    } .
}

modelDefintion

SELECT ?modelDefinition
WHERE {
    {
        SELECT ?arg1 ?arg2 ?arg3 ((CONCAT("<b>", ?classLabel, "</b>", ?nextClassProperties, "<blockquote>that ", ?classDefinition, "</blockquote>")) AS ?modelDefinition)
        WHERE {
           # BIND (model:Order AS ?arg1) .  BIND (4 AS ?arg2) . BIND (false AS ?arg3) .
            ( ?arg1 ?arg2 ?arg3 "") :classDefinition (?classDefinition "") .
            ( ?arg1 ?arg3 ) :classProperties ?nextClassProperties .
            ?arg1 rdfs:label ?classLabel .
        }
    } .
}

Usage

The following will return the HTML description of the model, starting with model:Customer, stopping at a depth of 4 in any direction, and not including the datatypeproperty definition.

select ?modelDefinition
{
   (model:Customer 4 false) :modelDefinition ?modelDefinition .
}

SKOS, the Simple Knowledge Organization System, offers an easy to understand schema for vocabularies and taxonomies. However modeling precision is lost when skos:semanticRelation predicates are introduced.

Combining SKOS with RDFS/OWL allows both the precision of owl:ObjectProperty to be combined with the flexibility of SKOS. However clarity is then lost as the number of core concepts (aka owl:Class) grow.
Many models are not just documenting the ‘state’ of an entity. Instead they are often tracking the actions performed on entities by agents at locations. Thus aligning the core concepts to the Activity, Entity, Agent, and Location classes of the PROV ontology provides a generic upper-ontology within which to organize the model details.

Vehicle Manufacturing Example

This examples captures information about vehicle manufacturing. Following

  1. Manufacturers: the manufacturer of models of cars in various production lines sited at plants
  2. Models: the models that the manufacturer produces
  3. ProductionLines: the production lines set up to produce models of vehicles on behalf of a manufacturer
  4. Plants: the plants that house the production lines

In addition there are different ‘styles’ of manufacturing that occur for various models and various sites:

  1. Manufacturing: the use of a ProductionLine for a particular Model

SKOS Modeling

If we follow a pure SKOS model we proceed as follows by creating a VehicleManufacturingScheme  skos:ConceptScheme

s:VehicleManufacturingScheme

rdf:type skos:ConceptScheme

.

Then we create skos:topConceptOf Manufacturer, Model, Plant, and Production as follows:

 

s:Manufacturer

rdf:type owl:Class ;

rdfs:subClassOf skos:Concept ;

skos:topConceptOf s:VehicleManufacturingScheme

.

s:Model

rdf:type owl:Class ;

rdfs:subClassOf skos:Concept ;

skos:topConceptOf s:VehicleManufacturingScheme

.

These top-level concepts are being created of type owl:Class and a subClassOf skos:Concept.  This is the pattern recommended in (Bechhofer, et al.)

Finally we can create skos:broader concepts as follows:

s:Ford

rdf:type s:Manufacturer ;

skos:broader s:Manufacturer ;

skos:inScheme s:VehicleManufacturingScheme

.

s:Fusion

rdf:type s:Model ;

skos:broader s:Model ;

skos:inScheme s:VehicleManufacturingScheme

.

The resultant SKOS taxonomy of the VehicleManufacturingScheme  skos:ConceptScheme then appears as follows:

Figure 1: SKOS taxonomy

Why?

By starting with a pure SKOS model we provide access to the underling concepts in a more accessible style for the less proficient user, as illustrated by the SKOS Taxonomy above. Yet we have not sacrificed the ontological precision of owl:Classes.

Thus we can ask questions about all concepts:

SELECT * WHERE

{

       ?myConcepts rdfs:subClassOf+ skos:Concept .

}

Or we can get a list of anything broader than one of these concepts:

SELECT * WHERE

{

       ?myBroaderConcepts skos:broader s:Model .

}

SKOS+OWL Modeling

Although skos:semanticRelation allows one to link concepts together, this predicate is often too broad when trying to create an ontology that documents specific relations between specific types of concept.

In our VehicleManufacturingScheme we might want to know the following:

  1. isManufacturedBy: which manufacturer manufactures a particular model
  2. operatedBy: which manufacturer operates a particular production facility
  3. performedAt: which plant is the location of a production facility
  4. wasManufacturedAt: which production facility was used to manufacture a particular model

Figure 2: SKOS+OWL model of Relations

 

These predicates can be defined using RDFS as follows:

so:isManufacturerBy

 rdf:type owl:ObjectProperty ;

rdfs:domain s:Model ;

rdfs:range s:Manufacturer ;

rdfs:subPropertyOf skos:semanticRelation

.

so:operatedBy

rdf:type owl:ObjectProperty ;

rdfs:domain s:Production ;

rdfs:range s:Manufacturer ;

rdfs:subPropertyOf skos:semanticRelation

.

Note that the definition of Model, Manufacturer etc. as subClassOf skos:Concept allows us to precisely define the domain and range.

s:Fusion

so:isManufacturerBy s:Ford ;

so:wasManufacturedAt s:Halewood-SmallVehicle ;

.

s:Dagenham-Truck

so:operatedBy s:Ford ;

so:performedAt s:Dagenham ;

.Thus we have used the flexibility of SKOS with the greater modeling precision of RDFS/OWL.

Why?

By building upon the SKOS model, one can ask an expansive question such as what concepts are semantically related to, say, the concept s:Fusion with a simple query:

SELECT * WHERE

{

       s:Fusion  ?p ?y .

       ?p rdfs:subPropertyOf* skos:semanticRelation

}

Yet with the same model we can ask a specific question about a relationship of a specific instance:

SELECT * WHERE

{

       s:Camry so:isManufacturerBy   ?o .

}

SKOS+OWL+PROV Modeling

One of the attractions of SKOS is that a taxonomy can grow organically. One of the problems of SKOS is that a taxonomy can grow organically!

As the taxonomy grows it can be useful to add another layer of structure beyond a catalog of concepts. Many models are not just documenting the ‘state’ of an entity. Instead they are often tracking the actions performed on entities by agents at locations. Thus aligning the core concepts to the Activity, Entity, Agent, and Location classes of the PROV ontology (Lebo, et al.) provides a generic upper-ontology within which to organize the model details.

Figure 3: PROV model

Thus our VehicleManufacturingScheme has each core PROV concept:

  1. Manufacturers: the Agents who manufacture models, and operate plants
  2. Models: the Entities
  3. ProductionLines: the Activities that produce Models on behalf of Manufacturers.
  4. Plants: the Location at which Activities take place, and Agents and Entities are located.

Figure 4: PROV Model

 

s:Production

rdfs:subClassOf prov:Activity ;

.

s:Model

rdfs:subClassOf prov:Entity ;

.

s:Manufacturer

rdfs:subClassOf prov:Organization ;

.

s:Plant

rdfs:subClassOf prov:Location ;

.

Similarly we can cast our predicates into the same PROV model as follows:

so:isManufacturerBy

rdfs:subPropertyOf prov:wasAttributedTo ;

.

so:operatedBy

rdfs:subPropertyOf prov:wasAssociatedWith ;

.

so:performedAt

rdfs:subPropertyOf prov:atLocation ;

.

so:wasManufacturedAt

rdfs:subPropertyOf prov:wasGeneratedBy ;

.

Why?

The PROV model is closer to the requirements of most enterprise models, that are trying to ‘model the business’, than a simple E-R model. The latter concentrates on capturing the attributes of an entity that record the current state of that entity. Often those attributes focus on documenting the process by which the entity gained its current state:

  • The agent that created the entity
  • The activity used to create the entity
  • The location when things were performed
  • The data of the activity, etc

Superimposing the PROV model formalizes this model, and thus allows a structure within which a more casual user can navigate, rather than a sea of entities.

By building upon the PROV model, one can ask an expansive question such as what entities behave as Agents and in which entities are they involved:

SELECT * WHERE

{

?organization a ?Agent .

?Agent rdfs:subClassOf* prov:Agent .

?entity ?predicate ?organization

}

SKOS+OWL+PROV-Qualified Modeling

Within the structure of PROV, predicates define the relationships between Activities, Entities, Agents, and Locations. However it is sometimes necessary to qualify these relationships. For example, the so:wasManufacturedAt predicate defines that a s:Production facility was used to manufacture a s:Model. When? How was it used? Why?

To extend the model, PROV adds the concept of a qualified influence, which allows the relationship to be further defined.

Figure 5: Qualified PROV for some predicates

We do this first of all by creating sopq:Manufacturing:

sopq:Manufacturing

rdf:type owl:Class ;

rdfs:subClassOf skos:Concept ;

rdfs:subClassOf prov:Generation ;

skos:topConceptOf s:VehicleManufacturingScheme ;

.

Note that this is a rdfs:subClassOf prov:Generation, the reification of the predicate prov:wasGeneratedBy

We then add two predicates, one (sopq:wasManufacturedUsing) from the prov:Entity to the prov:Generation, and one (sopq:production) from the prov:Generation to the prov:Activity as follows:

sopq:wasManufacturedUsing

rdf:type owl:ObjectProperty ;

rdfs:domain s:Model ;

rdfs:range sopq:Manufacturing ;

rdfs:subPropertyOf skos:semanticRelation ;

rdfs:subPropertyOf prov:qualifiedGeneration ;

.

sopq:production

rdf:type owl:ObjectProperty ;

rdfs:subPropertyOf skos:semanticRelation ;

rdfs:subPropertyOf prov:activity ;

.

Finally we can create a Manufacturing qualified generation concept as follows:

sopq:L-450H_at_Swindon-Hybrid

rdf:type sopq:Manufacturing ;

sopq:production s:Swindon-Hybrid ;

 skos:broader sopq:Manufacturing ;

.

s:L-450H

sopq:wasManufacturedUsing sopq:L-450H_at_Swindon-Hybrid ;

.

In the figure below we can see that these qualified actions simply extend the SKPOS taxonomy:

Figure 6: Taxonomy extended with Qualified Actions

Why?

Using qualifiedActions provides a systematic, rather than ad-hoc, way to provide more precision to a model.

Remaining Issues

  1. The PROV structure does not manifest itself within the taxonomy. Should Activity, Entity, Agent, and Location therefore be ConceptSchemes?

Model

The model files used in this example are included here: model2rdf

  • skos.ttl
  • skos+owl.ttl
  • skos+owl+prov.ttl
  • skos+owl+provqualified.ttl

References

Bechhofer, Sean and Miles, Alistair. Using OWL and SKOS. [Online] W3C. https://www.w3.org/2006/07/SWD/SKOS/skos-and-owl/master.html.

Lebo, Timothy, Satya, Sahoo and McGuinness, Deborah. PROV-O: The PROV Ontology. W3C. [Online] https://www.w3.org/TR/prov-o/.Reaming Issues