• What is the best route, with the fewest changes, through the London Underground?
• Have I unintentionally revealed PII (personally identifiable information) or copyright information in a custom query or report?
• Who is the closest relative whose alma mater is Harvard?
• What is the root-cause problem within an IoT/DigitalTwin graph of a process plant?

KnowledgeGraphs are well suited to capturing large numbers of facts about things. Intelligence is the ability to connect individual facts into knowledge. However, uncovering that intelligence so that it can be acted upon to produce results can be challenging.

IntelligentGraph’s PathQL provides an easy way to uncover that knowledge by describing paths and connections through these facts, as shown by the answers to the above questions. These paths can be visualized. PathQL can also reveal how it deduced each path through its trace capabilities.

What is IntelligentGraph?

IntelligentGraph is an extension to any RDF knowledge graph, so it can be applied to any existing knowledge graph. As well as providing the PathQL query capability, IntelligentGraph allows analysis formulae to be embedded as nodes within the KnowledgeGraph. A formula node is executed only when that node is accessed via a query. This is equivalent to moving the analysis formulae into the graph, alongside the data, rather than having to move the graph data into the analysis engine.
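The idea of a formula node that runs only on access can be illustrated with a minimal Python sketch. This is not IntelligentGraph's implementation or API: the `Graph` class, its `set` and `get` methods, and the `tank:` node names are all hypothetical, chosen only to show the evaluate-on-access behaviour.

```python
from typing import Callable, Dict, Union

# A node value is either a literal number or a formula: a callable that
# receives the graph and computes its result from other nodes.
Value = Union[float, Callable[["Graph"], float]]


class Graph:
    """Toy stand-in (hypothetical) for a graph whose node values may be formulae."""

    def __init__(self) -> None:
        self._values: Dict[str, Value] = {}
        self.evaluations = 0  # counts formula executions, for illustration only

    def set(self, node: str, value: Value) -> None:
        self._values[node] = value

    def get(self, node: str) -> float:
        value = self._values[node]
        if callable(value):
            # The formula runs only now, at access time.
            self.evaluations += 1
            return value(self)
        return value


g = Graph()
g.set("tank:volume", 10.0)
g.set("tank:density", 0.5)
# Formula node: mass is derived from two other nodes in the same graph.
g.set("tank:mass", lambda graph: graph.get("tank:volume") * graph.get("tank:density"))

evaluations_before = g.evaluations  # 0: storing the formula did not run it
mass = g.get("tank:mass")           # accessing the node triggers the formula -> 5.0
```

In this sketch the formula is re-evaluated on every access, so it always reflects the current values of the nodes it reads. Whether results are cached is a design choice the sketch does not address.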

What is the best route, with the fewest changes, through the London Underground?

Use Case: An individual wants to find a route through the London Underground from one station to another with the minimum number of changes.

Available data: A Knowledge Graph model of the London Underground with its stations, lines, and zones.

Solution: Access the KnowledgeGraph using the IntelligentGraph, and express the path query using PathQL as follows to obtain the shortest path.

This PathQL query will start at the Mornington_Crescent station, transfer to any line, then optionally change lines at up to four stations, until a line going to Oakleigh_Park is found.
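To make the search strategy concrete, here is a rough Python sketch of the same idea: board any line at the start station, then change lines a bounded number of times until a line serving the destination is found. It is plain breadth-first search, not PathQL, and it does not reflect how PathQL evaluates the query. The `LINES` table is a tiny hypothetical extract built only from the stations and lines named in this example, and `fewest_changes` is an illustrative name.

```python
from collections import deque
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical extract: which stations each line serves (order along the line is ignored).
LINES: Dict[str, Set[str]] = {
    "Northern": {"Mornington_Crescent", "Old_Street", "Moorgate"},
    "Great_Northern": {"Old_Street", "Moorgate", "Oakleigh_Park"},
}


def fewest_changes(
    start: str, goal: str, max_changes: int = 4
) -> Optional[List[Tuple[str, str]]]:
    """Return legs as (line, station where you leave that line), or None.

    Breadth-first search by number of lines boarded, so the first route
    found uses the fewest changes.
    """
    queue = deque([(start, [])])  # (current station, legs travelled so far)
    boarded: Set[str] = set()     # lines already tried at an earlier or equal depth
    while queue:
        station, legs = queue.popleft()
        for line in sorted(LINES):
            if station not in LINES[line] or line in boarded:
                continue
            boarded.add(line)
            if goal in LINES[line]:
                return legs + [(line, goal)]
            # Riding this line and changing again costs one more change.
            if len(legs) < max_changes:
                for nxt in sorted(LINES[line] - {station}):
                    queue.append((nxt, legs + [(line, nxt)]))
    return None


route = fewest_changes("Mornington_Crescent", "Oakleigh_Park")
```

On this toy data the search returns a two-leg route: Northern line first, then Great Northern to Oakleigh_Park. Which interchange it reports depends only on the iteration order, since both are equally short.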

Results: One of the shortest routes is to take the Northern line at Mornington Crescent station, then change at Old Street to the Great Northern line, which goes through Oakleigh Park station.

Note, however, that this is just one of the shortest paths. Another is via Moorgate. PathQL will find successively longer routes if required.

Have I unintentionally revealed PII (personally identifiable information) or copyright information in a custom query or report? 

Use Case: An enterprise manages many reports based on queries, and those queries might be based on other queries as well as on the underlying data. However, some of the underlying data is personal or copyrighted and must not be revealed.

Available data: An ISO 11179-based Knowledge Graph of the database schema (MMS), together with the structure of the database views and reports.

Solution: Access the KnowledgeGraph using the IntelligentGraph, and express the path query using PathQL as follows to discover any report dependencies on personal or copyright information.

This PathQL query will return all paths between any data element that is declared as ‘Sensitive’ and any data element of the report being validated. If no such path exists, the report is not derived from sensitive information.
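The dependency check can be sketched in Python as an enumeration of derivation paths from a report back to any sensitive element. This is an illustration only, not PathQL and not the ISO 11179 model. The `DERIVED_FROM` and `SENSITIVE` tables are made-up sample data (only the Customer_List report name comes from this example), and `sensitive_paths` is a hypothetical helper.

```python
from typing import Dict, Iterator, List, Set, Tuple

# Hypothetical lineage: each report, view, or element maps to what it is derived from.
DERIVED_FROM: Dict[str, List[str]] = {
    "Customer_List": ["Customer_View"],
    "Customer_View": ["Customer.name", "Customer.email"],
    "Sales_Summary": ["Order.total"],
}
# Hypothetical set of data elements declared as sensitive.
SENSITIVE: Set[str] = {"Customer.email"}


def sensitive_paths(item: str, path: Tuple[str, ...] = ()) -> Iterator[List[str]]:
    """Yield every derivation path from `item` down to a sensitive element."""
    path = path + (item,)
    if item in SENSITIVE:
        yield list(path)
    for source in DERIVED_FROM.get(item, []):
        if source not in path:  # guard against cyclic view definitions
            yield from sensitive_paths(source, path)


leaks = list(sensitive_paths("Customer_List"))
clean = list(sensitive_paths("Sales_Summary"))
```

An empty result means the report has no dependency on sensitive data. Each returned path shows exactly how a sensitive element reaches the report, which is the evidence needed to fix the leak.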

Results: PII violation detected! PathQL discovers that there is a route for PII to leak into the Customer_List report.

Who is the closest relative whose alma mater is Harvard?

Use Case: We want to explore relationships: not just to find particular individuals, but also to explore their connections to particular institutions.

Available data: A genealogical KnowledgeGraph containing persons, genders, colleges, etc.

Solution: Access the KnowledgeGraph using the IntelligentGraph, and express the path query using PathQL as follows to discover the closest relative who went to Harvard.

This PathQL query starts with any individual, in this case ‘Arnold’, then searches through all immediate and indirect relatives until it encounters the first one who went to Harvard. If we continue to explore the paths, PathQL will return successively more distant connections to Harvard.
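The "closest first, then successively more distant" behaviour matches a breadth-first traversal, sketched below in Python. This is not PathQL and not the genealogical graph used here. Apart from the name Arnold, the `RELATIVES` and `ALMA_MATER` tables are invented sample data, and `alumni_by_distance` is a hypothetical function name.

```python
from collections import deque
from typing import Dict, Iterator, List

# Hypothetical family data: direct relatives of each person.
RELATIVES: Dict[str, List[str]] = {
    "Arnold": ["Beth", "Carl"],
    "Beth": ["Arnold", "Dora"],
    "Carl": ["Arnold"],
    "Dora": ["Beth", "Ed"],
    "Ed": ["Dora"],
}
# Hypothetical alma mater of each person, where known.
ALMA_MATER: Dict[str, str] = {"Carl": "Yale", "Dora": "Harvard", "Ed": "Harvard"}


def alumni_by_distance(start: str, college: str) -> Iterator[List[str]]:
    """Yield paths from `start` to relatives who attended `college`, nearest first."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        person = path[-1]
        if person != start and ALMA_MATER.get(person) == college:
            yield path
        for relative in RELATIVES.get(person, []):
            if relative not in visited:
                visited.add(relative)
                queue.append(path + [relative])


paths = alumni_by_distance("Arnold", "Harvard")
closest = next(paths)       # the nearest Harvard connection
next_closest = next(paths)  # continuing yields a more distant one
```

Because the function is a generator, the caller decides how far to keep exploring: it can take only the first path or keep asking for more distant ones.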

Results:

What is the root-cause problem within an IoT/DigitalTwin graph of a process plant?

Use Case: An IoT-connected DigitalTwin KnowledgeGraph shows an out-of-kilter measurement point. To assist root-cause diagnosis, the engineer wants to quickly assemble all relevant information. This means following the process flow path upstream and identifying the equipment and associated measurement points along it.

Available data: A DigitalTwin KnowledgeGraph contains a PFD/P&ID model with measurement sources, pipes, lines, vessels, valves, and associated connectivity, including the process flow of material. The model used is the Tennessee Eastman plant described by J.J. Downs and E.F. Vogel, Computers chem. Engng, Vol. 17, No. 3, pp. 245-255. The IoT data is pulled in on demand by the IntelligentGraph.

Solution:  Access the KnowledgeGraph using the IntelligentGraph, and express the path query using PathQL as follows to discover all relevant measurement points upstream, in a process-flow sense, of the out-of-kilter measurement point. 

PathQL can ‘walk’ back through the process flow any number of steps, starting with the plant item with an attribute provided by the out-of-kilter measurement point. As it walks back through the flowsheet, it identifies the attributes of the plant items it encounters, and for each attribute the associated measurement point that serves as the provider of the attribute. 
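The upstream walk can be sketched in Python as a level-by-level traversal that collects the measurement behind each attribute it meets. This is an illustration only, not PathQL. The `FEEDS` and `MEASUREMENTS` tables, the equipment names, and the instrument tags are all hypothetical and are not taken from the Tennessee Eastman model. `upstream_measurements` is likewise an assumed name.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical flowsheet fragment: plant item -> items that feed it.
FEEDS: Dict[str, List[str]] = {
    "Separator": ["Condenser"],
    "Condenser": ["Reactor"],
    "Reactor": ["Feed_A", "Compressor"],
    "Compressor": ["Separator"],  # recycle loop back to the separator
}
# Hypothetical attributes of each plant item and the measurement providing each one.
MEASUREMENTS: Dict[str, Dict[str, str]] = {
    "Separator": {"level": "LI-12", "pressure": "PI-13"},
    "Reactor": {"temperature": "TI-9"},
    "Feed_A": {"flow": "FI-1"},
    "Compressor": {"power": "JI-20"},
}


def upstream_measurements(
    item: str, max_steps: Optional[int] = None
) -> List[Tuple[int, str, str, str]]:
    """Return (steps upstream, plant item, attribute, measurement) tuples."""
    found: List[Tuple[int, str, str, str]] = []
    visited = {item}
    frontier = [item]
    steps = 0
    while frontier and (max_steps is None or steps <= max_steps):
        next_frontier: List[str] = []
        for current in frontier:
            for attribute, tag in sorted(MEASUREMENTS.get(current, {}).items()):
                found.append((steps, current, attribute, tag))
            for source in FEEDS.get(current, []):
                if source not in visited:  # recycle streams would otherwise loop forever
                    visited.add(source)
                    next_frontier.append(source)
        frontier = next_frontier
        steps += 1
    return found


everything = upstream_measurements("Separator")
nearby = upstream_measurements("Separator", max_steps=2)
```

The `visited` set matters for process plants in particular, because recycle streams make the flowsheet cyclic. The optional `max_steps` limit mirrors the idea of walking back only a chosen number of steps.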

Note that PathQL supports first-order predicate expressions, such as “:aObjectProperty”. It also supports reified predicate expressions, such as “@:aReifiedPredicate”, where the reification type is rdf:Statement. Additionally, it supports reifications whose type is a subclass of rdf:Statement, such as “:Transference@:ProcessFlow”, where :Transference is an rdfs:subClassOf rdf:Statement.

Results: Returns a set of related measurement points, together with the process-flow path from the out-of-kilter measurement point. Since these points are all upstream of the process problem, they should immediately help diagnose its root cause.

Availability of IntelligentGraph and PathQL

The IntelligentGraph SAIL offers an extended capability for embedded calculation support within any RDF graph. When enabled as an RDF4J SAIL, it offers calculation functionality as part of the RDF4J engine, on top of any RDF4J repository, using a variety of script engines including JavaScript, Jython, and Groovy. It preserves the SPARQL capability of RDF4J, but with additional capabilities for calculation debugging and tracing. 

IntelligentGraph includes the PathQL query language. Just as a spreadsheet cell calculation needs to access other cells, an IntelligentGraph calculation needs to access other nodes within the graph. Although full access to the underlying graph is available to any of the scripts, PathQL provides a succinct and efficient method to access directly or indirectly related nodes. PathQL can return either just the contents of the referenced nodes, or the contents and the path to the referenced nodes.

PathQL can also be used standalone to query the IntelligentGraph-enabled RDF database. This supplements, rather than replaces, SPARQL and GraphQL, as it provides graph-path querying rather than graph-pattern querying capabilities to any IntelligentGraph-enabled RDF database.
