{"id":916,"date":"2017-04-27T13:44:05","date_gmt":"2017-04-27T18:44:05","guid":{"rendered":"http:\/\/inova8.com\/bg_inova8.com\/?p=916"},"modified":"2019-04-23T02:30:33","modified_gmt":"2019-04-23T07:30:33","slug":"linked-enterprise-data-led-how-to-create-an-information-shopping-bazaar","status":"publish","type":"post","link":"https:\/\/inova8.com\/bg_inova8.com\/linked-enterprise-data-led-how-to-create-an-information-shopping-bazaar\/","title":{"rendered":"Linked Enterprise Data (LED): how to create an Information Shopping Bazaar"},"content":{"rendered":"<div class=\"boldgrid-section\">\n<div class=\"container\">\n<div class=\"row\">\n<div class=\"col-md-12 col-xs-12 col-sm-12\">\n<h1>Integration Problem to be solved<\/h1>\n<ul>\n<li>Data is spread across different databases, even with Linked Open Data sources.<\/li>\n<li>Misaligned models: different datasets assign different meanings to classes and predicates, which need to be aligned.<\/li>\n<li>Misaligned names for the same concepts.<\/li>\n<li>Replication is problematic.<\/li>\n<li>Query definition and scope of querying are difficult to define in advance.<\/li>\n<li>Provenance of data is necessary.<\/li>\n<li>Cannot depend on inferences being available in advance.<\/li>\n<li>Scalable architecture requires that all queries are stateless.<\/li>\n<\/ul>\n<h1>Data Cathedrals versus Information Shopping Bazaars<\/h1>\n<p>Linked Open Data has been growing since 2007 from a few (12) interconnected datasets to 295 as of 2011, and it continues to grow. 
To quote: \u201cLinked Data is about using the Web to connect related data that wasn&#8217;t previously linked, or using the Web to lower the barriers to linking data currently linked using other methods.\u201d&nbsp;(Linked Data, n.d.)&nbsp;<\/p>\n<p><img loading=\"lazy\" class=\" wp-image-921 aligncenter\" src=\"http:\/\/inova8.com\/bg_inova8.com\/wp-content\/uploads\/2017\/04\/linked-enterprise-data.figure-1.png\" alt=\"\" width=\"834\" height=\"374\" srcset=\"https:\/\/inova8.com\/bg_inova8.com\/wp-content\/uploads\/2017\/04\/linked-enterprise-data.figure-1.png 5685w, https:\/\/inova8.com\/bg_inova8.com\/wp-content\/uploads\/2017\/04\/linked-enterprise-data.figure-1-300x135.png 300w, https:\/\/inova8.com\/bg_inova8.com\/wp-content\/uploads\/2017\/04\/linked-enterprise-data.figure-1-768x345.png 768w, https:\/\/inova8.com\/bg_inova8.com\/wp-content\/uploads\/2017\/04\/linked-enterprise-data.figure-1-1024x459.png 1024w\" sizes=\"(max-width: 834px) 100vw, 834px\" \/><\/p>\n<p style=\"text-align: center;\">Figure 1: Growth of the Linked Data &#8216;Cloud&#8217;<\/p>\n<p>As impressive as the growth of interconnected datasets is, what is more important is the value of that interconnected data. A corollary of Metcalfe\u2019s law suggests that the benefit gained from integrated information grows geometrically<a href=\"#_ftn1\" name=\"_ftnref1\">[1]<\/a>&nbsp;with the number of data communities that are integrated.<\/p>\n<p>Many organizations have their own icebergs of information: operations, sales, marketing, personnel, legal, finance, research, maintenance, CRM, document vaults, etc.&nbsp;(Lawrence, 2012). Over the years there have been various attempts to melt the boundaries between these icebergs, including the creation of a mother-of-all database that houses (or replicates) all information, or the replacement of disparate applications, each with its own database, by a mother-of-all application that eliminates the separate databases. 
Neither of these has really succeeded in unifying any or all data within an organization.&nbsp;(Lawrence, Data cathedrals versus information bazaars?, 2012). The result is a \u2018Data Cathedral\u2019 through which users have no way to navigate to find the information that will answer their questions.<\/p>\n<p><img loading=\"lazy\" class=\"wp-image-922 aligncenter\" src=\"http:\/\/inova8.com\/bg_inova8.com\/wp-content\/uploads\/2017\/04\/linked-enterprise-data.figure-2.png\" alt=\"\" width=\"603\" height=\"676\" srcset=\"https:\/\/inova8.com\/bg_inova8.com\/wp-content\/uploads\/2017\/04\/linked-enterprise-data.figure-2.png 3311w, https:\/\/inova8.com\/bg_inova8.com\/wp-content\/uploads\/2017\/04\/linked-enterprise-data.figure-2-268x300.png 268w, https:\/\/inova8.com\/bg_inova8.com\/wp-content\/uploads\/2017\/04\/linked-enterprise-data.figure-2-768x860.png 768w, https:\/\/inova8.com\/bg_inova8.com\/wp-content\/uploads\/2017\/04\/linked-enterprise-data.figure-2-914x1024.png 914w\" sizes=\"(max-width: 603px) 100vw, 603px\" \/><\/p>\n<p style=\"text-align: center;\">Figure 2: Users have no way to navigate through the Enterprise\u2019s Data Cathedral<\/p>\n<h1>Remediator at the heart of Linked Enterprise Data<\/h1>\n<p>Can we create an information shopping bazaar for users to answer their questions without committing heresy in the Data Cathedral?&nbsp; Can we create the same information shopping bazaar as Linked Data within the Enterprise: Linked Enterprise Data (LED)? 
That is the objective of Remediator.<\/p>\n<p>First of all, we must recognize that the enterprise will have many structured, aggregated, and unstructured data stores already in place:<\/p>\n<p><img loading=\"lazy\" class=\"wp-image-923 aligncenter\" src=\"http:\/\/inova8.com\/bg_inova8.com\/wp-content\/uploads\/2017\/04\/linked-enterprise-data.figure-3.png\" alt=\"\" width=\"587\" height=\"307\" srcset=\"https:\/\/inova8.com\/bg_inova8.com\/wp-content\/uploads\/2017\/04\/linked-enterprise-data.figure-3.png 4718w, https:\/\/inova8.com\/bg_inova8.com\/wp-content\/uploads\/2017\/04\/linked-enterprise-data.figure-3-300x157.png 300w, https:\/\/inova8.com\/bg_inova8.com\/wp-content\/uploads\/2017\/04\/linked-enterprise-data.figure-3-768x401.png 768w, https:\/\/inova8.com\/bg_inova8.com\/wp-content\/uploads\/2017\/04\/linked-enterprise-data.figure-3-1024x535.png 1024w\" sizes=\"(max-width: 587px) 100vw, 587px\" \/><\/p>\n<p style=\"text-align: center;\">Figure 3: Enterprise Structured, Aggregated, and Unstructured Data Icebergs<\/p>\n<p>One of the keys to the ability of Linked Data to interlink 300+ datasets is that they are all expressed as RDF. The enterprise does not have the luxury of replicating all existing data into RDF datasets. However, that is not necessary (although still sometimes desirable) because there are adapters that can make existing datasets look as if they contain RDF, accessible via a SPARQL endpoint. 
Examples are listed below:<\/p>\n<ol>\n<li>D2RQ: (D2RQ: Accessing Relational Databases as Virtual RDF Graphs)<\/li>\n<li>Ultrawrap: (Research in Bioinformatics and Semantic Web\/Ultrawrap)<\/li>\n<li>Ontop: (-ontop- is a platform to query databases as Virtual RDF Graphs using SPARQL)<\/li>\n<\/ol>\n<p>Attaching these adapters to existing data stores, or replicating existing data into a triple store, takes us one step closer to Linked Enterprise Data:<\/p>\n<p><img loading=\"lazy\" class=\"wp-image-924 aligncenter\" src=\"http:\/\/inova8.com\/bg_inova8.com\/wp-content\/uploads\/2017\/04\/linked-enterprise-data.figure-4.png\" alt=\"\" width=\"602\" height=\"314\" srcset=\"https:\/\/inova8.com\/bg_inova8.com\/wp-content\/uploads\/2017\/04\/linked-enterprise-data.figure-4.png 4719w, https:\/\/inova8.com\/bg_inova8.com\/wp-content\/uploads\/2017\/04\/linked-enterprise-data.figure-4-300x157.png 300w, https:\/\/inova8.com\/bg_inova8.com\/wp-content\/uploads\/2017\/04\/linked-enterprise-data.figure-4-768x401.png 768w, https:\/\/inova8.com\/bg_inova8.com\/wp-content\/uploads\/2017\/04\/linked-enterprise-data.figure-4-1024x534.png 1024w\" sizes=\"(max-width: 602px) 100vw, 602px\" \/><\/p>\n<p style=\"text-align: center;\">Figure 4: Enterprise Data Cloud, the first step to integration<\/p>\n<p>Of course, now that we have harmonized all the data as RDF accessible via SPARQL endpoints, we can view this as an extension of the Linked Data cloud in which we provide enterprise users access to both enterprise and public data:<\/p>\n<p><img loading=\"lazy\" class=\" wp-image-925 aligncenter\" src=\"http:\/\/inova8.com\/bg_inova8.com\/wp-content\/uploads\/2017\/04\/linked-enterprise-data.figure-5.png\" alt=\"\" width=\"1001\" height=\"660\" srcset=\"https:\/\/inova8.com\/bg_inova8.com\/wp-content\/uploads\/2017\/04\/linked-enterprise-data.figure-5.png 5848w, https:\/\/inova8.com\/bg_inova8.com\/wp-content\/uploads\/2017\/04\/linked-enterprise-data.figure-5-300x198.png 300w, 
https:\/\/inova8.com\/bg_inova8.com\/wp-content\/uploads\/2017\/04\/linked-enterprise-data.figure-5-768x507.png 768w, https:\/\/inova8.com\/bg_inova8.com\/wp-content\/uploads\/2017\/04\/linked-enterprise-data.figure-5-1024x675.png 1024w\" sizes=\"(max-width: 1001px) 100vw, 1001px\" \/><\/p>\n<p style=\"text-align: center;\">Figure 5: Enterprise Data Cloud and Linked Data cloud<\/p>\n<p>We are now closer to the information shopping bazaar, since users would, given appropriate discovery and searching user interfaces, be able to navigate their own way through this data cloud.&nbsp; However, despite the harmonization of the data into RDF, we still have not provided a means for users to ask new questions such as:<\/p>\n<table style=\"height: 5px;\" width=\"611\">\n<tbody style=\"padding-left: 60px;\">\n<tr>\n<td>\n<p style=\"text-align: left;\">What <strong>Company<\/strong> (and their fiscal structure) are we working with that have a <strong>Business Practise <\/strong>of type <strong>Maintenance<\/strong> for the target industry of <strong>Oil and Gas <\/strong>with a supporting technology based on <strong>Vendor-Application<\/strong> and this <strong>Application<\/strong> is not similar to any of our <strong>Application<\/strong>?<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Such questions require pulling information from many different sources within an organization. Even with the Enterprise Data Cloud, one has only provided the capability to discover such answers. Would it not be better to allow a user to ask such a question, and let the Linked Enterprise Data determine from where it should pull partial answers, which it can then aggregate into the complete answer to the question? 
It is like asking a team of people to answer a complex question, each contributing their own part, and then assembling the overall answer, rather than relying on a single guru.&nbsp; Remediator has the role of that team, taking each part of the question and asking it of the relevant data source.<\/p>\n<p><img loading=\"lazy\" class=\" wp-image-919 aligncenter\" src=\"http:\/\/inova8.com\/bg_inova8.com\/wp-content\/uploads\/2017\/04\/linked-enterprise-data.figure-6.png\" alt=\"\" width=\"754\" height=\"552\" srcset=\"https:\/\/inova8.com\/bg_inova8.com\/wp-content\/uploads\/2017\/04\/linked-enterprise-data.figure-6.png 5959w, https:\/\/inova8.com\/bg_inova8.com\/wp-content\/uploads\/2017\/04\/linked-enterprise-data.figure-6-300x220.png 300w, https:\/\/inova8.com\/bg_inova8.com\/wp-content\/uploads\/2017\/04\/linked-enterprise-data.figure-6-768x563.png 768w, https:\/\/inova8.com\/bg_inova8.com\/wp-content\/uploads\/2017\/04\/linked-enterprise-data.figure-6-1024x751.png 1024w\" sizes=\"(max-width: 754px) 100vw, 754px\" \/><\/p>\n<p style=\"text-align: center;\">Figure 6: Remediator as the Common Entry Point to Linked Enterprise Data (LED)<\/p>\n<p>Thus our question can be decomposed into:<\/p>\n<table style=\"height: 140px;\" width=\"682\">\n<tbody>\n<tr>\n<td>\n<ol>\n<li>What <strong>Business Practise <\/strong>of type <strong>Maintenance<\/strong> for the target industry of <strong>Oil and Gas<\/strong>?<\/li>\n<li>What <strong>Company<\/strong> are we working with?<\/li>\n<li>What <strong>Company<\/strong> have a <strong>Business Practise <\/strong>of type <strong>Maintenance<\/strong>?<\/li>\n<li>What <strong>Business Practise <\/strong>with a supporting technology based on <strong>Vendor-Application<\/strong>?<\/li>\n<li>What <strong>Company<\/strong> (and their fiscal structure)?<\/li>\n<li>What <strong>Vendor-Application<\/strong> and this <strong>Application <\/strong>is not similar to any of our 
<strong>Application<\/strong>?<\/li>\n<\/ol>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>This decomposition of a question into sub-questions relevant to each dataset is automated by Remediator:<\/p>\n<p><img loading=\"lazy\" class=\"wp-image-920 aligncenter\" src=\"http:\/\/inova8.com\/bg_inova8.com\/wp-content\/uploads\/2017\/04\/linked-enterprise-data.figure-7.png\" alt=\"\" width=\"838\" height=\"607\" srcset=\"https:\/\/inova8.com\/bg_inova8.com\/wp-content\/uploads\/2017\/04\/linked-enterprise-data.figure-7.png 5959w, https:\/\/inova8.com\/bg_inova8.com\/wp-content\/uploads\/2017\/04\/linked-enterprise-data.figure-7-300x217.png 300w, https:\/\/inova8.com\/bg_inova8.com\/wp-content\/uploads\/2017\/04\/linked-enterprise-data.figure-7-768x556.png 768w, https:\/\/inova8.com\/bg_inova8.com\/wp-content\/uploads\/2017\/04\/linked-enterprise-data.figure-7-1024x742.png 1024w\" sizes=\"(max-width: 838px) 100vw, 838px\" \/><\/p>\n<p style=\"text-align: center;\">Figure 7: Sub-Questions distributed to datasets for answers<\/p>\n<h1>Requirements for a Linked Enterprise Data Architecture<\/h1>\n<ul>\n<li>Keep it simple.<\/li>\n<li>Do not re-invent that which already exists.<\/li>\n<li>Eliminate replication where possible.<\/li>\n<li>Avoid the need for prior inferencing.<\/li>\n<li>Efficient query performance.<\/li>\n<li>Provide provenance of results.<\/li>\n<li>Provide optional caching for further slicing and dicing of result-set.<\/li>\n<li>Use VoID, only VoID, and nothing but VoID to drive the query.<\/li>\n<\/ul>\n<p style=\"padding-left: 30px;\"><a href=\"#_ftnref1\" name=\"_ftn1\">[1]<\/a> &nbsp;If I have 10 database systems running my business that are entirely disconnected, then the benefits are 10 * K, where K is some constant. If I integrate these databases in pairs (operations + accounting, accounting + payroll, etc.), then the benefits increase to 10 * K * 2. 
If I integrate in threes (operations + accounting + maintenance, accounting + payroll + receiving, etc.), then the benefits increase four-fold (a corollary of Metcalfe&#8217;s law) to 10 * K * 4. For quad-wise integration my benefits would be 10 * K * 8, and so on. Now it might not be 8-fold, but the point is that there is a geometric, not linear, growth in benefits as I integrate all of my information across my organization.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Integration Problem to be solved Data in different databases, even with Linked Open data sources. Misaligned models, different datasets have different meanings for classes and predicates that need to be aligned. Misaligned names for the same concepts. Replication is problematical. Query definition and scope of querying difficult to define in advance. Provenance of data necessary. [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"bgseo_title":"Linked Enterprise Data (LED): how to create an Information Shopping 
Bazaar","bgseo_description":"","bgseo_robots_index":"index","bgseo_robots_follow":"follow","_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0},"categories":[21,19,15,18],"tags":[48,43,41,40,49,35,44,38,42,36,37,39,45,34,47,46],"_links":{"self":[{"href":"https:\/\/inova8.com\/bg_inova8.com\/wp-json\/wp\/v2\/posts\/916"}],"collection":[{"href":"https:\/\/inova8.com\/bg_inova8.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/inova8.com\/bg_inova8.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/inova8.com\/bg_inova8.com\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/inova8.com\/bg_inova8.com\/wp-json\/wp\/v2\/comments?post=916"}],"version-history":[{"count":5,"href":"https:\/\/inova8.com\/bg_inova8.com\/wp-json\/wp\/v2\/posts\/916\/revisions"}],"predecessor-version":[{"id":928,"href":"https:\/\/inova8.com\/bg_inova8.com\/wp-json\/wp\/v2\/posts\/916\/revisions\/928"}],"wp:attachment":[{"href":"https:\/\/inova8.com\/bg_inova8.com\/wp-json\/wp\/v2\/media?parent=916"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/inova8.com\/bg_inova8.com\/wp-json\/wp\/v2\/categories?post=916"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/inova8.com\/bg_inova8.com\/wp-json\/wp\/v2\/tags?post=916"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}