
5 March 2018

Types of RDF Storage

Native
  • Main Memory-based
  • Disk-based
Non-native
  • RDBMS
    • Schema-based
      • Vertical partitioning
      • Hierarchical property table
      • Property table
    • Schema-free
      • Triple table (see the SQL sketch below)
  • NoSQL
    • Key-value
    • Column Family
    • Document store
    • Graph database
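
To make the schema-based versus schema-free distinction concrete, here is a minimal sketch in Python using the standard sqlite3 module; the table layouts and the ex:/foaf: names are illustrative, not tied to any particular RDF store.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Schema-free "triple table": one generic table holds every statement.
con.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")
con.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("ex:alice", "foaf:name", "Alice"),
        ("ex:alice", "foaf:mbox", "mailto:alice@example.org"),
    ],
)

# Schema-based "property table": one column per frequent predicate,
# so fetching a subject's properties avoids self-joins.
con.execute("CREATE TABLE person (subject TEXT PRIMARY KEY, name TEXT, mbox TEXT)")
con.execute(
    "INSERT INTO person VALUES (?, ?, ?)",
    ("ex:alice", "Alice", "mailto:alice@example.org"),
)

# The same question in both layouts: the triple table needs a self-join.
rows = con.execute("""
    SELECT t1.object, t2.object
    FROM triples t1 JOIN triples t2 ON t1.subject = t2.subject
    WHERE t1.predicate = 'foaf:name' AND t2.predicate = 'foaf:mbox'
""").fetchall()
print(rows)

print(con.execute("SELECT name, mbox FROM person").fetchall())
```

The trade-off shown here is the usual one: the triple table accepts any shape of data, while the property table answers common queries without joins but needs a schema decided in advance.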

20 April 2017

Linked Data Patterns

Semantic Web Meetup Course

semantic web london
metadataconsulting

  • Introduction to Semantic Web standards and Linked data technologies 
  • Resource Description Framework 
  • Graph-based data model representation and core concepts 
  • Terse RDF Triple Language 
  • Advanced RDF features 
  • Best practices on publishing RDF data 
  • RDF Schema (RDFS) 
  • Discussion of the added value of a schema driven by examples 
  • Syntax of the core features: classes, properties and their characteristics 
  • Relationships between RDFS vocabulary elements 
  • Computing answers to typical queries over RDFS datasets 
  • Using Protege for modeling and querying RDFS datasets 
  • Limitations of RDFS 
  • Querying Semantic Web with SPARQL 
  • Core concepts 
  • Basic graph patterns 
  • Querying datasets with the SPARQL engine StarDog 
  • Filters and SPARQL expressions 
  • Property path expressions 
  • Complex graph patterns with advanced features such as optional parts, aggregation and ordering (see the query sketch after this list)
  • Other query types
  • Updating with SPARQL 
  • OWL Web Ontology Language  
  • Core concepts and differences to RDFS 
  • Overview of OWL modeling constructs
  • Modeling and assessing the benefits of alternative models in a particular application context
  • Substitutability of modeling constructs
  • Discussion of the trade-off between the expressivity of modeling languages and the computational efficiency of querying 
  • OWL profiles 
  • Limitations of the expressive power of OWL 
  • Applications of Semantic Technologies in Practice
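
As a taste of the query material above, here is a self-contained sketch using the Python rdflib library rather than Stardog itself; the dataset, prefixes, and names are invented, but the query shows a basic graph pattern, a FILTER expression, and a property path that any SPARQL 1.1 engine should accept.

```python
from rdflib import Graph

# A tiny invented dataset in Turtle (Terse RDF Triple Language).
data = """
@prefix ex:   <http://example.org/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

ex:alice foaf:name "Alice" ; foaf:knows ex:bob .
ex:bob   foaf:name "Bob"   ; foaf:knows ex:carol .
ex:carol foaf:name "Carol" .
"""

g = Graph()
g.parse(data=data, format="turtle")

# A basic graph pattern combining a property path (one or more
# foaf:knows hops from Alice) with a FILTER expression.
query = """
PREFIX ex:   <http://example.org/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name WHERE {
    ex:alice foaf:knows+ ?person .
    ?person foaf:name ?name .
    FILTER (?name != "Bob")
}
"""

for row in g.query(query):
    print(row.name)  # Carol: reachable transitively, not filtered out
```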

22 February 2017

Outstanding Ontologies

Ontologies come in different types, ranging from knowledge representation and top-level ontologies to linguistic and domain ontologies. A selection of examples from each type is provided below.

Knowledge Representation Ontologies:
Frame Ontology
OKBC

Top-Level Ontologies: 
Cyc
SOWA
Standard Upper Ontology

Linguistic Ontologies: 
Wordnet
Generalized Upper Model
Sensus
Eurowordnet
Mikrokosmos

Ecommerce Ontologies (Domain Ontology): 
United Nations Standard Products and Services Code (UNSPSC)
North American Industry Classification System
Standard Classification of Transported Goods
eCl@ss
RosettaNet

Medical Ontologies (Domain Ontology):
GALEN
UMLS
ON9

Engineering Ontologies (Domain Ontology):
EngMath
PhysSys

Enterprise Ontologies (Domain Ontology):
Enterprise Ontology
TOVE

Chemistry Ontologies (Domain Ontology):
Chemicals
Ions
Environmental Pollutants

Knowledge Management Ontologies (Domain Ontology):
KA Ontology - Project, Organization, Person, Publication, Event, Research-Topic, Research-Product

Nature.com Subjects Ontologies

5 September 2016

SKOS

SKOS is a widely used data model for representing knowledge in the form of thesauri and controlled vocabularies, which can be interlinked into knowledge graphs as a form of linked data. SKOS itself is a lightweight, flexible OWL ontology and can be serialized in any RDF syntax, whereas OWL is a full ontology language; it is possible to convert SKOS models to richer OWL ontologies and even to combine the two. The links below list related tools and libraries for working with SKOS models.
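
As a minimal illustration of the SKOS data model, the following sketch builds a tiny two-concept scheme with Python's rdflib, which ships a SKOS namespace; the ex: vocabulary and its labels are invented.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, SKOS

EX = Namespace("http://example.org/vocab/")

g = Graph()
g.bind("skos", SKOS)
g.bind("ex", EX)

# A scheme with a broader/narrower pair, the core thesaurus relations in SKOS.
g.add((EX.scheme, RDF.type, SKOS.ConceptScheme))
g.add((EX.animals, RDF.type, SKOS.Concept))
g.add((EX.animals, SKOS.prefLabel, Literal("Animals", lang="en")))
g.add((EX.animals, SKOS.inScheme, EX.scheme))
g.add((EX.cats, RDF.type, SKOS.Concept))
g.add((EX.cats, SKOS.prefLabel, Literal("Cats", lang="en")))
g.add((EX.cats, SKOS.broader, EX.animals))
g.add((EX.animals, SKOS.narrower, EX.cats))

print(g.serialize(format="turtle"))
```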

JSKOS
SKOSAPI
OWLAPI
SKOSEd
OpenSKOS
TemaTres
THManager
PoolParty
TopBraid
Thesaurus Master
Lexaurus
Fluent Editor
Intelligent Topic Manager
SKOS2OWL
Protege
SKOSIFY
Poolparty Consistency Checker
KEA
SKOSMOS
SILK

W3C SKOS
SKOS: A Guide for Information Professionals
SKOS Taxonomy
The Accidental Taxonomist
Knowledge Engineering with Semantic Web Technologies
LinkedData Engineering
PoolParty Academy
Gate
Ontotext
Knowledge Extraction
Taxonomy Warehouse
Synaptica

17 May 2016

Graph Comparison

Analytical

Type | Backend | Supported Frameworks | Context of Use
Giraph | Hadoop/HDFS | Spark/Hadoop | Data processing for analytics
GraphX | Titan, Neo4J, HDFS | Spark | Data processing for analytics (in-memory)
GraphLab | Hadoop/HDFS | Spark/Hadoop | Data processing for analytics, using the PowerGraph and GAS models

Operational

Type | Backend | Supported Frameworks | Context of Use
Cayley | MongoDB or LevelDB | Custom implementation in Go | Knowledge graphs
Titan | Cassandra, HBase, HDFS | TinkerPop, RDF, SPARQL | Massive knowledge graphs, OLAP/OLTP (now part of DataStax)
Neo4J | Custom | TinkerPop | Data visualization, web browsing, portfolio analytics, gene sequencing, mobile social applications
OrientDB | Custom | TinkerPop, RDF, SPARQL | Embedded and standalone, knowledge graphs, multi-model (document + graph)

Semantic

Type | Backend | Supported Frameworks | Context of Use
Blazegraph and MapGraph | Custom | Sesame, RDF, SPARQL, TinkerPop | Massive knowledge graphs on GPU; supports the W3C Semantic Web standards (used by Wikidata, a Wikimedia project)
Stardog | Custom | RDF, SPARQL | Semantic data use cases in the cloud (third party)
OntoText GraphDB | Custom | Sesame, Jena, RDF, SPARQL | Optimized as a semantic graph database based on the W3C Semantic Web standards (used by the BBC, Euromoney, the Financial Times, etc.)
Virtuoso | Custom/hybrid | Sesame, Jena, RDF, SPARQL | Optimized as a semantic graph database based on the W3C Semantic Web standards (used by DBpedia)
AllegroGraph | Custom | Sesame, RDF, SPARQL | Optimized as a semantic graph database based on the W3C Semantic Web standards
OpenCog | Custom | Semantic knowledge | Massive artificial general intelligence graph knowledge base

OLTP/Graph Databases
OLTP/Analytical Databases
Graph Database as a Service
Native Semantic Graph Databases
Graph Query / Interfaces
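
Several of the stores above expose public SPARQL endpoints. As a small illustration, here is a sketch querying DBpedia's endpoint (served by Virtuoso, as noted above) with the Python SPARQLWrapper library; endpoint availability and result shapes can vary, so treat it as illustrative rather than definitive.

```python
from SPARQLWrapper import SPARQLWrapper, JSON

# DBpedia's public endpoint, backed by Virtuoso as noted above.
sparql = SPARQLWrapper("https://dbpedia.org/sparql")
sparql.setReturnFormat(JSON)
sparql.setQuery("""
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?label WHERE {
        <http://dbpedia.org/resource/Semantic_Web> rdfs:label ?label .
        FILTER (lang(?label) = "en")
    }
""")

results = sparql.query().convert()
for binding in results["results"]["bindings"]:
    print(binding["label"]["value"])
```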

8 November 2014

Semantic Representation

Representing semantic data is a computationally expensive process, with a lot of embedded metadata needed to build semantically contextual graphs, and that representation comes at a storage and processing cost. XML has long been the more complete representation option, on the basis of which other standards have been developed, but the introduction of JSON-LD provides further flexibility. Unfortunately, the flexibility of semantic data processing can come at the cost of fidelity, which may be lost during content negotiation and format conversion. JSON-LD may be a plausible option for exchange, but storing RDF in its native, XML-compatible form remains preferable. RDF is, however, quite a memory-intensive representation format with its own processing requirements, and even viewing RDF from a property graph perspective may not be sufficient. Triple stores, and even quad stores, have remained the best storage option to date, though such options still present vendor lock-in issues at times.

Although RDF and the semantic web have come a long way, much remains to be done both in terms of standardization and in terms of better distributed semantic graph storage. Semantic integration, a core aspect of Linked Data requirements, likewise needs more standardization and advancement. JSON-LD appears to be a useful option for lightweight front-end client processing, yet it has some fundamental limitations in comparison to RDF, which raises the question of why the W3C gave up on the idea of RDF/JSON standardization. Ultimately this is a question of what matters more to the semantic web community and to a given application context: machine-readable or human-readable representation. Nonetheless, the core representation format for semantic web storage, in most domain contexts, should really be maintained in the native form of RDF/XML and its derivatives, for obvious reasons.
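
To see the trade-off concretely, this sketch serializes one small invented graph both ways with Python's rdflib; note that the JSON-LD serializer ships with rdflib 6+, while older versions need a separate plugin.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import FOAF, RDF

EX = Namespace("http://example.org/")

g = Graph()
g.add((EX.alice, RDF.type, FOAF.Person))
g.add((EX.alice, FOAF.name, Literal("Alice")))

# Same triples, two wire formats: the native RDF/XML form and the
# lighter JSON-LD form often preferred by front-end clients.
print(g.serialize(format="xml"))
print(g.serialize(format="json-ld"))
```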

6 November 2014

Metadata Standards

Library and book publishing metadata standards have come a long way, yet they remain in a state of flux and evolution as cataloging and publishing take on emerging new forms that call for further standardization and universal interpretation. Data science and data mining are also providing new ways of harnessing information and knowledge about the classification of both data and content. Metadata, however, remains the key to differentiating and exposing data through all its transformations. XML is often seen as the mainstream format for most metadata standards, but JSON and RDF have also emerged as strong contenders for developing more flexible and universal standard formats. Metadata falls into three fundamental types: administrative, descriptive, and structural. The following handbook provides further details on book publishing structures and evolving metadata trends.

5 October 2014

Semantic Certifications

Getting certified in a particular technology is a debatable topic. For some employers it is a point in a candidate's favor; for others it holds very little value, and certifications can also become outdated quickly. Perhaps getting certified in the concepts matters more than a technology certification tied to a specific version. Semantic Web technologies move slowly, as they go through an extensive specification-driven process, and they are rarely taught formally at university, let alone for certification. Yet the Semantic Web is growing in popularity as industry sees remarkable benefits in contextualizing data and information on the web across a wide variety of use cases. The Semsphere certifications are one unique starting point, providing solid grounding in the area with a rigorous exam. The certifications of interest to most developers come at two levels, Specialist and Professional, with a third level aimed primarily at trainers. The two levels broadly cover the core areas of interest and technologies in the Semantic Web.

23 June 2014

EAV vs SPO

Knowledge representation has been around for a long time, within database architecture but also within abstractions of domain logic, especially in analytics. Various formalisms have been defined to represent such knowledge and derive a more structured logic for learning and reasoning from data. Databases have historically provided the persistence layer for many applications and usually hold the key to unlocking much of today's data, but good data representation is what allows for versatility, performance, and the unlocking of hidden knowledge. Entity-Attribute-Value (EAV) and Subject-Predicate-Object (SPO) are similar modelling approaches: in both cases one works with relationships as 3-tuples, though EAV can be viewed as a subset of SPO. Often, people who are unfamiliar with SPO and more comfortable with relational model design end up using EAV as an extension of the relational model. In most cases, however, EAV is seen as an anti-pattern, leading to longer development times, poor utilization of data, and undesirably complex queries. In SPO, everything is treated as a resource and extended into a graph representation, which also allows better reasoning capabilities for inference. SPO also lends itself very well to the architecture of the web, utilizing URI schemes in a linked context as an extension of the RESTful approach of standard HTTP methods. Where EAV tries to build richer metadata semantics through a schema taxonomy bolted onto the relational model, SPO is closer in spirit to ontologies, taking the form of an extensible linked data schema for a domain context that is not only web friendly but also machine readable. SPO is also linguistically inspired by natural language.
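
The contrast is easiest to see side by side. Here is a small sketch, with an invented schema and data, that records the same fact first as an EAV row in SQLite and then as an SPO triple with Python's rdflib.

```python
import sqlite3
from rdflib import Graph, Literal, Namespace

# EAV: one generic row per attribute, bolted onto the relational model.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE eav (entity TEXT, attribute TEXT, value TEXT)")
con.execute("INSERT INTO eav VALUES ('patient:42', 'blood_type', 'O+')")
print(con.execute("SELECT * FROM eav").fetchall())

# SPO: the same fact as a triple, where the entity and the attribute
# are themselves URI-addressable web resources.
EX = Namespace("http://example.org/")
g = Graph()
g.add((EX["patient/42"], EX.bloodType, Literal("O+")))
print(g.serialize(format="nt"))
```

The EAV row is just strings in a table; the SPO triple makes the entity and attribute dereferenceable resources, which is what enables linking and inference.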

eav and spo
eav/cr model
rdf
SPO
considerations for eav
considerations for modelling eav for biomedical databases
magento eav
eav talk
understanding linked data via eav model

14 May 2014

Open Annotations

The Open Annotations Community is an interesting collaborative group defining standards and specifications for interoperable, extensible annotations that can be shared across multiple application, device, and service domains. The open approach aims to maximize accessibility with unfettered access and to allow the addition of new techniques, while remaining compatible with standard publish/subscribe models, although it does not define a specific protocol for such interactions. The effort is designed to work with the simplicity of the distributed architecture of the web. As a semantic web standard, an annotation is approached from the viewpoint of an RDF graph serialization. The design stipulates a body with one or more targets, each of which can be defined as a URI resource, although some annotations may omit the body. Each resource then carries its own distinct metadata and provenance information, with any relevant media type that can be dereferenced, and additional representations can be defined or resolved via content negotiation as resources change. Extensive use cases are available for open annotations, and the community, which is very active, has a draft specification in place for the data model.
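
As a rough sketch of that body-plus-target shape, the following Python/rdflib snippet uses the http://www.w3.org/ns/oa# vocabulary from the draft data model; the annotation, body, and target URIs are invented.

```python
from rdflib import Graph, Namespace, URIRef
from rdflib.namespace import RDF

OA = Namespace("http://www.w3.org/ns/oa#")
EX = Namespace("http://example.org/")

g = Graph()
g.bind("oa", OA)

# An annotation with one body (a comment page) and one target (the
# resource being annotated); both are dereferenceable URI resources.
anno = EX["anno/1"]
g.add((anno, RDF.type, OA.Annotation))
g.add((anno, OA.hasBody, URIRef("http://example.org/comments/7")))
g.add((anno, OA.hasTarget, URIRef("http://example.org/page.html")))

print(g.serialize(format="turtle"))
```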

Clerezza

Clerezza is an OSGi service approach to building semantically driven web applications. It comes with a rich set of integration points and features that make it well suited to building modular services, and there is even a conscious effort toward security management through WebID, something often lacking in other frameworks. As most aspects of semantic web processing are layered through a workflow, building as bundles is often more efficient and allows seamless integration of components. Such bundles can provide features such as RDF/JSON formats for building semantic applications, using standard open technologies such as Jersey, Felix, Jena, Jetty, and even jQuery, and the approach can even serve as a platform with specific compile-time and runtime requirements. Content management systems have many moving parts for working with content; the semantic web not only makes content more accessible, but using Clerezza can also ease the implementation.

There are two aspects to the Clerezza project: semantic web application development, and RDF storage and manipulation. The core of Clerezza is engineered in Scala and provides renderlets, defined as part of ScalaServerPages, for creating various representations. The approach follows the W3C RDF specification, and triples are stored using smart content binding, a versatile, technology-agnostic layer providing both access and modification. Smart content binding also makes use of named graphs to facilitate operations on the data model, with options to access multiple domain graphs, and various adaptors are available for processing RDF graphs. Lastly, it provides serialization and parsing services for various conversions and representations. Although the project aims for a very seamless approach, one of its core drawbacks has been the lack of documentation, which makes both the stack and its implementation use cases difficult to understand. Some efforts have been made toward improvement in this area, and the project is under active development. One very interesting integration convention is between UIMA and Clerezza for textual annotations using the Annotations Ontology; one can refer further to the Domeo Annotations Toolkit paper or the slideshare.

26 March 2014

Semantic Annotations

Semantic annotation is a broad and complex area, often requiring a mixture of natural language processing and knowledge representation. One of the major inherent requirements in an application is word sense disambiguation. There are also more lightweight approaches that generalize over the semantics alone, in the form of ontologies, especially for maintaining publications and cataloging. Such semantics can cater for text as well as multimedia. What this enables is that semantic labels can be constructed in context, supporting findability, better visualization, reasoning over a set of web resources, and the conversion from syntactic structures to knowledge structures.

One can approach this manually or in an automated fashion. The manual route typically transforms syntactic resources into interlinked knowledge using third-party tools, without accounting for much in the way of multiple perspectives across data sources. There are also semi-automated annotation approaches, though they still require human intervention at various phases of the process; GATE is one such semi-automated tool for extracting entity sets. Automated approaches usually require tuning and re-tuning after training: they can draw their knowledge from the web and apply it to content in a context-driven manner for automatic extraction and annotation. Wrappers are created that identify and recognize patterns in text for annotation, at times human assisted, and they may use various classifiers as a supervised way of learning patterns.

For multimedia, annotation often takes the form of rich metadata; alternatively, it can address content semantics or go granular into the media itself. Annotations can be global, collaborative, or even local, and one can provide rich annotations using custom metadata defined through controlled vocabularies, taxonomies, ontologies, topic maps, and thesauri for different contexts. There is even a W3C effort for open annotations, as well as the LRMI learning resources initiative based on schema.org. One could also build a pipeline through the various workflow stages of content filtering using UIMA, or take a CMS approach similar to Apache Stanbol. Standard tools like Tika, Solr, OpenNLP, and KEA can also be useful. Implementations and rich textual semantics often use languages like Java, Groovy, Python, XML, RDF, and OWL, though tools are increasingly emerging on Scala as well.
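
As a toy illustration of the wrapper idea, and not a stand-in for any of the tools named above, here is a sketch of a pattern-based annotator in Python; the gazetteer and labels are invented, and real systems would generalize this with trained classifiers.

```python
import re

# A toy gazetteer-plus-pattern "wrapper": real pipelines such as GATE or
# UIMA generalize this idea with learned extraction rules.
GAZETTEER = {"London": "Location", "GATE": "Tool"}
YEAR = re.compile(r"\b(19|20)\d{2}\b")

def annotate(text):
    annotations = []
    for token, label in GAZETTEER.items():
        for match in re.finditer(re.escape(token), text):
            annotations.append((match.start(), match.end(), label))
    for match in YEAR.finditer(text):
        annotations.append((match.start(), match.end(), "Year"))
    return sorted(annotations)

text = "GATE was developed in Sheffield and demonstrated in London in 2014."
for start, end, label in annotate(text):
    print(f"{label}: {text[start:end]} [{start}:{end}]")
```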