24 July 2020
RDF* & SPARQL*
5 March 2018
Types of RDF Storage
- Native
  - Main memory-based
  - Disk-based
- RDBMS
  - Schema-based
    - Vertical partitioning
    - Hierarchical property table
    - Property table
  - Schema-free
    - Triple table
- NoSQL
  - Key-value
  - Column family
  - Document store
  - Graph database
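To make the RDBMS layouts above concrete, here is a minimal sketch in Python with SQLite (the table and column names are illustrative, not drawn from any particular store), contrasting a schema-free triple table with vertical partitioning, where each predicate gets its own two-column table:

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Schema-free layout: one wide triple table holding every statement.
con.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
con.execute("INSERT INTO triples VALUES ('ex:alice', 'foaf:name', 'Alice')")
con.execute("INSERT INTO triples VALUES ('ex:alice', 'foaf:knows', 'ex:bob')")

# Vertical partitioning: one two-column table per predicate,
# which narrows scans for queries touching a single property.
con.execute("CREATE TABLE foaf_name (s TEXT, o TEXT)")
con.execute("CREATE TABLE foaf_knows (s TEXT, o TEXT)")
con.execute("INSERT INTO foaf_name VALUES ('ex:alice', 'Alice')")
con.execute("INSERT INTO foaf_knows VALUES ('ex:alice', 'ex:bob')")

# The same lookup against both layouts.
print(con.execute(
    "SELECT o FROM triples WHERE s = 'ex:alice' AND p = 'foaf:name'"
).fetchall())
print(con.execute("SELECT o FROM foaf_name WHERE s = 'ex:alice'").fetchall())
```

The triple table stays generic at the cost of self-joins for multi-property queries, while vertical partitioning narrows scans but requires schema changes as new predicates appear.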
11 December 2017
Relational to Semantic Mappings
Labels: big data, data science, databases, intelligent web, linked data, rdf, semantic web, sparql
2 December 2017
The Accidental Taxonomist
Labels: big data, data science, intelligent web, linked data, nosql, rdf, semantic web, sparql
Smart Data Lakes
Labels: big data, data science, intelligent web, linked data, natural language processing, nosql, rdf, semantic web, sparql, text analytics
20 April 2017
Linked Data Patterns
Labels: big data, Cloud, data science, distributed systems, intelligent web, linked data, rdf, semantic web, sparql
Semantic Web Meetup Course

A Semantic Web London meetup course by metadataconsulting:

- Introduction to Semantic Web standards and Linked Data technologies
- Resource Description Framework (RDF)
  - Graph-based data model representation and core concepts
  - Terse RDF Triple Language (Turtle)
  - Advanced RDF features
  - Best practices for publishing RDF data
- RDF Schema (RDFS)
  - Discussion of the added value of a schema, driven by examples
  - Syntax of the core features: classes, properties, and their characteristics
  - Relationships between RDFS vocabulary elements
  - Computing answers to typical queries over RDFS datasets
  - Using Protege for modeling and querying RDFS datasets
  - Limitations of RDFS
- Querying the Semantic Web with SPARQL (see the sketch after this outline)
  - Core concepts
  - Basic graph patterns
  - Querying datasets with the SPARQL engine StarDog
  - Filters and SPARQL expressions
  - Property path expressions
  - Complex graph patterns with advanced features such as optional parts, aggregation, and ordering
  - Other query types
  - Updating with SPARQL
- OWL Web Ontology Language
  - Core concepts and differences from RDFS
  - Overview of OWL modeling constructs
  - Modeling and assessing the benefits of alternative models in a particular application context
  - Substitutability of modeling constructs
  - Discussion of the trade-off between the expressivity of modeling languages and the computational efficiency of querying
  - OWL profiles
  - Limitations of the expressive power of OWL
- Applications of Semantic Technologies in Practice
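To make the SPARQL portion of the outline concrete, here is a minimal, self-contained sketch using Python's rdflib rather than StarDog (the data and names are invented for illustration). It runs a basic graph pattern with an OPTIONAL part, a property path, a FILTER, and ordering over a tiny in-memory dataset:

```python
from rdflib import Graph

ttl = """
@prefix ex:   <http://example.org/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

ex:alice foaf:name "Alice" ; foaf:age 30 ; foaf:knows ex:bob .
ex:bob   foaf:name "Bob"   ; foaf:age 25 .
"""

g = Graph()
g.parse(data=ttl, format="turtle")

# Basic graph pattern with an OPTIONAL part, a property path,
# a FILTER, and ordering, matching the topics listed above.
query = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?friendName
WHERE {
  ?person foaf:name ?name ; foaf:age ?age .
  OPTIONAL { ?person foaf:knows/foaf:name ?friendName . }
  FILTER (?age > 20)
}
ORDER BY ?name
"""

for row in g.query(query):
    print(row.name, row.friendName)
```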
Labels: big data, data science, linked data, natural language processing, rdf, semantic web, sparql
22 February 2017
Outstanding Ontologies
There are different types of ontologies, ranging from knowledge representation and top-level ontologies to linguistic and domain ontologies. A selection of examples from each type is provided below.
Knowledge Representation Ontologies:
Frame Ontology
OKBC
Top-Level Ontologies:
Cyc
SOWA
Standard Upper Ontology
Linguistic Ontologies:
WordNet
Generalized Upper Model
SENSUS
EuroWordNet
Mikrokosmos
Ecommerce Ontologies (Domain Ontology):
United Nations Standard Products and Services Code (UNSPSC)
North American Industry Classification System
Standard Classification of Transported Goods
E-Cl@ss
RosettaNet
Medical Ontologies (Domain Ontology):
GALEN
UMLS
ON9
Engineering Ontologies (Domain Ontology):
EngMath
PhysSys
Enterprise Ontologies (Domain Ontology):
Enterprise Ontology
TOVE
Chemistry Ontologies (Domain Ontology):
Chemicals
Ions
Environmental Pollutants
Knowledge Management Ontologies (Domain Ontology):
KA Ontology - Project, Organization, Person, Publication, Event, Research-Topic, Research-Product
Nature.com Subjects Ontologies
5 September 2016
SKOS
SKOS (Simple Knowledge Organization System) is a very common data model for representing knowledge in the form of thesauri or controlled vocabularies, which can be interlinked into knowledge graphs as a form of linked data. SKOS is itself a lightweight, flexible vocabulary defined as an OWL ontology and available in the various RDF syntaxes, whereas OWL is a full ontology language; it is possible to convert from SKOS to OWL and even to combine the two. The links below provide some related tools and libraries for working with SKOS models, followed by a minimal modeling sketch.
JSKOS
SKOS API
OWL API
SKOSEd
OpenSKOS
TemaTres
THManager
PoolParty
TopBraid
Thesaurus Master
Lexaurus
Fluent Editor
Intelligent Topic Manager
SKOS2OWL
Protege
Skosify
PoolParty Consistency Checker
KEA
Skosmos
Silk
W3C SKOS
SKOS: A Guide for Information Professionals
SKOS Taxonomy
The Accidental Taxonomist
Knowledge Engineering with Semantic Web Technologies
Linked Data Engineering
PoolParty Academy
GATE
Ontotext
Knowledge Extraction
Taxonomy Warehouse
Synaptica
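As the minimal modeling sketch promised above (the concept scheme and concepts are invented for illustration), the following builds a tiny SKOS hierarchy with Python's rdflib, which ships a SKOS namespace, and prints it as Turtle (assuming rdflib 6+, where serialize returns a string):

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, SKOS

EX = Namespace("http://example.org/vocab/")

g = Graph()
g.bind("skos", SKOS)
g.bind("ex", EX)

# A concept scheme with two concepts linked by a broader/narrower hierarchy.
g.add((EX.animals, RDF.type, SKOS.ConceptScheme))
g.add((EX.mammal, RDF.type, SKOS.Concept))
g.add((EX.mammal, SKOS.prefLabel, Literal("Mammal", lang="en")))
g.add((EX.mammal, SKOS.inScheme, EX.animals))
g.add((EX.dog, RDF.type, SKOS.Concept))
g.add((EX.dog, SKOS.prefLabel, Literal("Dog", lang="en")))
g.add((EX.dog, SKOS.altLabel, Literal("Canine", lang="en")))
g.add((EX.dog, SKOS.broader, EX.mammal))
g.add((EX.dog, SKOS.inScheme, EX.animals))

print(g.serialize(format="turtle"))
```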
Labels: data science, linked data, metadata, natural language processing, rdf, semantic web, sparql, text analytics
17 May 2016
Graph Comparison
Analytical

Type | Backend | Supported Frameworks | Context of Use
---|---|---|---
Giraph | Hadoop/HDFS | Spark/Hadoop | Data Processing for Analytics
GraphX | Titan, Neo4J, HDFS | Spark | Data Processing for Analytics (in-memory)
GraphLab | Hadoop/HDFS | Spark/Hadoop | Data Processing for Analytics, using PowerGraph and GAS models

Operational

Type | Backend | Supported Frameworks | Context of Use
---|---|---|---
Cayley | MongoDB or LevelDB | Custom implementation in Go | Knowledge Graph
Titan | Cassandra, HBase, HDFS | Tinkerpop & RDF SPARQL | Massive Knowledge Graphs, OLAP/OLTP (now part of DataStax)
Neo4J | Custom | Tinkerpop | Data Visualization, Web Browsing, Portfolio Analytics, Gene Sequencing, Mobile Social Applications
OrientDB | Custom | Tinkerpop & RDF SPARQL | Embedded and Standalone, Knowledge Graph, Multi-model (Document + Graph)

Semantic

Type | Backend | Supported Frameworks | Context of Use
---|---|---|---
Blazegraph and MapGraph | Custom | Sesame RDF SPARQL, Tinkerpop | Massive Knowledge Graphs on GPU, with support for W3C Semantic Web standards (used by Wikidata, a Wikimedia project)
Stardog | Custom | RDF SPARQL | Semantic data use cases in the cloud (third-party)
OntoText GraphDB | Custom | Sesame, Jena, RDF SPARQL | Optimized as a semantic graph database built on W3C Semantic Web standards (used by the BBC, Euromoney, the Financial Times, etc.)
Virtuoso | Custom/Hybrid | Sesame, Jena, RDF SPARQL | Optimized as a semantic graph database built on W3C Semantic Web standards (used by DBpedia)
Allegrograph | Custom | Sesame RDF SPARQL | Optimized as a semantic graph database built on W3C Semantic Web standards
OpenCog | Custom | Semantic Knowledge | Massive Artificial General Intelligence graph knowledge base
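Several of the semantic stores above expose public SPARQL endpoints over HTTP. As a sketch of querying one of them (assuming the third-party SPARQLWrapper package and DBpedia's public Virtuoso endpoint; availability and rate limits vary):

```python
from SPARQLWrapper import SPARQLWrapper, JSON

# DBpedia's public endpoint is served by Virtuoso (see the table above).
endpoint = SPARQLWrapper("https://dbpedia.org/sparql")
endpoint.setReturnFormat(JSON)
endpoint.setQuery("""
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?label WHERE {
      <http://dbpedia.org/resource/Semantic_Web> rdfs:label ?label .
      FILTER (lang(?label) = "en")
    }
""")

results = endpoint.query().convert()
for binding in results["results"]["bindings"]:
    print(binding["label"]["value"])
```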
OLTP/Graph Databases
OLTP/Analytical Databases
Graph Database as a Service
Native Semantic Graph Databases
Graph Query / Interfaces
8 November 2014
Semantic Representation
Representing semantic data is computationally expensive: building semantically contextual graphs embeds a lot of metadata, and that representation carries both storage and processing costs. RDF/XML has long been the most complete representation option, on the basis of which other standards have been developed. The introduction of JSON-LD provides further flexibility, but flexible semantic data processing can come at the cost of lost fidelity, which may arise during content negotiation and conversion. JSON-LD may be a plausible option for exchange, but storing RDF in its native, XML-compatible form remains preferable. RDF is, however, a memory-intensive representation format with its own processing requirements, and even viewing RDF from a property-graph perspective may not be sufficient. Triple stores, and even quad stores, remain the best option today, although such options can still pose vendor lock-in issues. Although RDF and the Semantic Web have come a long way, there is still much to be done, both in terms of standardization and in better distributed semantic graph storage; semantic integration, a core aspect of Linked Data requirements, is another area needing more standardization and advancement. JSON-LD appears to be a useful option for lightweight front-end client processing, yet it has some fundamental limitations in comparison to RDF. A question arises as to why the W3C gave up on the idea of RDF/JSON standardization; ultimately this is a question of what matters more to the Semantic Web community and for a given application context: whether the representation should be computer-readable or human-readable. Nonetheless, the core representation format of the Semantic Web for storage, in most domain contexts, should really be maintained in the native form of RDF/XML and its associated derivatives.
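The trade-off between RDF/XML and JSON-LD discussed above is easiest to see by serializing the same graph both ways. A minimal sketch with Python's rdflib (assuming rdflib 6+, where the JSON-LD serializer is built in and serialize returns a string; the triples are invented for illustration):

```python
from rdflib import Graph

ttl = """
@prefix ex:   <http://example.org/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
ex:alice foaf:name "Alice" ; foaf:knows ex:bob .
"""

g = Graph()
g.parse(data=ttl, format="turtle")

# The same triples rendered in the two serializations discussed above:
# RDF/XML for native storage, JSON-LD for lightweight client processing.
print(g.serialize(format="pretty-xml"))
print(g.serialize(format="json-ld", indent=2))
```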
Labels: big data, data science, intelligent web, linked data, nosql, rdf, semantic web, sparql
6 November 2014
Metadata Standards
Library and book publishing metadata standards have come a long way, yet they remain in a state of flux and evolution as cataloging and publishing take on emerging new forms in pursuit of further standardization and universal interpretation. Data science and data mining are also providing new ways of harnessing information and knowledge about the classification of both data and content. Metadata remains the key to differentiating and exposing data through all its transformations. XML is often seen as the mainstream format for most metadata standards, but JSON and RDF have also emerged as strong contenders for more flexible and universal standard formats. Metadata is categorized into three fundamental types: administrative, descriptive, and structural. The following handbook provides further details on book publishing structures and evolving metadata trends.
Labels: data science, linked data, metadata, nosql, publishing, rdf, semantic web, sparql, text analytics
5 October 2014
Semantic Certifications
Getting certified in a particular technology is a debatable topic. For some employers it is a plus point; for others it holds very little value, and certifications can become outdated very quickly. Perhaps getting certified in the concepts matters more than a technology certification tied to a specific version. Semantic Web technologies move slowly, as they go through an extensive specification-driven process, and they are rarely taught formally at university or through certifications. Yet the Semantic Web is growing in popularity as industry sees remarkable benefits in contextualizing data and information on the web, across a wide variety of use cases. The Semsphere certifications are one unique starting point, providing solid grounding in the area with a rigorous exam. Two levels, Specialist and Professional, will interest most developers; the third level is primarily for trainers. Together, the two levels broadly cover the core areas of interest and technologies in the Semantic Web.
23 June 2014
EAV vs SPO
Knowledge representation has been around for a long time, within database architecture but also within abstractions of domain logic, especially in analytics. Various formalisms have been defined for representing such knowledge in order to derive a more structured logic for learning and reasoning from data. Databases have historically provided the persistence layer for many applications and usually hold the key to unlocking much of today's data; good data representation allows for versatility, performance, and access to hidden knowledge. Entity-Attribute-Value (EAV) and Subject-Predicate-Object (SPO) are similar modeling approaches: in both cases one works with relationships as 3-tuples. EAV, however, is effectively a subset of SPO. Often, people who are unfamiliar with SPO and more comfortable with relational model design end up using EAV as an extension of it, yet in most cases EAV is considered an anti-pattern, leading to longer development times, poor utilization of data, and undesirably complex queries. In SPO, everything is treated as a resource and extended into a graph representation, which also allows better reasoning capabilities for inference. SPO lends itself very well to the architecture of the web, utilizing URI schemes in a linked context as an extension of the RESTful approach of standard HTTP methods. EAV tries to build richer metadata semantics through a schema taxonomy, as an extension of the relational model, whereas the SPO approach is closer to ontologies, in the form of extensible linked data schemas for a domain context, which are not only web-friendly but also machine-readable. SPO is also linguistically inspired by natural language.
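A minimal sketch of the same facts in both shapes (all names are illustrative): an EAV table in SQLite next to the equivalent SPO triples in Python's rdflib, where the attribute becomes a dereferenceable predicate URI rather than an opaque string.

```python
import sqlite3
from rdflib import Graph, Literal, Namespace

# EAV: attributes are opaque strings in a relational table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE eav (entity TEXT, attribute TEXT, value TEXT)")
con.execute("INSERT INTO eav VALUES ('patient-1', 'blood_pressure', '120/80')")
con.execute("INSERT INTO eav VALUES ('patient-1', 'diagnosis', 'hypertension')")
print(con.execute("SELECT * FROM eav").fetchall())

# SPO: the same statements as RDF triples, where subject and predicate
# are URIs that can be dereferenced and linked across datasets.
EX = Namespace("http://example.org/")
g = Graph()
g.bind("ex", EX)
g.add((EX["patient-1"], EX.bloodPressure, Literal("120/80")))
g.add((EX["patient-1"], EX.diagnosis, EX.hypertension))
print(g.serialize(format="turtle"))
```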
eav and spo
eav/cr model
rdf
SPO
considerations for eav
considerations for modelling eav for biomedical databases
magento eav
eav talk
understanding linked data via eav model
Labels: artificial intelligence, big data, intelligent web, linked data, natural language processing, rdf, semantic web, sparql, text analytics
14 May 2014
Open Annotations
The Open Annotation Community is an interesting collaborative group defining standards and specifications for interoperable, extensible annotations that can be shared across multiple application, device, and service domains. The open approach aims to maximize accessibility, with unfettered access and room for the addition of new techniques. At the same time, it is compatible with standard publish/subscribe models, although it does not define a specific protocol for such interactions. The effort is designed to work with the simplicity of the distributed architecture of the web. As a semantic web standard, an annotation is approached from the viewpoint of an RDF graph serialization: the design stipulates a body and one or more targets, each of which can be defined as a URI resource, although some annotations may not use a body. Each resource then has distinctive metadata and provenance information, with any relevant media type that can be dereferenced, and additional representations can be defined or resolved via content negotiation as resources change. Extensive use cases are available for open annotations, and the community is very active, with a draft specification in place for the data model.
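As a minimal sketch of the body-and-target structure described above, the following builds one annotation with Python's rdflib using the W3C oa: vocabulary (the body and target URIs are invented for illustration):

```python
from rdflib import Graph, Namespace, URIRef
from rdflib.namespace import RDF

OA = Namespace("http://www.w3.org/ns/oa#")
EX = Namespace("http://example.org/")

g = Graph()
g.bind("oa", OA)

anno = EX.anno1
g.add((anno, RDF.type, OA.Annotation))
# The body carries the comment; the target is the resource being annotated.
g.add((anno, OA.hasBody, EX.comment1))
g.add((anno, OA.hasTarget, URIRef("http://example.org/page.html")))

print(g.serialize(format="turtle"))
```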
Labels: linked data, natural language processing, publishing, rdf, semantic web, sparql, text analytics
Clerezza
Clerezza is an OSGi service approach to building semantically driven web applications. It comes with a rich set of integration points and features that make it well suited to building modular services. There is even a conscious effort toward security management through WebID, which some frameworks lack. As most aspects of semantic web processing are layered through a workflow, building with bundles is often more efficient and useful for seamless integration of components. Such bundles can provide features like RDF/JSON formats for building semantic applications, using standard open technologies such as Jersey, Felix, Jena, Jetty, and even jQuery, and the approach can even serve as a platform with specific compile-time and runtime requirements.

Content management systems have multiple parts for working with content; the Semantic Web not only makes content more accessible, but Clerezza can also ease the implementation. There are two aspects to the Clerezza project: semantic web application development, and RDF storage and manipulation. The core of Clerezza is engineered in Scala and provides renderlets, defined as part of ScalaServerPages, for creating various representations. The approach follows the W3C RDF specification, and triples are stored using smart content binding, a versatile, technology-agnostic layer providing both access and modification. Smart content binding also makes use of named graphs to facilitate operations on the data model, with options to access multiple domain graphs; various adaptors are available for processing RDF graphs, and it provides serialization and parsing services for conversions and representations.

Although the project does try to provide a seamless approach, one of its core drawbacks has been the lack of documentation, which makes the stack and its implementation use cases difficult to understand. There have been some efforts toward improvement in this area, and the project is actively in development. One very interesting integration convention is between UIMA and Clerezza for textual annotations using the Annotation Ontology; one can refer further to the Domeo Annotation Toolkit paper or the slideshare.
Labels: intelligent web, Java, linked data, natural language processing, rdf, scala, semantic web, sparql, text analytics
26 March 2014
Semantic Annotations
Semantic annotation is a broad and complex area, often requiring a mixture of natural language processing and knowledge representation. One major inherent requirement in an application is word sense disambiguation. There are also more lightweight approaches that generalize over the semantics alone, in the form of ontologies, especially for maintaining publications and cataloging. Such semantics can cater to text as well as multimedia. What this enables is that semantic labels can be constructed in context and used for findability, better visualization, and reasoning over a set of web resources, allowing the conversion from syntactic structures to knowledge structures.

One can approach this manually or in an automated fashion. The manual step typically transforms syntactic resources into interlinked knowledge, without taking much account of the multiple perspectives of the data sources, and is applied using third-party tools. There is also the semi-automated approach, which still requires human intervention at various phases of the process; GATE is one such semi-automated tool for extracting entity sets. Automated approaches usually require tuning and re-tuning after training: they can draw their knowledge from the web and apply it to content in a context-driven manner for automatic extraction and annotation. Wrappers are created that can identify and recognize patterns in text for annotation, sometimes human-assisted, and they may use various classifiers as a supervised way of learning patterns.

For multimedia, annotation often takes the form of rich metadata; alternatively, it can address content semantics or go granular into the media itself. Annotations can be global, collaborative, and even local. One can extend and provide rich annotations using custom metadata defined through controlled vocabularies, taxonomies, ontologies, topic maps, and thesauri for different contexts. There is even a W3C effort for open annotations, as well as the LRMI learning-resources initiative based on schema.org. One could also build a pipeline through the various workflow stages of a content-filtering process using UIMA, or take a CMS approach similar to Apache Stanbol. Standard tools like Tika, Solr, OpenNLP, and KEA can also be useful. Languages like Java, Groovy, Python, XML, RDF, and OWL are often used for implementations and rich textual semantics; increasingly, however, tools are emerging on Scala as well.
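A toy illustration of the syntactic-to-knowledge-structure step described above (the gazetteer and annotation vocabulary are invented; real pipelines would use GATE, UIMA, or similar): a dictionary lookup spots entity mentions in text and emits RDF triples linking each span to a concept URI.

```python
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF

EX = Namespace("http://example.org/annotation/")

# Toy gazetteer mapping surface forms to concept URIs.
gazetteer = {
    "London": URIRef("http://dbpedia.org/resource/London"),
    "Thames": URIRef("http://dbpedia.org/resource/River_Thames"),
}

text = "The Thames flows through London."

g = Graph()
g.bind("ex", EX)
for i, (mention, concept) in enumerate(gazetteer.items()):
    offset = text.find(mention)
    if offset == -1:
        continue
    # One annotation node per spotted mention, recording span and concept.
    ann = EX[f"a{i}"]
    g.add((ann, RDF.type, EX.Annotation))
    g.add((ann, EX.exactMatch, Literal(mention)))
    g.add((ann, EX.beginOffset, Literal(offset)))
    g.add((ann, EX.references, concept))

print(g.serialize(format="turtle"))
```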
Labels: intelligent web, linked data, natural language processing, rdf, semantic web, sparql, text analytics