18 September 2025

Redundancy of Taxonomists and Ontologists

The proliferation of data-driven initiatives has led many organizations to consider specialized roles for managing complex knowledge structures, such as taxonomists and ontologists. However, for organizations already proficient in relational database design, the creation of these separate roles is often an unnecessary and inefficient overhead. The core competencies required for building knowledge graphs are not entirely new; rather, they represent a conceptual extension of existing data modeling skills, supplemented by an understanding of a few key standards and tools.

At its heart, a knowledge graph is a data model. The established practice of designing relational databases, which relies on Entity-Relationship (E-R) and UML diagrams, provides an excellent foundation. In this context, an entity in a traditional E-R diagram is conceptually analogous to a class in an ontology, and the relationships between entities correspond closely to the properties that define connections in a graph. Normalization and schema design for a relational database are forms of data abstraction aimed at reducing redundancy and ensuring consistency, goals that ontology engineering largely shares. A skilled data modeler is therefore already engaged in the intellectual work of identifying key concepts and their interdependencies, which lays the conceptual groundwork for a semantic model.
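
The correspondence described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a standard algorithm: the `Table` structure, the `schema_to_ontology` function, the `http://example.org/schema#` namespace, and the `Table_column` naming scheme are all hypothetical. Only the RDF, RDFS, and OWL term IRIs are real. Each table becomes a class, each plain column a datatype property, and each foreign key an object property pointing at the referenced class.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical namespace used only for this illustration.
EX = "http://example.org/schema#"

# Real W3C vocabulary IRIs.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_DOMAIN = "http://www.w3.org/2000/01/rdf-schema#domain"
RDFS_RANGE = "http://www.w3.org/2000/01/rdf-schema#range"
OWL_CLASS = "http://www.w3.org/2002/07/owl#Class"
OWL_OBJECT_PROPERTY = "http://www.w3.org/2002/07/owl#ObjectProperty"
OWL_DATATYPE_PROPERTY = "http://www.w3.org/2002/07/owl#DatatypeProperty"


@dataclass
class Table:
    """A toy stand-in for one entity in an E-R diagram (hypothetical)."""
    name: str
    columns: List[str]
    # Maps a foreign-key column to the table it references.
    foreign_keys: Dict[str, str] = field(default_factory=dict)


def schema_to_ontology(tables):
    """Return a set of (subject, predicate, object) triples for the schema."""
    triples = set()
    for table in tables:
        cls = EX + table.name
        triples.add((cls, RDF_TYPE, OWL_CLASS))  # entity -> class
        for column in table.columns:
            prop = EX + f"{table.name}_{column}"
            triples.add((prop, RDFS_DOMAIN, cls))
            if column in table.foreign_keys:
                # relationship -> object property with a class as range
                triples.add((prop, RDF_TYPE, OWL_OBJECT_PROPERTY))
                triples.add((prop, RDFS_RANGE, EX + table.foreign_keys[column]))
            else:
                # attribute -> datatype property
                triples.add((prop, RDF_TYPE, OWL_DATATYPE_PROPERTY))
    return triples


tables = [
    Table("Customer", ["id", "name"]),
    Table("Order", ["id", "customer_id"], {"customer_id": "Customer"}),
]
ontology = schema_to_ontology(tables)
```

A real ontology would go further than this mechanical translation (for instance, by naming properties after their meaning rather than their column), but the starting point is the same schema a data modeler already maintains.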

The primary conceptual hurdle is the shift from the Closed World Assumption (CWA) to the Open World Assumption (OWA), but this is a change in perspective rather than a wholly new discipline. Relational databases operate under CWA, assuming that any fact not explicitly stored in the database is false. Semantic technologies, however, embrace OWA, which treats an unstated fact as simply unknown. For example, if no row links a customer to an order, a relational query concludes that the customer has no orders, whereas an open-world reasoner concludes only that none are known. An existing data team can be educated on this distinction, which is less about learning a new field and more about adopting a different logical framework. In practice, this means reinterpreting the existing closed-world model under open-world semantics, a task well within the grasp of data professionals. Coupled with an understanding of W3C standards like RDF and OWL, this conceptual bridge equips a data modeler to do foundational ontology work.
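
The difference between the two assumptions can be made concrete with a toy fact store. The sketch below is illustrative only: the `Truth` enum, the two `ask_*` functions, and the sample facts are hypothetical names invented for this example, and a real reasoner does far more than set lookup. What it shows is that the same question about the same stored facts gets a different answer depending on how absence is interpreted.

```python
from enum import Enum


class Truth(Enum):
    TRUE = "true"
    FALSE = "false"
    UNKNOWN = "unknown"


# Facts that are explicitly asserted (hypothetical sample data).
facts = {("alice", "worksFor", "acme")}

# Facts explicitly asserted to be false. In an open world, falsity
# has to be stated or derived; it cannot be inferred from absence.
negated_facts = {("alice", "worksFor", "globex")}


def ask_closed_world(fact, facts):
    """CWA: anything not stored is treated as false."""
    return Truth.TRUE if fact in facts else Truth.FALSE


def ask_open_world(fact, facts, negated_facts):
    """OWA: anything neither asserted nor denied is unknown."""
    if fact in facts:
        return Truth.TRUE
    if fact in negated_facts:
        return Truth.FALSE
    return Truth.UNKNOWN


question = ("alice", "worksFor", "initech")
closed_answer = ask_closed_world(question, facts)
open_answer = ask_open_world(question, facts, negated_facts)
```

Here `closed_answer` is `Truth.FALSE` while `open_answer` is `Truth.UNKNOWN`, which is the whole distinction a data team needs to internalize before modeling for the open world.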

Furthermore, the practical conversion from a relational schema to a knowledge graph can be largely automated with modern tools. Technologies such as R2RML, the W3C mapping language for relational databases, and RML, which generalizes it to other source formats such as CSV, JSON, and XML, allow developers to declaratively map data from relational tables into a graph format (RDF). Following this, a validation language like SHACL can be used to check that the graph's structure and data conform to declared constraints derived from the ontology. The skill required here is not a deep, specialized knowledge of linguistics or philosophy, but a technical aptitude for understanding schemas, writing mapping rules, and using software tools. This positions the task squarely within the purview of a data engineer or architect, whose expertise in data pipelines and system architecture is far more valuable than a specialized, single-purpose role.
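
The map-then-validate pipeline described above can be approximated in plain Python to show its shape. This is a hand-rolled stand-in, not R2RML, RML, or SHACL syntax: the `mapping` and `shape` dictionaries, the `rows_to_triples` and `validate` functions, and the `http://example.org/` namespace are all assumptions made for illustration. The mapping mirrors the idea of a subject template plus column-to-predicate rules, and the shape mirrors a minimum-cardinality constraint on a target class.

```python
EX = "http://example.org/"  # hypothetical namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Declarative mapping: how one table's rows become triples.
mapping = {
    "subject_template": EX + "customer/{id}",
    "class": EX + "Customer",
    "predicates": {"name": EX + "name", "email": EX + "email"},
}

# Declarative constraint: every Customer needs at least one name.
shape = {"target_class": EX + "Customer", "required": [EX + "name"]}


def rows_to_triples(rows, mapping):
    """Apply the mapping to each row, skipping NULL (None) values."""
    triples = []
    for row in rows:
        subject = mapping["subject_template"].format(**row)
        triples.append((subject, RDF_TYPE, mapping["class"]))
        for column, predicate in mapping["predicates"].items():
            value = row.get(column)
            if value is not None:  # a NULL column yields no triple
                triples.append((subject, predicate, value))
    return triples


def validate(triples, shape):
    """Return (subject, missing_predicate) pairs that violate the shape."""
    targets = [s for s, p, o in triples
               if p == RDF_TYPE and o == shape["target_class"]]
    violations = []
    for subject in targets:
        for predicate in shape["required"]:
            present = any(s == subject and p == predicate
                          for s, p, _ in triples)
            if not present:
                violations.append((subject, predicate))
    return violations


rows = [
    {"id": 1, "name": "Ada", "email": "ada@example.org"},
    {"id": 2, "name": None, "email": "bob@example.org"},
]
triples = rows_to_triples(rows, mapping)
violations = validate(triples, shape)
```

With the sample rows, the second customer has no name, so `violations` contains one entry for `http://example.org/customer/2`. In a production setting the mapping would be written in R2RML or RML and the constraint in SHACL, with an off-the-shelf engine doing the work of both functions; the point of the sketch is that the engineer's contribution is the declarative rules, not bespoke code.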

For most enterprises, the intellectual and technical prerequisites for building and maintaining knowledge graphs are already present within their data teams. The core skills lie not in a niche specialization but in a deep understanding of the business and its data. By training their existing staff in a few critical concepts, such as the difference between closed and open world assumptions, and by providing access to declarative mapping and validation tools, organizations can build on the expertise they already have. This approach not only saves the cost and effort of hiring highly specialized personnel but also integrates knowledge graph initiatives into the broader data strategy, ensuring they are not treated as isolated projects.