26 April 2021
Five Phases of an AI Project
- Definition and Hypothesis (frame the business problem and identify value targets)
- Data Acquisition and Exploration
- Model Building, Pipeline, and Evaluation
- Interpretation and Communication
- Automation and Deployment Operations
8 April 2021
1 April 2021
31 March 2021
Three Approaches to Word Similarity Measures
- Geometric/spatial: evaluates the relative positions of two words in a semantic space defined by their context vectors
- Set-based: relies on analysis of the overlap between the sets of contexts in which the words occur
- Probabilistic: uses probabilistic models and measures as proposed in information theory
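The three families can be illustrated with a toy sketch (the words, context terms, counts, and probabilities below are all hypothetical): cosine similarity for the geometric view, Jaccard overlap for the set-based view, and pointwise mutual information for the probabilistic view.

```python
import math

# Hypothetical context-count vectors for two words over shared context terms.
contexts = ["drink", "hot", "cup", "bean", "leaf"]
coffee = {"drink": 5, "hot": 4, "cup": 6, "bean": 3}
tea    = {"drink": 4, "hot": 5, "cup": 5, "leaf": 2}

# 1. Geometric/spatial: cosine of the angle between the context vectors.
def cosine(a, b):
    dot = sum(a.get(c, 0) * b.get(c, 0) for c in contexts)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb)

# 2. Set-based: Jaccard overlap of the sets of contexts each word occurs in.
def jaccard(a, b):
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb)

# 3. Probabilistic: pointwise mutual information, an information-theoretic
#    measure of how much more often x and y co-occur than chance predicts.
def pmi(p_xy, p_x, p_y):
    return math.log2(p_xy / (p_x * p_y))

print(cosine(coffee, tea))
print(jaccard(coffee, tea))
print(pmi(0.10, 0.25, 0.30))  # hypothetical probabilities
```

In practice the context vectors come from a corpus co-occurrence matrix rather than hand-written counts, but the three measures operate on it in exactly these three ways.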
23 March 2021
17 March 2021
TDD and BDD for Ontology Modelling
The end goal of most ontologies is a semantic representation, at a specific level of generalization, that follows the open-world assumption. Such models, however, can be built in different ways. One approach is to apply test-driven development (TDD) and behavior-driven development (BDD) techniques to building domain-level ontologies, with constraint-based testing applied as part of the process. The process steps are elaborated below.
- Create a series of high-level question-answering requirements, defined in the form of specification by example
- Create SHACL/ShEx tests, granular to individual specification examples in context. Each SHACL/ShEx validation essentially tests a 'ForSome' case in predicate logic for the given question, where a subset of the domains and ranges can be tested.
- Create BDD-based acceptance tests and programmatic unit tests that exercise the logical constraints
- At this stage, all tests fail. To make them pass, implement the 'ForSome' closed-world assumption defined in the SHACL/ShEx validation, i.e. implement the representation so that a SPARQL query can answer the given contextual question for the subset cases. Then make the tests pass.
- Keep repeating the test-implement-refactor cycle until all tests pass against the given set of constraints, incrementally refactoring the ontology representation. The refactoring is mostly about building working generalizations that transform the closed-world assumption over asserted facts into the partial open-world assumption of unknowns over the entire set.
- Finally, when all tests pass, refactor the entire ontology so that it conforms to the open-world assumption for the entire set, i.e. 'ForAll, there exists', which can be further tested with SPARQL against a subsumption hypothesis.
- If the ontology needs to be integrated with other ontologies, build a set of specifications by example for the integration and implement a set of integration tests in the same manner.
- Furthermore, in any given question/answer case, identify topical keywords that provide bounded constraints for a separate ontology initiative; it may be helpful here to apply natural language processing techniques to exploit entity linking for reuse.
- Engineer all tests and implementations to follow best practices for maintainability, extensibility, and readability. The tests can be wired into continuous integration and a maintainable living-documentation process.
- Expose the ontology as a SPARQL API endpoint
- Apply a release and versioning process to your ontologies that complies with the W3C process
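The SHACL step above can be sketched with a toy shape. Everything here is hypothetical (the `ex:` namespace, the Employee/Organization example, and the specification it encodes); it shows how one specification by example becomes a closed-world 'ForSome' test over the asserted subset.

```turtle
# Hypothetical specification by example:
#   "Every employee works for exactly one organization."
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix ex: <http://example.org/ontology#> .

# SHACL node shape acting as the granular 'ForSome' test:
# for the asserted Employee instances, the ex:worksFor property
# must point to exactly one ex:Organization.
ex:EmployeeShape
    a sh:NodeShape ;
    sh:targetClass ex:Employee ;
    sh:property [
        sh:path ex:worksFor ;
        sh:class ex:Organization ;
        sh:minCount 1 ;
        sh:maxCount 1 ;
    ] .
```

Running a SHACL validator over the data graph with this shape graph initially reports violations; implementing the representation until the report is clean is what makes the test pass.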
- It is easier to go from a set of abstractions under a closed-world assumption to an open-world assumption than the reverse; a similar metaphor is moving from relational to graph versus graph to relational.
- Focus on making ontologies accessible to users
- OWA is all about incomplete information and the ability to infer new information; constraint-based testing may not exhaustively cover the search space, but one can still test against a subsumption hypothesis
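Testing a subsumption hypothesis with SPARQL can be sketched as follows, again with the hypothetical `ex:` example: if the hypothesis is that `ex:Employee` is subsumed by `ex:Person`, a query can look for counterexamples in the (inference-enabled) graph.

```sparql
# Hypothetical subsumption test: is every ex:Employee also an ex:Person?
PREFIX ex: <http://example.org/ontology#>

ASK {
    ?employee a ex:Employee .
    FILTER NOT EXISTS { ?employee a ex:Person . }
}
```

If the subsumption holds under the reasoner, the ASK finds no counterexample and returns false; a true result pinpoints where the open-world generalization still breaks down.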
Labels: big data, data science, linked data, machine learning, natural language processing, ontology, semantic web