Mabble Rabble

13 July 2013

Integration of Global Information

Books and periodicals are categorized and held in libraries. Artifacts are categorized in museums and galleries. Reports and papers are indexed in archives. Patents are registered and indexed in patent listings. Each of these places stores vast amounts of information. And yet every year we see many of them close down for lack of readership and finances. One thing such places lack is integration of resources and accessibility at a global scale. If only they could be connected globally, they could be reached over the internet by students, researchers, and anyone looking for information. Traditional book publishing is a struggling business. People are now looking to ebooks, online reading, and the flexibility of handheld devices. It seems only natural that there are multiple gains to be had from mutual integration of resources between such organizations, not to mention the cost savings and a collective financing option that could keep such organizations profitable and merit their future existence. Such places are often the cornerstone of learning outside of schools and universities. We need to support them before they all start to crumble into history. I feel linked data and the semantic web are the answer to making it all possible. The semantic web is a natural fit for libraries and archiving, and linked data is a natural fit for connecting the resources. Advertising revenue and a subscription model could even be achieved from such a collaborative network, which brings together researchers, students, and learners from all walks of life into one easily accessible network of resources. Just imagine the amount of searchable, categorized information that could be made available to all without anyone ever having to physically visit the organization. It seems a plausible option for so many struggling organizations.
At the same time, it means knowledge could be reached without bounds. A library ontology could be generated that establishes a network of shared resources for all. Most organizations should hold no reservations about such an idea, as it would mean wider accessibility by subscription and therefore more financial gain, plus their collections would get used a lot more. It seems only a matter of time before such endeavors occur, as the semantic web and linked data progress into mainstream use over the internet and technology becomes ever more ubiquitous.

CSS Frameworks

For small single page applications, a CSS framework can be an unnecessary complexity. For larger projects it can be a real time saver. Not all CSS frameworks support all browsers, or are even fully standards compliant. However, the two that I find useful and often worth mentioning are Twitter Bootstrap and Blueprint. These frameworks provide agility and flexibility with a conscious effort towards accessibility. They can be adapted to small and large projects with ease and can be structured to work with varying display sizes. A user interface design has to take into account not just a desktop browser but also mobile devices. It is often on mobile devices that the real usability of an application design emerges. Whether one requires a fluid or a fixed layout, it is how responsive the design is that matters most in reaching a far more varied audience.

Project Managers Are A Hindrance to Developers

Often in commercial environments there are multiple separations of roles that end up putting people in separate work streams. For a majority of developers, the biggest hurdle in a team is the project manager. Project managers rarely have the big picture, nor do they understand complexity. A lot of the time they don't even understand their own work. Developers rely on project managers to manage the project and the team. But what ends up happening is that most developers are forced to micromanage themselves because the project managers are just too incompetent at their own work. Project managers are even sent on formal courses in agile methodologies or the standard principles of project management. But when they come back they rarely practice any of them. The majority of project managers are focused on only one thing within a company, the ladder up to further management, rather than on being productive or results driven. They generally display no appreciation of technology and no understanding of the issues that developers face within certain processes. They even get in the way by adding excessive amounts of unnecessary politics between developers and themselves, and ultimately become the main reason projects go from great pieces of work to mediocre ones. Project managers have no understanding of code quality or architectural impedance. Their main aim is to look good to higher management, and they treat their development teams like factory workers. In smaller companies with a start-up atmosphere, the lack of project managers allows developers to take charge of their own work, and it is here that projects come alive through technical appreciation. In larger companies there are just too many incompetent project managers who have no understanding of the technical aspects and cannot relate to why certain pieces of work take longer than others.
They seem to have a vague idea of most things but show complete indifference towards understanding the developers' daily process cycle. As commercial environments become more and more dependent on developers, it is often the project managers who end up undermining their work and even their work ethic, to the extent that developer turnover increases. In most companies, developers rely on lead and senior developers for guidance. The lead and senior developers often rely on architects for guidance, and the project managers are so arrogant that they almost always encroach and get in the way of almost everyone. In the daily routine it may not be apparent at first, but as they settle into daily schedules of meetings, the patterns and behaviors of most project managers become apparent. They will run stand-ups and regular meetings, not because they want to keep the communication flowing, but so they can show senior managers what a great job they are doing in micromanaging teams or projects. In essence, what they are really doing is playing politics and, under the cover of process, quietly filling up their meeting calendars without being all that useful to developers. What should be a supportive role for a project manager becomes an obstructive one for a developer, with both competing on the same field for increments, bonuses, and salaries. In most companies, project managers end up earning far more than most developers, even though a developer does a lot of the work, including a lot of the project manager's work. How many times have developers found themselves writing acceptance tests from user stories, which is really the job of a project manager or business analyst? Such is the case in most companies.
The politics of work become the norm, driven by a facade of management practices that irritates most developers in their daily passion for technology and creativity.

6 July 2013

Staves - Facing West

Possibly the most droning music I have heard yet. This song almost seems like it was created out of sheer boredom and loneliness in a waterfront town by three women who had nothing else to do but wonder about some man. This man could be a wandering stranger looking for lodging or even a strict father figure. Either way, the song does have a meandering creepiness to it. On the BBC the song seems to be played a thousand times, almost enough to make one go mental, switch the channel, and really wonder what the point is in paying for a TV License. Thank heavens for digital TV with its vast array of channels. On YouTube this song receives quite a lot of approval, but I beg to differ. It is possibly the best song to have on when one is drunk and on the way home, to make one realize the sheer pleasure of sleeping, and eerie enough to suggest that life really can't be as boring as the nostalgic conundrums in the song.

facing west

23 June 2013

World War Z

An immensely boring and predictable movie from start to finish. The audience can pretty much anticipate what is about to happen throughout the movie. It is another one of those movies that tries to depict terror, panic, and fear in the lives of people at an emotional level, connecting them in some way or another to the realism of current world circumstances and of the unknown. In a lot of ways World War Z copies the story lines of previous movies, even sharing a few similarities with Cloverfield and War of the Worlds. It offers an almost narrow-minded portrayal of the unknown, interpreted as the complete destruction and annihilation of the human race. Another one of those fear-mongering types of movies. This one, however, has a monotonous plot and a serious boredom factor. I am not sure how anyone could avoid falling asleep halfway through watching it.

21 June 2013

Man of Steel

A masterpiece of action and storytelling, all wrapped up with a twist, that makes for an immensely entertaining movie. Man of Steel had a distinctive clarity from start to finish, from childhood details to the background of the superpowers. One could follow the movie throughout without it turning into an overly long-winded story line. Even Superman's love interest was introduced in a subtle way. It plays on the character; it plays on the human side as well as the not-so-human side. It even surprises and outshines the evil. Finally, we are shown what it takes for the world to acknowledge the unknown and the different.

9 June 2013

Data Science Meetups

Data science is all about telling a story through data that can be understood by non-practitioners. It combines theory with practice in the process of extracting meaning from data and creating data products that provide a method for producing and exposing information in specific contexts. This could involve data analytics, visualization, practical approaches to machine learning, data warehousing, and big data processing. In order to understand data science, one needs to get to grips with certain principles of database theory, the agile approach, and spiral dynamics. The practical steps generally involve the organizing, packaging, and delivery of data, where several questions need to be answered before data processing, or data mining in particular, can begin. The processes are then cycled through various technologies, which are currently undergoing explosive growth and change.

A really good way to start understanding the methods and approaches is through background reading, practice, and attending meetups. Meetups and conferences are held from time to time across the world. There is one data science meetup in particular that is quite active - data science

19 May 2013

Semantic Web and Linked Data Storage

The Semantic Web often depends entirely on an efficient back-end storage and indexing strategy, from which most of the processing stems. Leaving the most valuable aspect of a Semantic Web architecture until the end, and starting from the interface instead, is a bad move. One should always think through the data layer first. The Semantic Web is like a workflow of services in a pipeline and has to be thought through in that manner, as everything depends on resources and the active querying of those resources. In fact, extending the model by way of linked data VoID interlinks further extends the data requirements, often dramatically.
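
To make the "data layer first" point concrete, here is a minimal sketch of what that layer does at its core: it stores subject-predicate-object triples and answers pattern queries over them. The `TripleStore` class and the `ex:` identifiers are made up for illustration and do not correspond to any real triplestore's API; a production store adds indexing, persistence, and SPARQL on top of this idea.

```python
class TripleStore:
    """A toy in-memory store; real triplestores add indexes and persistence."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        """Store one (subject, predicate, object) statement."""
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return the sorted triples matching the pattern; None is a wildcard."""
        pattern = (subject, predicate, obj)
        return sorted(
            triple
            for triple in self._triples
            if all(want is None or want == got
                   for want, got in zip(pattern, triple))
        )


store = TripleStore()
store.add("ex:dune", "ex:author", "ex:herbert")
store.add("ex:dune", "ex:title", "Dune")
store.add("ex:emma", "ex:author", "ex:austen")

# Every statement that uses the ex:author predicate, whatever the subject.
authors = store.match(predicate="ex:author")
```

Every service further up the pipeline ends up issuing pattern queries of this kind, which is why the storage and indexing strategy has to be settled early.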

There are generally three ways of approaching a back-end for the semantic web. The first approach is to treat it as a pure W3C stack in a regular client-server model, the server being the triplestore and the client being the web interface of services. The second approach is to apply more granularity using property graphs with the TinkerPop framework. In this manner a whole range of graph properties and options for NoSQL emerge. The third approach is to apply a standard relational model and convert that into an RDF repository. In all three cases, an RDF interface layer similar to JDBC is required, as well as possibly a search indexing layer.
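
As a rough illustration of the third approach, the sketch below reads rows from a relational table and emits them as N-Triples lines. It uses only Python's built-in `sqlite3` module. The `book` table, the `http://example.org/` base URI, and the row-to-subject, column-to-predicate mapping are all assumptions chosen for the example, not a standard mapping, and every value is written as a plain string literal.

```python
import sqlite3

# Hypothetical base URI, used only for this illustration.
BASE = "http://example.org/"


def rows_to_ntriples(conn, table, key_column):
    """Yield one N-Triples line per non-null, non-key column value.

    `table` and `key_column` are interpolated into the SQL text, so they
    must come from trusted configuration, never from user input.
    """
    cursor = conn.execute(f"SELECT * FROM {table} ORDER BY {key_column}")
    columns = [description[0] for description in cursor.description]
    for row in cursor:
        record = dict(zip(columns, row))
        subject = f"<{BASE}{table}/{record[key_column]}>"
        for column, value in record.items():
            if column == key_column or value is None:
                continue
            predicate = f"<{BASE}{table}#{column}>"
            # Escape backslashes, quotes, and newlines for an N-Triples literal.
            literal = (
                str(value)
                .replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
            )
            yield f'{subject} {predicate} "{literal}" .'


def demo():
    """Build a small in-memory table and convert it to triples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, author TEXT)"
    )
    conn.execute("INSERT INTO book VALUES (1, 'Dune', 'Frank Herbert')")
    conn.execute("INSERT INTO book VALUES (2, 'Emma', NULL)")
    return list(rows_to_ntriples(conn, "book", "id"))
```

Calling `demo()` yields three lines: a title and an author for book 1, and only a title for book 2, because null columns produce no triple. The resulting lines could then be loaded into whichever RDF repository sits behind the interface layer.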

The two most common interface layers, which also have their own storage layers, are Sesame and Jena. Sesame is the more versatile of the two, providing more robust features, and a majority of the triplestores are based on its model. Jena appears to take a stricter W3C-driven approach. In both cases, the provided storage is not sufficient for production requirements, as the data can grow very quickly. One obviously has to leave room for current and future data needs. Often clustering is required to scale out the SPARQL queries. In almost all cases a read-only SPARQL endpoint has to be provided for users to interface with. SPARQL 1.1 has even added update operations such as insert. However, these should be restricted to admin level.
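
One simple way to picture the read-only restriction is a guard in front of the endpoint that only lets the four SPARQL query forms through. The sketch below is a deliberately naive check based on the first keyword after the prologue. The function names are hypothetical, and a real deployment would more likely rely on the triplestore's own access control or a proper SPARQL parser than on string inspection like this.

```python
import re

# The four SPARQL query forms; anything else is treated as a write or unknown.
READ_ONLY_FORMS = {"SELECT", "ASK", "CONSTRUCT", "DESCRIBE"}

# Leading material that may precede the query form.
_PROLOGUE = re.compile(
    r"""\s+                               # whitespace
      | \#[^\n]*                          # comment running to end of line
      | PREFIX\s+[^\s:<]*:\s*<[^>]*>      # PREFIX declaration
      | BASE\s*<[^>]*>                    # BASE declaration
    """,
    re.IGNORECASE | re.VERBOSE,
)
_KEYWORD = re.compile(r"[A-Za-z]+")


def first_keyword(query):
    """Return the first keyword after the prologue, upper-cased, or ''."""
    pos = 0
    while True:
        match = _PROLOGUE.match(query, pos)
        if match is None:
            break
        pos = match.end()
    word = _KEYWORD.match(query, pos)
    return word.group(0).upper() if word else ""


def is_read_only(query):
    """True only when the query starts with one of the four query forms."""
    return first_keyword(query) in READ_ONLY_FORMS
```

A public endpoint would reject any request for which `is_read_only` returns false, while an authenticated admin path would accept the SPARQL 1.1 update operations.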

Open source triplestores are generally quite limited for production use, so a workaround sometimes has to be applied to meet scalability and storage needs. Currently, the top performing triplestores include Virtuoso, OWLIM, and AllegroGraph, all very much commercial and with quite a large toolset. The next best triplestore would be Bigdata, which is a fairly good open source option providing clustering, sharding, and full-text indexing. It also has a ZooKeeper connector. For a property graph one can almost always use Neo4j or OrientDB. OrientDB provides a more liberal license. Solutions that use Hadoop as the underlying back-end storage layer will not perform very well, due to the nature of its distributed design. The storage layer could be deployed to a clustered, production-ready environment of 64-bit machines with 4-8 CPU cores.

The Semantic Web is really starting to take off and more and more interesting options are starting to emerge. However, it is still the case that open source solutions are lacking in production quality and are more experimental, for research use. The field is still dominated by commercial players who provide a Swiss army knife of solutions, at an obvious premium. There is still a lot to be done, even in making the Semantic Web more accessible for developers, as the W3C specifications can be quite complex and in a lot of ways there are just too many bewildering models to apply in varied combinations. Perhaps the introduction of JSON-LD will help make linked data more accessible for front-end developers. Simplicity and convergence are key in making the Semantic Web the next evolution for Big Data and the Internet.
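
To show why JSON-LD may feel more approachable, here is a small sketch of a JSON-LD document built as an ordinary Python dictionary. The `@context`, `@id`, and `@type` keywords are part of JSON-LD. The book record and the `example.org` identifier are invented for illustration, and `expand_terms` is a hypothetical helper that handles only flat term-to-IRI mappings, not a real JSON-LD processor.

```python
import json

# An ordinary JSON object; the @context maps short keys to full IRIs.
book = {
    "@context": {
        "name": "http://schema.org/name",
        "author": "http://schema.org/author",
    },
    "@id": "http://example.org/book/1",
    "@type": "http://schema.org/Book",
    "name": "Dune",
    "author": "Frank Herbert",
}


def expand_terms(doc):
    """Replace short keys with the IRIs declared in a flat @context.

    Handles only simple term-to-IRI string mappings; a real JSON-LD
    processor does far more (nested contexts, typed values, and so on).
    """
    context = doc.get("@context", {})
    return {
        context.get(key, key): value
        for key, value in doc.items()
        if key != "@context"
    }


# Front-end code can treat the document as plain JSON ...
serialized = json.dumps(book, indent=2)

# ... while linked data tooling reads the same keys as full IRIs.
expanded = expand_terms(book)
```

The appeal is that a front-end developer can read `book["name"]` without knowing any RDF, while the context still ties each key to a shared vocabulary.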

Java:
Sesame
Jena
TinkerPop
linkeddataapi
any23
marmotta
stanbol
rdf2go
sesametools
groovysparql
pellet
owl-api
jsonld for java

Python:
Redland
RDFLib
Bulbflow
RDFAlchemy
Fuxi
Surf
ORDF
Django-rdf
Djubby
pysparql
sparta
Oort
sparqlwrapper

JavaScript/Node.js:
RDFQuery
Tabulator

Semantic NLP:
KEA
OpenNLP
DBpedia Spotlight
Maui

Graph stores:
Neo4j
OrientDB
AllegroGraph
Virtuoso
Bigdata
Ontotext
Titan
Stardog

W3C:
SPARQL 1.1
RDF
JSON-LD

Reconciliation:
Google Refine
