17 March 2014

Vert.x

A new breed of event-driven, asynchronous I/O server-side frameworks has emerged. Most developers are already aware of Node.js as the JavaScript-aligned example. However, Vert.x is emerging as the pinnacle of simplicity and sophistication for the JVM developer. The model seems so flexible and scalable that it merits a clear win over Node.js. Perhaps the underlying question here is between the Play Framework and Vert.x. Although the platform is still fairly new, it does support polyglot programming. It would certainly help if more plugins were available on the platform to make life even easier. Re-inventing the wheel may be the way to go with Vert.x, but perhaps the pain of starting from scratch is far outweighed by the sheer simplicity and versatility of the approach. In time we will see how Vert.x fares against the Node.js approach as more application use cases reach production. At least for security, and even for scalability, Vert.x is a clear winner. Vert.x will be much preferred over Node.js in the financial services community, but also by big data miners who may have a sheer dislike of the JavaScript coding style. There will be a day of reckoning as to how such frameworks converge over time and prove their usefulness. It also seems that the development community is rather unfazed by such frameworks, in a form of reluctance; maybe that is because they are still too new to balance developer training against the cost of new development.
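
To make the event-driven model concrete, below is a minimal sketch of a non-blocking HTTP server, assuming the Vert.x 2.x Java API (package and class names may differ in other releases):

    import org.vertx.java.core.Handler;
    import org.vertx.java.core.http.HttpServerRequest;
    import org.vertx.java.platform.Verticle;

    // A verticle is Vert.x's unit of deployment; the event loop invokes the
    // request handler for each incoming request, so nothing here blocks.
    public class HelloVerticle extends Verticle {
        @Override
        public void start() {
            vertx.createHttpServer()
                 .requestHandler(new Handler<HttpServerRequest>() {
                     @Override
                     public void handle(HttpServerRequest request) {
                         request.response().end("Hello from Vert.x");
                     }
                 })
                 .listen(8080);
        }
    }

The same handler could be written in JavaScript, Groovy, or another supported language, which is the polyglot appeal noted above.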

Freebase vs DBpedia

Freebase and DBpedia are both community-supported semantic databases of structured content. However, Freebase is curated around people, places, and things, while DBpedia extracts most of its information from Wikipedia. The two databases are now interlinked as linked data, albeit only across a partial set of topics. Their approaches differ: Freebase uses its own Metaweb Query Language (MQL), whereas DBpedia supports SPARQL. DBpedia also has a large academic community with other semantic side projects, whereas Freebase is owned by Google. To be classed as open data, such databases need to share a very open licensing policy. The choice of which one to use depends entirely on one's application needs. The curated data may differ across the two databases based on the data sources; in fact, even Freebase uses extractions from Wikipedia. They both have different goals, schemas, and identifiers. Freebase is perhaps more diverse in its use of data sources, and it also lets users freely curate the data directly, whereas to get an update into DBpedia one would first have to update Wikipedia. Even the storage structure is slightly different: Freebase is based on n-tuples whereas DBpedia is based on RDF. Freebase is more in tune with the open data community, whereas DBpedia tries to follow the stricter approaches of the Semantic Web. Tool development for DBpedia is mostly third-party, while for Freebase it comes mostly from Google as well as the user community. One could use both as linked data to build meshable vocabularies, taxonomies, thesauri, or even topic maps.
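
As a small illustration of the SPARQL side, here is a sketch that sends a query to the public DBpedia endpoint over plain HTTP and prints the raw JSON result; the endpoint URL and the format parameter reflect the public service as I understand it and may change:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.net.URLEncoder;

    public class DBpediaQuery {
        public static void main(String[] args) throws Exception {
            // Ask DBpedia for a handful of resources typed as dbo:Scientist.
            String sparql = "PREFIX dbo: <http://dbpedia.org/ontology/> "
                          + "SELECT ?person WHERE { ?person a dbo:Scientist } LIMIT 5";

            // The endpoint takes the query as a URL parameter and can return
            // SPARQL results as JSON when asked via the format parameter.
            String url = "http://dbpedia.org/sparql"
                       + "?query=" + URLEncoder.encode(sparql, "UTF-8")
                       + "&format=" + URLEncoder.encode("application/sparql-results+json", "UTF-8");

            HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
                String line;
                while ((line = in.readLine()) != null) {
                    System.out.println(line); // raw JSON bindings for ?person
                }
            }
        }
    }

A comparable lookup against Freebase would instead go through its MQL read API with a JSON query, which is where the two services feel most different in practice.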

16 March 2014

Beautiful Data

Data can take on many dimensions, shapes, contexts, and infinite forms. It is no wonder that information is really the only boundary on how much can be elaborated from data. A single set of data can serve many informational uses. Perhaps it is the information that is beautiful, not really the data that is collected. The elegance of the stories that can be told is the informational driver of a data source. Visualizations inspire insights that are driven through the lens of information and contextualized from a set of data sources.

beautiful data images
visualizations awards
beautiful data with oreilly

Aurora Borealis

Interesting images on Flickr of the aurora borealis give a new dimension to the natural phenomenon. Simulations at such a level could give huge meaning to the way we illuminate data as well. Nature is often the miracle cure for many web and artificial intelligence challenges.

aurora borealis flickr

Wolfram Language

Gone are the days when programmers had to define all of their own abstractions. Knowledge-based languages are almost the new approach to making life easier. The Wolfram Language is symbolic, natural, knowledge-driven, and extremely large, yet it can be used in a multitude of specialized domains. Mathematica uses it. Wolfram|Alpha uses it. What is so powerful is that knowledge is pre-built into the language, making it aware of its domain semantics in a programmable context. It makes the input of data and the translation of output much easier, enabling it to represent arbitrary data with ease. What the Wolfram Language attempts to do is make the world more computable, rather than just generating information. General in its approach, it combines a multitude of programming paradigms, from symbolic computation to functional and rule-based styles.
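
To get a feel for that pre-built knowledge from the JVM side, here is a sketch that sends a free-form question to the Wolfram|Alpha v2 query API; the endpoint, parameters, and the placeholder app id are assumptions about the public web service rather than anything specific to the language itself:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URL;
    import java.net.URLEncoder;

    public class WolframAlphaQuery {
        public static void main(String[] args) throws Exception {
            String appId = "YOUR_APP_ID";      // placeholder developer key
            String question = "distance from earth to mars";

            // The query API answers free-form input with XML "pods" that
            // carry the computed knowledge behind the result.
            String url = "http://api.wolframalpha.com/v2/query"
                       + "?appid=" + URLEncoder.encode(appId, "UTF-8")
                       + "&input=" + URLEncoder.encode(question, "UTF-8");

            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(new URL(url).openStream(), "UTF-8"))) {
                String line;
                while ((line = in.readLine()) != null) {
                    System.out.println(line);  // raw XML result pods
                }
            }
        }
    }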

Visualizing Changes In Habits

It would be useful to have an application that can visualize the characteristic habits of people on the web and how they change over time. In the process, one can develop a sense of deduction and conduct sophisticated behavior analysis. Understanding people and how social phenomena develop on the web can provide cues for predicting future intents within any given topical constraint of interest. Such an application would not be intrusive but assistive, helping to provide better all-round optimized services on the web. Underlying habits define the core of what develops into the current tastes, trends, and interests of individuals or of particular communities. It is almost a graphical depiction of the sociology of the web. Visualizations often tell more than words ever could, almost defining and exposing hidden facts. Even the domains can be contextualized to provide further focus points.

Web Search For Missing Persons

Public services really need to open up a lot of their data, especially for emergencies, as communities can really help by identifying patterns and trends in the data. One area that is severely lacking is the assistance available to parents or families who have to struggle through information on missing persons. They often have to rely on public services that are at times incompetent, slow, and restrictive in their investigations. Perhaps there should be a web search that lets people search through missing-persons records, track histories and whereabouts, and map associations that can link them to specific targets, providing a way to locate both the victims and the perpetrators. At least it would help speed things up and give a victim the chance to be found before they are hurt or lose hope. Many families go through such an ordeal never to find their missing members again. More really needs to be done to make government databases available for searching, and to let members of the community filter and gather results so people are able to search freely. The Semantic Web, geo-location awareness, and machine learning are a few approaches that could help in such a domain with analysis and with contextualizing informational knowledge from data.

Google Hummingbird

Hummingbird is a new approach devised by Google to search and sort through the information contained in web pages, together with the context of queries, to return the best answers. In fact, it works in addition to PageRank, the often-embedded ranking algorithm. What the name essentially implies is precise and fast. Search momentum is towards more conversational approaches, leaning towards question answering that provides better results based on intent and context. The enhancement offers a way to connect a user's queries in context for smarter search results. A funneled approach to search would imply a first stage of browsing behavior, next the shortlisting of result sets, and perhaps even a final condensation into buying as a form of intent. The top of the funnel almost mirrors browsing for information, the layer down implies that one is at a stage of exploring options, and the stage below that implies an intention to buy. Perhaps this actionable intent provides for smarter content generation for publishers and better contextual marketing, which implies better visibility, more valid answers to questions, and an increase in the contextual value of content. At least theoretically speaking, as the new search algorithm is still quite recent.