Understanding our Earth is important because it yields valuable answers. Nature gives us a means of developing cures as well as a way of understanding our place on Earth. As humans we want to be able to track all living things on Earth and develop a connection with them. Nature also provides a huge amount of data on human life and its transitive effects over time. Taking inspiration from Noah could be one way of improving on the Linnaean taxonomy schema and providing a universal Semantic Web of Nature. This would give scientists linked data and allow connected services to be developed for much-needed research. One could also draw on the open community to modify a type as new species and subspecies are discovered, similar in approach to Freebase and Wikipedia. Knowledge discovery in such a connected manner would open up many options for research collaboration and for sharing both interests and findings. Often, having two of each is enough to build a taxonomy, as with Noah's Ark. Linked data on nature and wildlife would also enable more applications for tracking animal behavior patterns and for detecting when species become at risk of extinction. Even total population counts through animal tagging can be semantically enabled. Perhaps the Internet of Things approach would even take the ubiquity of such applications to new heights. One could even build a taxonomy of animal communication and provide natural language parsers for this domain. The Semantic Web holds the key to unlocking much of the untapped potential of today's machines, providing much-needed intelligence for smart applications, especially in the real world, where problems are complex not only in their uncertainty but also in their many dimensions. There is much that we still do not know about the world we live in.
The more we are able to contextualize and use machines for reasoning, the more productive and efficient we become in discovering knowledge.
20 July 2014
19 July 2014
Open Source ETL
ETL tools are fundamental to an enterprise data workflow, especially as part of data integration. First, data is extracted from external sources. It is then transformed, passing through a quality assurance process, to meet specific needs. Finally, it is loaded into the target database. With extensive and diverse big data needs, the role of ETL tools in data processing has become ever more important. There are plenty of commercial and open source tools on the market, and sometimes designing one's own solution is preferable to a third-party option. Below is a list of tools and libraries that are available as open source alternatives, each with its own approach and limitations. One can also use the cloud, especially AWS EMR, for the same ETL purpose.
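As a rough illustration of the three steps, here is a minimal home-grown sketch in Python using only the standard library. The sample CSV, the `payments` table, and the function names are all made up for the example; a real pipeline would read from actual sources and apply its own quality rules.

```python
import csv
import io
import sqlite3

# Hypothetical raw input: one row has a malformed amount on purpose.
RAW_CSV = """id,name,amount
1, Alice ,10.50
2,Bob,not-a-number
3,Carol,7
"""


def extract(text):
    """Extract: read rows from a CSV source (here an in-memory string)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: trim whitespace, coerce types, drop rows that fail validation."""
    clean = []
    for row in rows:
        try:
            clean.append((int(row["id"]), row["name"].strip(), float(row["amount"])))
        except ValueError:
            continue  # simple quality gate: skip malformed records
    return clean


def load(records, conn):
    """Load: write the cleaned records into the target database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS payments "
        "(id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO payments VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl(text, conn):
    """Run extract, transform and load in order and return what was stored."""
    load(transform(extract(text)), conn)
    return conn.execute("SELECT id, name, amount FROM payments ORDER BY id").fetchall()


if __name__ == "__main__":
    print(run_etl(RAW_CSV, sqlite3.connect(":memory:")))
```

Keeping each step as its own function makes it easier to swap the source, the validation rules, or the target later, which is much of what the dedicated tools offer at a larger scale.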
Labels: big data, data science, databases, Java, metadata, scala, software engineering, spring
16 July 2014
Metadata And Catalogs
Book publishing is big business. Over time, however, it has become more and more competitive, both because of Amazon and because more people are turning to eBooks. Cataloging is also a major focus for libraries, so the workflow has both a downstream and an upstream side. Metadata is critical to most publishing and cataloging endeavors. There have been many evolving metadata initiatives in the past, there are many at present, and more are on the horizon. Many community works, especially in research, are also incorporating open metadata with annotations. In the long run, linked data will prove quite useful for connecting publishers and libraries in a web of interconnected data. There may even be a synergy between publishers, libraries, educators, and learners, as each plays a role in the various workflow processes. Such developments will also bring many data integration challenges. The links below provide plenty of food for thought on the area and on the direction in which it is moving.
15 July 2014
Germany vs Argentina
The final of the 2014 World Cup reached millions of viewers last Sunday. Both Germany and Argentina displayed superb football skills and showed their passion for the game, as well as pride in their nations, as they took to the field. The game intensified towards the end of the first half as both sides let off steam and pushed for goals going into the second half. During the game one could well have wondered whether it would go to penalties in the end. However, an unexpected goal from Götze ended Argentina's dreams of World Cup glory in a matter of seconds. Lionel Messi, for whatever reason, did not have what it takes to clinch a victory for Argentina, and throughout the game he was relatively unimpressive, almost on the sidelines. Even his last free kick could not offer any surprise hope in the remaining minutes of the game. As Germany celebrated their victory, the Argentina team could be seen utterly distraught. Still, both sides played an impressive game. Argentina had many chances to score throughout the game and yet kept missing, to the surprise of their fans. The celebrations in Germany will mark the well-deserved performance of their national team and provide much to learn from their efficient play during the whole World Cup. It seems the World Cup came and swiftly left us. For some it was filled with surprises, while for others it was about real passion for the game. One must wonder what Brazil will do now after the event. Will they attempt to resolve the real issues of their nation, or continue to spend on such events while their people struggle through economic hardship on the streets? While many watched the World Cup, the reality is often laid bare towards the end. All things come to an end as we look back on the games, and then look to the future.
13 July 2014
Big City Parking
In big cities parking can be a real struggle, especially during rush hour or when holiday events are on. Driving around looking for a parking space is a waste of time. One wants to be able to track available parking and, for private lots, to book in advance, or at least to use geolocation to find an empty spot within an urban grid. Linked Data could play a part in this as well. The first step is to build an ontology of public and private parking. The next is to use linked data to query resources across the web of interlinked data. Other resources such as GeoNames could come in handy. One could then build a mobile application that consumes these services, looks up the nearest empty parking spot in the vicinity of a particular location, and finds spaces in private lots where advance booking may be available. In certain areas there may even be schemes where private residences offer their parking spaces for rent, which could provide an additional point of lookup. Such services are handy for the traveller who needs a parking space in a densely populated city area and wants to avoid frustration. They also help to reduce double parking, as well as the traffic caused when vehicles slow down for lack of available parking. Monitoring congestion and events in the area could also inform travellers as to whether parking is likely to be difficult at a particular time of day or season. Linking such methods to sat-navs would be quite useful too.
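The nearest-empty-spot lookup can be sketched in a few lines of Python. This is only an illustration under assumptions: the `ParkingSpot` record, its fields, and the sample coordinates are hypothetical, and in practice the spots would come from a linked data query rather than an in-memory list. Distance is the great-circle (haversine) distance.

```python
import math
from dataclasses import dataclass


@dataclass
class ParkingSpot:
    """Hypothetical record for one parking space."""
    spot_id: str
    lat: float
    lon: float
    is_free: bool
    is_private: bool = False  # private lots might support advance booking


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def nearest_free_spot(spots, lat, lon, max_km=1.0):
    """Return (distance_km, spot) for the closest free spot within max_km, or None."""
    candidates = [
        (haversine_km(lat, lon, s.lat, s.lon), s) for s in spots if s.is_free
    ]
    candidates = [c for c in candidates if c[0] <= max_km]
    if not candidates:
        return None
    return min(candidates, key=lambda c: c[0])


if __name__ == "__main__":
    # Made-up sample spots around central London.
    spots = [
        ParkingSpot("A", 51.5074, -0.1278, is_free=False),
        ParkingSpot("B", 51.5080, -0.1280, is_free=True),
        ParkingSpot("C", 51.5200, -0.1000, is_free=True, is_private=True),
    ]
    print(nearest_free_spot(spots, 51.5074, -0.1278))
```

A linear scan is fine for a handful of spots; a city-wide service would more likely rely on a spatial index or a geospatial query in the data store.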
IBM Global Parking Survey
Ordnance Survey
Data.gov.uk
Data.gov
Freebase
Labels: big data, data.gov, data.gov.uk, intelligent web, linked data, semantic web, travel, visualization
Data Serialization
Serialization is the step of converting objects or data states into a particular storage format so that they can later be reconstructed for further processing. The process of serializing objects is called marshalling, and the process of extracting data by deserializing bytes is called unmarshalling. Data serialization provides a method for persisting objects to storage, for making remote procedure calls, for distributing objects, and for detecting changes in data. Object serialization is supported by many languages. However, different serialization formats offer different trade-offs in performance and in flexibility across domain contexts. Big data workloads often rely on serialization formats that are not only compact but also provide native support for partitioning and for schema evolution. In other cases it may be more appropriate to rely on the text formats XML and JSON, which are human-readable and can express composite fields and hierarchical data.
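A small Python sketch of marshalling, unmarshalling, and change detection, using only the standard library. The `record` dictionary and the helper names are invented for the example; JSON stands in for the text formats and `pickle` for a language-specific binary format, and neither is meant as a recommendation over Thrift, Avro, or Protocol Buffers.

```python
import hashlib
import json
import pickle


def to_json_bytes(obj):
    """Marshal to canonical JSON bytes (sorted keys, no extra whitespace)."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def from_json_bytes(data):
    """Unmarshal JSON bytes back into Python objects."""
    return json.loads(data.decode("utf-8"))


def fingerprint(obj):
    """Hash the canonical serialized form, so that equal data gives equal hashes."""
    return hashlib.sha256(to_json_bytes(obj)).hexdigest()


if __name__ == "__main__":
    # Hypothetical record used only for illustration.
    record = {"id": 7, "name": "sensor", "readings": [1.5, 2.0]}

    # Round trip through a text format and through a binary format.
    assert from_json_bytes(to_json_bytes(record)) == record
    assert pickle.loads(pickle.dumps(record)) == record

    # Change detection: compare fingerprints instead of whole objects.
    changed = dict(record, name="sensor-b")
    print(fingerprint(record) == fingerprint(changed))
    print(len(to_json_bytes(record)), len(pickle.dumps(record)))
```

Sorting the keys before hashing matters: without a canonical form, two equal dictionaries built in a different order could serialize to different bytes and look like a change.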
Comparison of Thrift vs ProtoBuff vs Avro
Comparison of Data Serialization Formats
Understanding RDF Serialization Formats
RDF And Serialization Formats
Thrift
Avro
JSON
JSONLD
YAML
Protocol Buffers
MessagePack
XML
XML-RPC
Labels: big data, computer science, data science, intelligent web, linked data, programming, semantic web, software engineering
List of Useful Python Libraries
There is an ever-increasing number of third-party libraries available for Python, helped by the language being open source and accessible to all. The link below provides an aggregated list, organized by category, of some useful libraries available in the open community.
awesome python
12 July 2014
Brazil vs Netherlands
Brazil were at a loss from the start. Their hopes and dreams had already vanished in the previous game against Germany, when they were horribly beaten on their own turf. As the game went on, a few Brazilian fans could be heard in the distance cheering for the team amid the disappointment of the majority. To add further dismay, a penalty was given which put the Netherlands in the lead, pushing Brazil into an even darker hole of gloom. Their weaknesses in defense showed throughout the game, as was evident in the goals the Netherlands kept mounting against them. The more goals were scored, the less Brazil were seen to fight back. Perhaps they just did not have the muscle left, or the reason to fight. Was third place not good enough to fight for? It seems Brazil were still too much in the shadow of what they had lost to find any purpose in their play against the Netherlands, even though they had nothing left to lose. The Netherlands demonstrated their prowess and brought on the goals to clinch their win by the end. It was a one-sided game from start to finish. There was nothing left to be shocked by, as so many had anticipated the Netherlands' dominance over Brazil.