Scalding
Algebird
Finagle
FlockDB
Finatra
Ambrose
Parquet
Summingbird
Bootstrap
Bower
Flight
Twemcache
Heron
Cassovary
16 December 2016
8 December 2016
Web Scraping Services
Labels: big data, data science, distributed systems, intelligent web, metadata, natural language processing, semantic web, text analytics, webcrawler, webscraper, webservices
27 November 2016
PoolParty Academy
PoolParty Academy is an initiative that offers certifications in semantic web technologies, the PoolParty Semantic Suite, and integration for metadata management. The list below highlights the certification roles on offer. PoolParty is used extensively in industry to manage SKOS-based semantic schemas and linked data, which are applied to enrichment tasks for knowledge engineering in natural language processing.
- Semantic Web Associate
- Knowledge Engineering Specialist
- Semantic Integration Expert
5 November 2016
2 November 2016
Clueless Interviewers
Picture being stuck in a room with an interviewer and realizing, partway through, that the person has no clue what they are talking about, and yet they are recruiting for a Big Data Engineer. Such interview episodes seem to be a common occurrence in the Big Data world, where even managers and architects have no idea what Big Data is about or how to tackle it for their next project. One would suppose that the first step would be to recruit sufficiently skilled individuals through a sufficiently experienced and professional hiring practice. It is even worse when the interviewer comes back with feedback that clearly displays a lack of understanding of Big Data concepts, leaving not only a bad taste but also a humorous impression of the interviewer's confidently stated opinion. Such roles are even more difficult for human resources to recruit for, as the number of keywords far outweighs their often limited vocabulary. The dynamic nature of Big Data is also a challenge, as companies want pragmatic and cost-effective ways of implementing for the future. Training the management is often the right first step towards defining a convincing strategy for adopting Big Data in projects. There is also a higher level of risk involved due to the heavy requirements of data cleansing and of moving data out of silos and into a data lake. Many companies are still struggling to understand the Semantic Web and Linked Data and the benefits of such an approach for Big Data. The complexity of the domain is often met with clueless management, interviewers, and human resources personnel who are themselves adapting to a dynamic environment, one in which newly recruited data engineers and data scientists are expected to provide considerable input and guidance towards the shift to Big Data adoption.
Talented engineers are often frustrated when they are judged on the number of keywords on a CV rather than the context in which those skills were used, and by bemused interviewers trying to get by with only a few whimsical attempts to understand the Big Data landscape. Many data engineers have to wear multiple hats while steeped in annoyance at inept data scientists. And, as we know, everyone is calling themselves a data scientist these days. How many of them are actually qualified to do the job, with a well-rounded skill set in the area, is fairly doubtful.
Labels: big data, data science, distributed systems, Java, linked data, machine learning, microservices, python, scala, semantic web
CAP Theorem
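The following guarantees of Brewer's Theorem (the CAP Theorem) are a balancing act in a distributed system, especially in the context of big data. A distributed system cannot provide all three at once, so during a network partition it has to trade consistency against availability.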
- Consistency
- Availability
- Partition Tolerance
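The trade-off between these guarantees can be sketched with a toy two-replica store. This is only an illustration under simplifying assumptions, not how any real database works: the names `Replica`, `TinyStore`, `mode` and `partitioned` are made up for the example, and a "partition" is just a flag that stops the two replicas from talking to each other.

```python
class Replica:
    """One copy of a single stored value (hypothetical, for illustration)."""

    def __init__(self, name):
        self.name = name
        self.value = None


class TinyStore:
    """Toy store with two replicas.

    mode="CP": refuse writes during a partition (stay consistent).
    mode="AP": accept writes during a partition (stay available).
    """

    def __init__(self, mode):
        self.mode = mode
        self.a = Replica("a")
        self.b = Replica("b")
        self.partitioned = False  # True means a and b cannot communicate

    def write(self, value):
        # All writes arrive at replica a in this sketch.
        if self.partitioned:
            if self.mode == "CP":
                # Give up availability rather than let the replicas diverge.
                raise RuntimeError("unavailable: cannot reach replica b")
            # AP: accept the write locally; replica b is now stale.
            self.a.value = value
            return
        # No partition: both replicas are updated together.
        self.a.value = value
        self.b.value = value

    def read_from_b(self):
        return self.b.value


if __name__ == "__main__":
    ap = TinyStore("AP")
    ap.write("v1")
    ap.partitioned = True
    ap.write("v2")
    print(ap.a.value, ap.read_from_b())
```

In the AP store the two replicas disagree after the partitioned write, while a CP store would have rejected that write instead. Either way, one of the guarantees is given up once the network is split.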
29 October 2016
26 October 2016
Machine Learning Taxonomy
Machine Learning is about designing algorithms that give a computer the means to learn, often by finding patterns in data. The list below outlines the key taxonomy areas of machine learning.
- Unsupervised Learning
- Reinforcement Learning
- Transduction
- Learning to Learn
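To make "finding patterns in the data" concrete, here is a minimal sketch of one unsupervised technique, k-means clustering on one-dimensional points. The function name `kmeans_1d`, the sample points and the starting centroids are all assumptions chosen for the example; a real project would more likely use an established library.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Group 1-D points around centroids without using any labels."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


if __name__ == "__main__":
    data = [1.0, 1.2, 0.8, 9.0, 9.5, 8.5]
    centers, groups = kmeans_1d(data, centroids=[0.0, 5.0])
    print(centers)
    print(groups)
```

No labels are supplied here. The two groups emerge only from how close the points are to each other, which is what separates unsupervised learning from approaches that learn from labelled examples.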