Spark
Flink
DataFlow/Beam
Streamsets
awesome streaming
29 October 2016
26 October 2016
Machine Learning Taxonomy
Machine learning is about designing algorithms that give a computer the means to learn, often by finding patterns in data. The list below outlines the key areas of the machine learning taxonomy.
Unsupervised Learning
Reinforcement Learning
Transduction
Learning to Learn
Scala Data Tools
Below is a list of general mathematics and machine learning data tools that have emerged in Scala, aside from the Hadoop and Scala APIs for databases.
- Algebird: Twitter’s API for abstract algebra that can be used with almost any Big Data API.
- Factorie: A toolkit for deployable probabilistic modeling, with a succinct language for creating relational factor graphs, estimating parameters, and performing inference.
- Figaro: A toolkit for probabilistic programming.
- H2O: A high-performance, in-memory distributed compute engine for data analytics. Written in Java with Scala and R APIs.
- Relate: A thin database access layer focused on performance.
- ScalaNLP: A suite of Machine Learning and numerical computing libraries. It is an umbrella project for several libraries, including Breeze, for machine learning and numerical computing, and Epic, for statistical parsing and structured prediction.
- ScalaStorm: A Scala API for Storm.
- Scalding: Twitter’s Scala API around Cascading that popularized Scala as a language for Hadoop programming.
- Scoobi: A Scala abstraction layer on top of MapReduce with an API that’s similar to Scalding’s and Spark’s.
- Slick: A database access layer developed by Typesafe.
- Spark: The emerging standard for distributed computation in Hadoop environments, as well as in Mesos clusters and on single machines (“local” mode).
- Spire: A numerics library that is intended to be generic, fast, and precise.
- Summingbird: Twitter’s API that abstracts computation over Scalding (batch mode) and Storm (event streaming).
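Several of the tools above (Algebird and Summingbird in particular) build on abstract algebra, where an aggregation is described as an associative combine operation with an identity element, i.e. a monoid. The following is a minimal sketch of that idea in plain Python, not the API of any of the listed libraries. The names `Monoid`, `word_count`, and `merge_counts` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Callable, Generic, Iterable, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Monoid(Generic[T]):
    """An associative combine operation (plus) with an identity element (zero)."""
    zero: T
    plus: Callable[[T, T], T]

    def sum(self, items: Iterable[T]) -> T:
        # Fold all items together, starting from the identity element.
        return reduce(self.plus, items, self.zero)


def merge_counts(a, b):
    # Combine two word-count dictionaries without mutating either input.
    out = dict(a)
    for word, count in b.items():
        out[word] = out.get(word, 0) + count
    return out


# Hypothetical monoid instances, for illustration only.
int_sum = Monoid(0, lambda a, b: a + b)
word_count = Monoid({}, merge_counts)

# Two "partitions" are aggregated independently and then merged.
partition_a = [{"spark": 1}, {"flink": 1}, {"spark": 1}]
partition_b = [{"spark": 1}, {"storm": 1}]
partial_a = word_count.sum(partition_a)
partial_b = word_count.sum(partition_b)
total = word_count.plus(partial_a, partial_b)
```

Because the combine operation is associative, merging the partial results gives the same answer as aggregating everything in one pass, which is what makes this style a good fit for distributed batch and streaming jobs.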
25 October 2016
Reactive Manifesto
The Reactive Manifesto is an effort to provide a definition of what a reactive system should look like with four sets of characteristics:
- Message or Event-driven: As a baseline the system needs to respond to messages or events
- Elastically Scalable: System needs to meet scale out demands (horizontal scaling via processes, cores, nodes)
- Resilient: System needs to be able to recover gracefully from failures
- Responsive: System is available for service requests even if this means graceful degradation of failed components during high traffic
Reactive Extensions
Functional Reactive Programming
Akka (Actors Model)
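The message-driven and resilient characteristics can be illustrated with a toy actor: senders only enqueue messages, and a failing message does not take the actor down. This is a single-threaded sketch in plain Python under simplifying assumptions, not Akka's API. The names `Actor`, `tell`, and `add_number` are hypothetical, and real actor systems add concurrency, supervision hierarchies, and distribution.

```python
from collections import deque


class Actor:
    """Processes one message at a time from a private mailbox."""

    def __init__(self, handler):
        self.handler = handler   # function (state, message) -> new state
        self.mailbox = deque()
        self.state = 0
        self.failures = 0

    def tell(self, message):
        # Fire-and-forget: the sender only enqueues and never calls the handler.
        self.mailbox.append(message)

    def run(self):
        # Drain the mailbox in arrival order.
        while self.mailbox:
            message = self.mailbox.popleft()
            try:
                self.state = self.handler(self.state, message)
            except Exception:
                # Resilience: skip the bad message, keep the last good state.
                self.failures += 1


def add_number(total, message):
    # Raises ValueError for messages that are not integers.
    return total + int(message)


counter = Actor(add_number)
for msg in ["1", "2", "oops", "4"]:
    counter.tell(msg)
counter.run()
```

After the run, the actor has absorbed the three valid messages and recorded one failure, so later messages were still served despite the bad one.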
Labels: big data, data science, distributed systems, event-driven, intelligent web, Java, scala, software engineering
21 October 2016
Alternatives to Kafka
Kinesis
RabbitMQ
ZeroMQ
Kudu
Storm
Samza
SQS
Redis
Aeron
MAPR Streams
Kafka for Beginners
Confluent
Note that Storm and Samza can in fact be used alongside Kafka in a data pipeline. The right choice depends on how one plans to use a platform, which is invariably dictated by the constraints of the problem at hand, such as whether the data arrives in batches or as real-time streams.
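The batch versus streaming distinction can be sketched with an append-only log that several consumers read at their own pace, which is the general idea behind placing a log in front of stream processors. The code below is an in-memory stand-in written in plain Python. It is not the Kafka client API and does not model Storm or Samza, and the names `Log`, `Consumer`, and `poll` are hypothetical.

```python
class Log:
    """In-memory stand-in for a single append-only topic partition."""

    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)
        return len(self._records) - 1  # offset of the new record

    def read(self, offset, max_records=None):
        # Return records starting at offset, optionally capped in number.
        end = len(self._records) if max_records is None else offset + max_records
        return self._records[offset:end]


class Consumer:
    """Tracks its own read position, independent of other consumers."""

    def __init__(self, log):
        self.log = log
        self.offset = 0

    def poll(self, max_records=None):
        records = self.log.read(self.offset, max_records)
        self.offset += len(records)
        return records


log = Log()
for reading in [3, 5, 8]:
    log.append(reading)

batch_job = Consumer(log)
stream_job = Consumer(log)

# Batch style: process the whole backlog in one go.
batch_total = sum(batch_job.poll())

# Streaming style: process one record at a time, keeping a running total.
running_totals = []
total = 0
while True:
    records = stream_job.poll(max_records=1)
    if not records:
        break
    total += records[0]
    running_totals.append(total)
```

Both consumers read the same log without interfering with each other, so one pipeline can feed a batch job and a streaming job at the same time.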
18 October 2016
Beer Slangs
Homebrew uses a beer analogy as a macOS package manager. Beer is also a staple of social gatherings in the data science field, and it has become an essential element of society. Over the years it has acquired a diverse set of regional slang terms, as well as a variety of flavors from around the world. One could even build an ontology for beer, treating it both as a concept and as a product with a set of ingredients, categories, and tastes. Such an ontology could help people explore their preferences and produce a recommendation graph tied to their evolving tastes, their meetups, and their choice of food accompaniment.
beer slang
thrillist
beerslanging
15 brewtastic ways say beer
craftbeer
alldownunder
irishdrinking
1800s beer slang
Labels: beer, big data, data science, food, machine learning, recommender, sentiments, society, uk
13 October 2016
Frozen Yogurts in London
Frozen yogurt is an interesting analogy for applying machine learning, or more specifically data science, to understanding the customer based on their choice of scoops and flavors. Analytics has given way to self-service frozen yogurt, which puts the choice of flavors in the hands of the user and in the process improves the customer experience. This marks a value shift towards the user and the data associated with them. It also shows that huge investments are not needed to shift business models: self-service actually reduces labor costs. This is all part of using analytics to maximize revenue. By shifting control to the user, a business can give customers greater satisfaction and a sense of assurance that they are getting their money's worth. The list below provides a few interesting frozen yogurt places in the dynamic society of London.
Pinkberry
Snog
Itsu
Frae
Moosh
Moto Yogo
Yoomoo
Yogland