3 February 2017

Containerization

Docker/Swarm
CoreOS/RKT
Kubernetes
Canonical
OCI
Mesos
CloudFoundry Garden

Serverless Container Architecture with Funktion
CI/CD Automation with wercker and shippable
Alternatively, use Jenkins for development in combination with RunDeck for operations.

2 February 2017

Text-Driven Forecasting

Text-Driven Forecasting is about building systems that predict future outcomes by analyzing a collection of natural language documents. Such systems typically predict a numeric quantity tied to a certain event, taking various textual sources and feeds as input (e.g. news, Twitter, Facebook, polling data, opinion blogs, financial reports, Amazon reviews, economic data), and gain further information from sentiment and subjectivity analysis. Machine learning algorithms that can be applied to this domain include regression, decision trees, and deep learning, among others.

Examples:
Predicting movie reviews using social media
Predicting opinion polls using social media
Predicting stock volatility using financial data
Predicting government elections and referendums
Predicting product sales using social media
Predicting property prices in the future
Predicting risk of a potential course of action or decision
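The idea of predicting a numeric quantity from text can be sketched in a few lines. The example below is only a minimal illustration, assuming bag-of-words counts as features and ridge-regularized linear regression trained by gradient descent; the function names (tokenize, featurize, fit, predict), the hyperparameters, and the toy review snippets with their scores are all made up for illustration and do not come from any real dataset or library.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def featurize(text, vocab):
    """Bias term followed by the count of each vocabulary word."""
    counts = Counter(tokenize(text))
    return [1.0] + [float(counts[word]) for word in vocab]


def fit(docs, targets, epochs=2000, lr=0.05, l2=0.01):
    """Fit linear regression weights with batch gradient descent.

    epochs, lr and l2 are illustrative defaults, not tuned values.
    """
    vocab = sorted({word for doc in docs for word in tokenize(doc)})
    rows = [featurize(doc, vocab) for doc in docs]
    weights = [0.0] * (len(vocab) + 1)
    n = len(rows)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for row, target in zip(rows, targets):
            error = sum(w * x for w, x in zip(weights, row)) - target
            for j, x in enumerate(row):
                grad[j] += error * x / n
        # Gradient step with an L2 penalty on every weight.
        weights = [w - lr * (g + l2 * w) for w, g in zip(weights, grad)]
    return vocab, weights


def predict(text, vocab, weights):
    """Predict the numeric quantity for a new document."""
    return sum(w * x for w, x in zip(weights, featurize(text, vocab)))


# Toy, hand-written training data: review text and a made-up score.
docs = [
    "great film loved it",
    "terrible plot boring film",
    "loved the cast great fun",
    "boring and terrible",
]
scores = [8.0, 3.0, 9.0, 2.0]

vocab, weights = fit(docs, scores)
print(predict("great fun loved it", vocab, weights))
print(predict("boring terrible plot", vocab, weights))
```

A real system would replace the raw counts with richer features (sentiment scores, subjectivity, embeddings) and the toy data with a time-stamped corpus, so that the model is trained only on text that predates the quantity being forecast.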

smith whitepaper

Related Courses & Resources:
Priberam Labs
Social Media Analysis & Computational Social Science
Natural Language Processing & Social Interaction
Computational Social Science
Social & Information Network Analysis
Text as Data
NLP for Social Science
Computational Social Science
Computational Linguistics / Computational Social Science
Predicting Economic Indicators from Web Text Using Sentiment Composition
Making Predictions with Textual Contents

Converting Natural Language to Queries

Distributed queries expressed in natural language can be versatile and useful for Big Data analytics. Linked Data held in a data lake provides a semantic basis for natural language questions that are then translated into queries, most commonly SPARQL, although the same approach can be extended to other query languages. Natural Language Generation is another aspect of these conversion and transformation steps. Such approaches are often replicated in search engines and in the Semantic Web, where tokenized words are expressed as subject-predicate-object triples linked to relative URI references that map to an ontology schema, for example from a knowledge base such as DBpedia. One implementation of this approach is Quepy, which uses transformations and semantic relations.
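The translation step can be sketched with a couple of hand-written templates. This is only an illustrative sketch and not how Quepy works internally: the TEMPLATES table and the question_to_sparql function are hypothetical, the question patterns are deliberately naive, and the use of rdfs:label together with the DBpedia ontology properties dbo:birthPlace and dbo:author is an assumption about how the target data is modelled.

```python
import re

# Namespace prefixes used in the generated queries.
PREFIXES = (
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
    "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
)

# Hypothetical question templates: a regex that captures the entity
# name, paired with the predicate assumed to hold the answer.
TEMPLATES = [
    (re.compile(r"^where was (?P<name>.+?) born\??$", re.I), "dbo:birthPlace"),
    (re.compile(r"^who wrote (?P<name>.+?)\??$", re.I), "dbo:author"),
]


def question_to_sparql(question):
    """Return a SPARQL query string for a recognised question, else None."""
    text = question.strip()
    for pattern, predicate in TEMPLATES:
        match = pattern.match(text)
        if match:
            # Escape backslashes and quotes so the label stays a valid literal.
            label = match.group("name").replace("\\", "\\\\").replace('"', '\\"')
            return (
                PREFIXES
                + "SELECT ?answer WHERE {\n"
                + f'  ?subject rdfs:label "{label}"@en .\n'
                + f"  ?subject {predicate} ?answer .\n"
                + "}"
            )
    return None


print(question_to_sparql("Where was Alan Turing born?"))
```

Each generated query follows the subject-predicate-object shape: the first triple pattern resolves the label to a subject URI, and the second asks for the object of the chosen predicate. Questions that match no template return None, which is where a fuller system would fall back to parsing and semantic relation extraction.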

Data Science Competitions

kaggle competitions
crowdanalytix competitions
drivendata competitions
innocentive competitions
tunedit competitions
texata championships
topcoder competitions
data science challenge
EvalAI Challenge

best kept secret about data science competitions
data science bowl