27 February 2017

Model Evaluation Techniques

data mining map

Things to watch out for in cross-validation:
  • cross-validation assumes the training data is a representative sample of the population; if new data is not well covered by the training data, the error estimates will be optimistic, so minimize sampling bias in the training data
  • when working with temporal datasets, structure the cross-validation so that all training data is collected before the test data
  • a larger number of folds k generally gives better error estimates but makes the program take longer to run; use 10 folds or more where feasible, and leave-one-out cross-validation for models that train and predict quickly
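
The splitting strategies above can be sketched in plain Python. This is a minimal illustration, not a library API: the function names k_fold_indices and forward_chaining_indices are made up for this example, and the folds are contiguous and unshuffled (for non-temporal data you would normally shuffle first).

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds.

    Setting k == n_samples gives leave-one-out cross-validation.
    """
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def forward_chaining_indices(n_samples, k):
    """Yield k splits for time-ordered data.

    Assumes samples are sorted by time and n_samples >= k + 1, so every
    training index comes before every test index.
    """
    fold = n_samples // (k + 1)
    for i in range(1, k + 1):
        train_idx = list(range(0, i * fold))
        # The last split absorbs any leftover samples.
        end = (i + 1) * fold if i < k else n_samples
        test_idx = list(range(i * fold, end))
        yield train_idx, test_idx


if __name__ == "__main__":
    for train_idx, test_idx in forward_chaining_indices(10, 4):
        print("train:", train_idx, "test:", test_idx)
```

The forward-chaining variant grows the training window with each split, so later splits are trained on more history than earlier ones.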

ROC curves apply to binary classification, where predictions are divided into negative and positive classes. The area under the ROC curve (AUC) is a further evaluation metric. For multiclass problems, one uses the one-versus-all trick. In most multiclass cases, one uses both the ROC curve and the confusion matrix. The confusion matrix shows class-wise accuracy as a table of actual versus predicted classes, which is two-by-two in the binary case.

Regression performance is measured using the root-mean-squared error (RMSE), the mean squared error (MSE), or R-squared. Other regression evaluation metrics include AIC and BIC.

A brute-force grid search is a standard way to optimize the choice of tuning parameters. It ties cross-validation and model evaluation together: each parameter combination is scored by cross-validation using the chosen evaluation metric.
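
As a rough sketch of the metrics mentioned above, here they are in plain Python. The function names are illustrative only and not taken from any library. The AUC is computed through its rank interpretation (the probability that a randomly chosen positive scores higher than a randomly chosen negative), which is simple but quadratic in the number of samples.

```python
import math


def confusion_matrix(y_true, y_pred, labels):
    """Rows are actual classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for actual, predicted in zip(y_true, y_pred):
        matrix[index[actual]][index[predicted]] += 1
    return matrix


def roc_auc(y_true, scores):
    """AUC for binary labels (1 = positive, 0 = negative); ties count half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def rmse(y_true, y_pred):
    """Root-mean-squared error; squaring the result gives the MSE."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """1 minus the ratio of residual to total sum of squares."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
    print(confusion_matrix(["a", "b", "a", "c"],
                           ["a", "a", "a", "c"],
                           ["a", "b", "c"]))
```

In the confusion matrix, the diagonal holds the correctly classified counts per class, so class-wise accuracy is each diagonal entry divided by its row sum.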

Rated Funds

Rated funds

Elasticsearch Graph

Elasticsearch Graph
Elasticsearch expands relationship modeling with Graph

24 February 2017

Supervised Learning Use Cases


Example Use Cases                 Type of ML
Spam Filtering                    Classification
Sentiment Analysis                Classification
Fraud Detection                   Classification
Customer Ad Targeting             Classification
Churn Prediction                  Classification
Support Case Flagging             Classification
Content Personalization           Classification
Detecting Manufacturing Defects   Classification
Customer Segmentation             Classification
Event Discovery                   Classification
Genomics                          Classification
Drug Efficacy                     Classification
Stock Market Prediction           Regression
Demand Forecasting                Regression
Price Estimation                  Regression
Ad Bid Optimization               Regression
Risk Management                   Regression
Asset Management                  Regression
Weather Forecasting               Regression
Sports Prediction                 Regression
Product Recommendation            Recommendation
Job Recruiting                    Recommendation
Netflix Prize                     Recommendation
Online Dating                     Recommendation
Content Recommendations           Recommendation
Incomplete Patient Records        Imputation
Missing Customer Data             Imputation
Census Data                       Imputation

MERN Stack

MERN

22 February 2017

Outstanding Ontologies

Ontologies come in several types, including knowledge representation ontologies, top-level ontologies, linguistic ontologies, and domain ontologies. A few examples of each type are listed below.

Knowledge Representation Ontologies:
Frame Ontology
OKBC

Top-Level Ontologies: 
Cyc
SOWA
Standard Upper Ontology

Linguistic Ontologies: 
WordNet
Generalized Upper Model
SENSUS
EuroWordNet
Mikrokosmos

Ecommerce Ontologies (Domain Ontology): 
United Nations Standard Products and Services Code (UNSPSC)
North American Industry Classification System
Standard Classification of Transported Goods
E-Cl@ss
RosettaNet

Medical Ontologies (Domain Ontology):
GALEN
UMLS
ON9

Engineering Ontologies (Domain Ontology):
EngMath
PhysSys

Enterprise Ontologies (Domain Ontology):
Enterprise Ontology
TOVE

Chemistry Ontologies (Domain Ontology):
Chemicals
Ions
Environmental Pollutants

Knowledge Management Ontologies (Domain Ontology):
KA Ontology - Project, Organization, Person, Publication, Event, Research-Topic, Research-Product

Nature.com Subjects Ontologies

Infrastructure as Code & Automation

Terraform / Nomad / Vault / Consul (HashiCorp)
CloudFormation (Troposphere)
Boto
Chef
Puppet
Heat
SaltStack
Ansible
Fabric
Pallet
Rundeck