IS 733 - Data Mining
1(a) Briefly describe the boosting algorithm. State why it may improve classification accuracy.
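Answer 1a) Boosting is a machine learning ensemble meta-algorithm used primarily to reduce bias,
and also variance, in supervised learning. It is a family of machine learning algorithms that
convert weak learners into strong ones. A weak learner is a classifier that is only slightly
correlated with the true classification (it does a little better than random guessing); a strong
learner is a classifier that is arbitrarily well-correlated with the true classification.
Boosting trains a sequence of classifiers. After each round, the weights of the training
instances are adjusted so that the instances the current classifier got wrong count for more in
the next round. The final prediction combines the classifiers by weighted voting.
Boosting may improve classification accuracy because each new classifier concentrates on the
instances the earlier ones misclassified, so the combined vote corrects errors that any single
weak classifier makes. Separately, pruning can be applied to decision tree induction to help
improve the accuracy of the resulting decision trees.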
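The reweighting and voting steps can be sketched with AdaBoost, one common boosting algorithm. This is a minimal illustration, not a reference implementation: the one-dimensional decision stump used as the weak learner, the function names, and the toy data are all assumptions made for the example.

```python
import math

def stump_predict(x, threshold, polarity):
    # A decision stump: predict `polarity` when x >= threshold, else -polarity.
    return polarity if x >= threshold else -polarity

def train_stump(xs, ys, weights):
    """Return (weighted error, threshold, polarity) of the best stump."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(w for w, x, y in zip(weights, xs, ys)
                      if stump_predict(x, threshold, polarity) != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best

def adaboost(xs, ys, rounds=3):
    """Train `rounds` stumps; labels in ys must be +1 or -1."""
    n = len(xs)
    weights = [1.0 / n] * n          # start with uniform instance weights
    ensemble = []
    for _ in range(rounds):
        err, threshold, polarity = train_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)      # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)    # this stump's vote weight
        ensemble.append((alpha, threshold, polarity))
        # Misclassified instances (y * prediction = -1) gain weight,
        # correctly classified ones lose weight.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    # Weighted vote of all stumps.
    score = sum(alpha * stump_predict(x, t, p) for alpha, t, p in ensemble)
    return 1 if score >= 0 else -1

def accuracy(ensemble, xs, ys):
    return sum(predict(ensemble, x) == y for x, y in zip(xs, ys)) / len(xs)

# Toy data (assumed for illustration): no single stump separates it.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
```

On this toy set, a single stump misclassifies three of the ten points, while the weighted vote of three stumps classifies all ten correctly. Compare `accuracy(adaboost(xs, ys, rounds=1), xs, ys)` with `accuracy(adaboost(xs, ys, rounds=3), xs, ys)`.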
1(b) What is the bias-variance trade-off for machine learning methods? Explain.
Answer 1b) In statistics and machine learning, the bias-variance trade-off is the property of a
set of predictive models whereby models with a lower bias in parameter estimation have a
higher variance of the parameter estimates across samples, and vice versa.
The bias-variance dilemma, or bias-variance problem, is the conflict in trying to minimize both
of these sources of error at the same time. Both prevent supervised learning algorithms from
generalizing beyond their training set:
The bias is the error from erroneous assumptions in the learning algorithm. High bias can cause
an algorithm to miss the relevant relations between features and target outputs (underfitting).
The variance is the error from sensitivity to small fluctuations in the training set. High
variance can cause an algorithm to model the random noise in the training data (overfitting).
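For squared-error loss, the two error sources can be written as an explicit decomposition. The setting below is an assumption added for illustration and does not come from the answer itself: the target is y = f(x) + ε with zero-mean noise of variance σ², and the fitted model is treated as random because it depends on the training set D. For classification with 0-1 loss the decomposition takes a different form, but the intuition carries over.

```latex
% Assumed setting: y = f(x) + \varepsilon, E[\varepsilon] = 0, Var(\varepsilon) = \sigma^2,
% and \hat{f}_D is the model fitted on a random training set D.
\mathbb{E}_{D,\varepsilon}\big[(y - \hat{f}_D(x))^2\big]
  = \underbrace{\big(\mathbb{E}_D[\hat{f}_D(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}_D\big[(\hat{f}_D(x) - \mathbb{E}_D[\hat{f}_D(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

A more flexible model typically shrinks the first term and grows the second, which is why the two cannot usually be minimized together.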
1(c) Briefly describe the bagging procedure. Discuss why it may improve the accuracy of decision
tree classifiers, in terms of the bias-variance trade-off.
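Answer 1c) Bootstrap aggregating, also called bagging (from bootstrap aggregating), is a
machine learning ensemble meta-algorithm designed to improve the stability and accuracy of
machine learning algorithms. The procedure draws several bootstrap samples from the training
set (sampling with replacement), trains one classifier on each sample, and combines their
predictions by majority vote. It reduces variance and helps to avoid overfitting. A fully grown
decision tree has low bias but high variance, since small changes in the training set can produce
a very different tree. Averaging many such trees lowers the variance while leaving the bias
largely unchanged. Although bagging is usually applied to decision tree methods, it can be used
with any type of method. Bagging is a special case of the model averaging approach.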
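Bootstrap aggregating can be sketched in a few lines. This is a minimal illustration under stated assumptions: a one-dimensional decision stump stands in for a full decision tree to keep the example short, and the function names, the default of 25 models, and the seed are all hypothetical choices.

```python
import random
from collections import Counter

def fit_stump(xs, ys):
    """Fit the 1-D decision stump with the fewest training errors.

    The stump predicts `polarity` when x >= threshold, else -polarity.
    """
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            errors = sum(1 for x, y in zip(xs, ys)
                         if (polarity if x >= threshold else -polarity) != y)
            if best is None or errors < best[0]:
                best = (errors, threshold, polarity)
    _, threshold, polarity = best
    return threshold, polarity

def bootstrap_sample(xs, ys, rng):
    """Draw len(xs) training pairs uniformly at random, with replacement."""
    idx = [rng.randrange(len(xs)) for _ in xs]
    return [xs[i] for i in idx], [ys[i] for i in idx]

def bagging_fit(xs, ys, n_models=25, seed=0):
    """Train one stump per bootstrap sample."""
    rng = random.Random(seed)
    return [fit_stump(*bootstrap_sample(xs, ys, rng)) for _ in range(n_models)]

def bagging_predict(models, x):
    """Majority vote over all stumps (use an odd n_models to avoid ties)."""
    votes = Counter(p if x >= t else -p for t, p in models)
    return votes.most_common(1)[0][0]
```

Each stump sees a slightly different training set, so its threshold varies from sample to sample. The majority vote averages out that variation, which is the variance reduction the answer refers to. With real decision trees the effect is stronger, because an unpruned tree changes far more between samples than a stump does.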