
Machine Learning - supervised machine learning exam questions and answers

What is supervised learning?
A type of machine learning used when we want to predict a certain output from a given input and have examples of input/output pairs. The training set we feed into the algorithm includes the desired solutions (labels).
Labelled data: each instance comes with the expected output.

What are the two types of supervised machine learning?
(1) Classification
- binary classification: 2 classes
- multiclass classification: more than 2 classes
(2) Regression

Examples of supervised machine learning?
(1) KNN (k-nearest neighbors)
(2) Logistic / linear regression
(3) Support Vector Machines (SVMs)
(4) Decision Trees and Random Forests
(5) Neural Networks

What is unsupervised learning?
The training set is unlabeled. The algorithm tries to learn without a teacher.

Examples of unsupervised learning
- Clustering: K-Means, DBSCAN, Hierarchical Cluster Analysis (HCA)
- Anomaly detection and novelty detection: One-class SVM, Isolation Forest
- Visualization and dimensionality reduction: Principal Component Analysis (PCA), Kernel PCA, Locally Linear Embedding (LLE), t-Distributed Stochastic Neighbor Embedding (t-SNE)
- Association rule learning: Apriori, Eclat

Dimensionality Reduction
Goal: simplify the data without losing too much information.
How?: merge several correlated features into one. For example, a car's mileage may be strongly correlated with its age, so the two can be merged into a single feature representing the car's wear and tear.

Anomaly Detection vs. Novelty Detection
Novelty detection: aims to detect new instances that look different from all instances in the training set. This requires a very "clean" training set, free of any instance you would like the algorithm to flag.
Anomaly detection: the system is shown mostly normal instances during training, so it learns to recognize them; when it sees a new instance, it can tell whether that instance looks normal or is likely an anomaly. The training set may itself contain some rare, unusual instances.
For example, if you have thousands of pictures of dogs, and 1% of these pictures represent Chihuahuas, then a novelty detection algorithm should not treat new pictures of Chihuahuas as novelties.
On the other hand, anomaly detection algorithms may consider these dogs so rare and so different from other dogs that they would likely classify them as anomalies (no offense to Chihuahuas).

Semisupervised learning
Deals with data that is partially labeled.

Feature scaling
A type of transformation.
Why? Machine learning algorithms generally do not perform well when the input numerical attributes have very different scales.
(a) Min-max scaling (normalization): values are rescaled to range from 0 to 1.
How? Subtract the min value, then divide by (max - min).
(b) Standardization: subtract the mean value, then divide by the standard deviation (S.D.).
Pros: less affected by outliers.
Cons: values are not bounded to a specific range, which may be a problem for some algorithms (e.g. neural networks).
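
The two scaling formulas above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function names `min_max_scale` and `standardize` and the sample `incomes` list are made up for this example, and the sketch assumes the population standard deviation (dividing by n) and returns zeros when all values are identical.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale values to the range [0, 1]: (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: avoid division by zero (a choice made for this sketch).
        return [0.0 for _ in values]
    return [(x - lo) / (hi - lo) for x in values]


def standardize(values):
    """Center on the mean, then divide by the standard deviation: (x - mean) / sd."""
    mu = mean(values)
    sd = pstdev(values)  # population standard deviation (assumption of this sketch)
    if sd == 0:
        return [0.0 for _ in values]
    return [(x - mu) / sd for x in values]


# Hypothetical feature with one outlier (1000.0).
incomes = [20.0, 30.0, 40.0, 50.0, 1000.0]

scaled = min_max_scale(incomes)
standardized = standardize(incomes)

print(scaled)        # the outlier maps to 1.0; the other values crowd near 0
print(standardized)  # zero mean, unit variance; not bounded to [0, 1]
```

With min-max scaling, the single outlier pins the top of the range and pushes every other value close to 0. The standardized values have no fixed bounds, which is the "Cons" noted above. In practice a library scaler would normally be fitted on the training set only and then reused on new data.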
