University of Boston Experimentation and Evaluation Python Coding Task
Description
The purpose of Assignment 2 is to familiarize you with experimentation and evaluation, which are essential for the final project. So far, you have learned three machine learning algorithms. Simply comparing performance metrics between models is not enough for evaluation. First, you have to select an appropriate evaluation metric, because different metrics make different assumptions about what end users care about. Then, applying a significance test lets you determine whether a difference in performance reflects a true pattern or just random chance. In this assignment, you will compare the best model you developed for A1 with the baseline logistic regression model by selecting an appropriate evaluation metric and applying the Bootstrap-Shift test. The baseline model is the logistic regression model trained on all provided features without any modification.
Dataset
You should use the same dataset (a1_data.csv) provided for Assignment 1. The goal is to compare the baseline classification model with the best classification model you developed for Assignment 1.
What to Do
Select an appropriate evaluation metric. So far, you have learned accuracy, precision, recall, and F-measure; select one of them. Then, justify your choice based on the target label distribution, the assumptions of the metric, and what end users care about. If you choose either precision or recall, you should state the target class (e.g., precision for patients with high medical expenditure, recall for patients with low medical expenditure).
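To see why the label distribution and the target class matter when choosing a metric, the following is a minimal sketch that computes all four metrics from raw predictions. The function name `classification_metrics`, the `"low"`/`"high"` labels, and the toy predictions are illustrative assumptions and are not taken from a1_data.csv.

```python
def classification_metrics(y_true, y_pred, positive):
    """Return accuracy, precision, recall, and F1, treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when the target class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical imbalanced labels: 8 "low" and 2 "high" examples.
y_true = ["low"] * 8 + ["high"] * 2
# A degenerate classifier that always predicts the majority class.
y_pred = ["low"] * 10

metrics = classification_metrics(y_true, y_pred, positive="high")
print(metrics)
```

In this toy case the always-"low" classifier looks reasonable on accuracy but finds none of the "high" examples, which is the kind of observation a justification based on label distribution can draw on.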
Divide your dataset into 10 folds. Please refer to the example code for Week 7 (week7_evaluation.zip) for n-fold cross-validation. You should use the same set of folds for both models (i.e., the baseline model without any treatment and the best model with your own treatment). Again, the baseline model is the logistic regression model trained on all provided features without any modification.
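One way to guarantee that both models see the same folds is to build the fold assignment once, with a fixed seed, and reuse it. The sketch below is not the Week 7 example code; `make_folds`, `train_test_splits`, the seed, and the row count of 95 are hypothetical choices made for illustration.

```python
import random


def make_folds(n_rows, n_folds=10, seed=0):
    """Shuffle row indices once and deal them into n_folds folds of near-equal size."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # fixed seed makes the folds reproducible
    # Striding through the shuffled list gives fold sizes that differ by at most one.
    return [indices[i::n_folds] for i in range(n_folds)]


def train_test_splits(folds):
    """Yield (train_indices, test_indices), holding out each fold once."""
    for k, test_idx in enumerate(folds):
        train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train_idx, test_idx


# Hypothetical dataset size; in practice use the number of rows in a1_data.csv.
folds = make_folds(n_rows=95)
splits = list(train_test_splits(folds))

# Both the baseline and the improved model would loop over this same `splits` list,
# producing one score per fold for each model.
for fold_number, (train_idx, test_idx) in enumerate(splits, start=1):
    print(fold_number, len(train_idx), len(test_idx))
```

Evaluating both models on identical splits yields paired per-fold scores, which is what a paired significance test operates on.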
Then, apply the Bootstrap-Shift test to report whether the difference between the two models reflects a real trend or just random chance. Please refer to the example code for Week 7 (week7_evaluation.zip) for the Bootstrap-Shift test implementation. You have to report the following things:
- Null hypothesis
- P-value and its interpretation
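As a rough illustration of how such a test can produce a p-value, here is a minimal sketch of one common formulation of a paired bootstrap shift test. It is an assumption about the procedure, not a reproduction of the Week 7 code: the per-fold differences are shifted to have mean zero (so that the null hypothesis of no difference holds), then resampled with replacement, and the p-value is the fraction of resampled means at least as extreme as the observed mean difference. The function name, the number of resamples, and the per-fold scores are hypothetical.

```python
import random


def bootstrap_shift_test(scores_a, scores_b, n_resamples=10000, seed=0):
    """Two-sided paired bootstrap shift test on per-fold scores.

    Returns (observed mean difference of b minus a, estimated p-value).
    """
    diffs = [b - a for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    observed = sum(diffs) / n
    # Shift the differences so their mean is zero, i.e. enforce the null hypothesis.
    shifted = [d - observed for d in diffs]

    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_resamples):
        sample = rng.choices(shifted, k=n)  # resample with replacement
        if abs(sum(sample) / n) >= abs(observed):
            extreme += 1
    return observed, extreme / n_resamples


# Hypothetical per-fold scores for the two models on the same 10 folds.
baseline_scores = [0.70, 0.72, 0.68, 0.71, 0.69, 0.73, 0.70, 0.71, 0.69, 0.72]
improved_scores = [0.75, 0.78, 0.72, 0.76, 0.76, 0.78, 0.74, 0.77, 0.74, 0.77]

mean_diff, p_value = bootstrap_shift_test(baseline_scores, improved_scores)
print(f"mean difference = {mean_diff:.3f}, p-value = {p_value:.4f}")
```

Under this formulation the null hypothesis is that the two models have the same expected score, and a small p-value (for example, below a chosen threshold such as 0.05) suggests the observed difference is unlikely to arise from random chance alone.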