Advanced Analytics Techniques Assignment

Published by: Admin

Last Updated: 16-Sep-24

Assignment Task

Problem 1. Reading the dataset

Q1. Read the first 10,000 rows from the credit card dataset provided in the assignment_data folder

Name your DataFrame df
Rename the column `PAY_0` to `PAY_1` and the column `default payment next month` to `payment_default`
Delete the `ID` column
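
The three steps in Q1 can be sketched as follows. This is only an illustration under stated assumptions: the brief does not name the file or its format, so the sketch assumes a CSV source and uses a tiny in-memory stand-in; the helper name `load_credit_data` is hypothetical.

```python
import io

import pandas as pd


def load_credit_data(source, nrows=10_000):
    """Read at most `nrows` rows, rename the two columns and drop ID.

    `source` is assumed to be a CSV path or file-like object; the real
    file in assignment_data may need a different reader or options.
    """
    df = pd.read_csv(source, nrows=nrows)
    df = df.rename(
        columns={
            "PAY_0": "PAY_1",
            "default payment next month": "payment_default",
        }
    )
    return df.drop(columns="ID")


# Synthetic stand-in for the real file (made-up values, illustration only).
sample_csv = io.StringIO(
    "ID,LIMIT_BAL,PAY_0,default payment next month\n"
    "1,20000,2,1\n"
    "2,120000,-1,0\n"
)
df = load_credit_data(sample_csv)
print(df.columns.tolist())
```

With the real data, the same function would be called with the path to the file instead of the in-memory sample.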

Q2. List which features are numeric, ordinal, and nominal variables, and how many features of each kind there are in the dataset. To answer this question:

Find the definitions of the variables provided elsewhere in the course material (hint: make sure you do weekly tutorials)
Find the definitions of numeric, ordinal and nominal variables
Carefully consider the values of the data itself as well as the output of df.info().
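
As a starting point for Q2, the stored dtypes and the number of distinct values per column can be inspected side by side. The sketch below uses a small made-up DataFrame; it does not classify the variables, which still has to be done against the definitions in the course material.

```python
import pandas as pd

# Made-up rows, for illustration only; the real df comes from Q1.
df = pd.DataFrame(
    {
        "LIMIT_BAL": [20000.0, 120000.0, 90000.0],
        "EDUCATION": [1, 2, 2],
        "SEX": [1, 2, 2],
    }
)

# dtypes and non-null counts as stored by pandas
df.info()

# Number of distinct values per column: a column stored as an integer
# with only a handful of distinct values is often a coded category.
unique_counts = df.nunique()
print(unique_counts)
```

A column being stored as `int64` does not make it numeric in the statistical sense, which is why the values and definitions need to be read together.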

Q3. Missing Values.

Print out the number of missing values for each variable in the dataset and comment on your findings.
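
One common way to count missing values per variable is `isna().sum()`. A minimal sketch on made-up data:

```python
import numpy as np
import pandas as pd

# Made-up rows with a few gaps, for illustration only.
df = pd.DataFrame(
    {
        "AGE": [24, np.nan, 35, 41],
        "EDUCATION": [2, 2, np.nan, np.nan],
        "LIMIT_BAL": [20000, 120000, 90000, 50000],
    }
)

# isna() marks missing cells as True; summing counts them per column.
missing_counts = df.isna().sum()
print(missing_counts)
```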

Problem 2. Cleaning data and dealing with categorical features

Q1.

Use an appropriate pandas function to impute missing values using one of the following two strategies: mean and mode.
– Take into consideration the type of each variable and the best practices we discussed in class/lecture notes
Explain what data imputation is, how you have done it here, and what decisions you had to make.
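
A minimal sketch of mean and mode imputation with `fillna()` follows. The split of columns into `numeric_cols` and `categorical_cols` is a hypothetical example, not the answer to Problem 1 Q2; the real lists depend on how each variable was classified.

```python
import numpy as np
import pandas as pd

# Made-up rows, for illustration only.
df = pd.DataFrame(
    {
        "AGE": [20, np.nan, 40, 30],
        "EDUCATION": [2, 2, np.nan, 1],
    }
)

# Hypothetical classification of the columns (an assumption).
numeric_cols = ["AGE"]
categorical_cols = ["EDUCATION"]

# Numeric variables: fill gaps with the column mean.
df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].mean())

# Ordinal/nominal variables: fill gaps with the most frequent value.
# mode() can return several rows on ties; iloc[0] takes the first.
df[categorical_cols] = df[categorical_cols].fillna(
    df[categorical_cols].mode().iloc[0]
)
print(df)
```

The mean is avoided for coded categories because it can produce values, such as 1.67, that do not correspond to any category.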

Q2.

Print value_counts() of the `SEX` column and add a dummy variable named `SEX_FEMALE` to df using get_dummies()
Carefully explain what the values of the new variable `SEX_FEMALE` mean
Make sure the variable `SEX` is deleted from df
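
The steps in Q2 might look like the sketch below. It assumes the coding 1 = male and 2 = female, which should be checked against the variable definitions in the course material; the data are made up.

```python
import pandas as pd

# Made-up rows; the coding 1 = male, 2 = female is an assumption.
df = pd.DataFrame({"SEX": [1, 2, 2, 1, 2]})
print(df["SEX"].value_counts())

# get_dummies creates one indicator column per distinct value,
# here SEX_1 and SEX_2.
dummies = pd.get_dummies(df["SEX"], prefix="SEX")

# Keep only the indicator for the assumed "female" code.
df["SEX_FEMALE"] = dummies["SEX_2"].astype(int)

# Remove the original column.
df = df.drop(columns="SEX")
print(df)
```

Under that assumption, `SEX_FEMALE` is 1 for rows coded as female and 0 otherwise, so a single column carries all the information that `SEX` held.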

Q3. Print value_counts() of the `MARRIAGE` column and carefully comment on what you notice in relation to the definition of this variable.

Q4.

Apply get_dummies() to the `MARRIAGE` feature and add dummy variables `MARRIAGE_MARRIED`, `MARRIAGE_SINGLE`, `MARRIAGE_OTHER` to df.
Carefully consider how to allocate all the values of `MARRIAGE` across these 3 newly created features
Explain what decisions you had to make
Make sure that `MARRIAGE` is deleted from df
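
One possible allocation is sketched below. It assumes the coding 1 = married and 2 = single, and it sends every other value (including any undocumented code such as 0) to `MARRIAGE_OTHER`. That allocation is a modelling decision made for this example, not the only defensible one.

```python
import pandas as pd

# Made-up rows, for illustration only.
df = pd.DataFrame({"AGE": [24, 26, 34, 37, 57], "MARRIAGE": [1, 2, 3, 0, 2]})
print(df["MARRIAGE"].value_counts())

# Assumed coding: 1 = married, 2 = single; anything else becomes OTHER.
labels = df["MARRIAGE"].map({1: "MARRIED", 2: "SINGLE"}).fillna("OTHER")

dummies = pd.get_dummies(labels, prefix="MARRIAGE").astype(int)

# Fix the column order and guarantee all three columns exist.
dummies = dummies.reindex(
    columns=["MARRIAGE_MARRIED", "MARRIAGE_SINGLE", "MARRIAGE_OTHER"],
    fill_value=0,
)

# Replace the original column with the three indicators.
df = pd.concat([df.drop(columns="MARRIAGE"), dummies], axis=1)
print(df)
```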

Q5. In the column `EDUCATION`, convert the values {0, 5, 6} to the value 4.
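
The recoding in Q5 can be done with `replace()` and a mapping, as in this small sketch on made-up data:

```python
import pandas as pd

# Made-up rows covering every code mentioned in the question.
df = pd.DataFrame({"EDUCATION": [0, 1, 2, 3, 4, 5, 6]})

# Map 0, 5 and 6 to 4; all other values are left unchanged.
df["EDUCATION"] = df["EDUCATION"].replace({0: 4, 5: 4, 6: 4})
print(df["EDUCATION"].tolist())
```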

Problem 3. Preparing X and y arrays

Q1.

Create a numpy array y from the first 8,000 observations of the `payment_default` column from df
Create a numpy array X from the first 8,000 observations of all the remaining variables in df
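
A sketch of building the two arrays with `iloc` and `to_numpy()` follows. The DataFrame here is randomly generated with only two feature columns, purely so the example runs on its own.

```python
import numpy as np
import pandas as pd

# Random stand-in for the cleaned df (10,000 rows, made-up values).
rng = np.random.default_rng(0)
df = pd.DataFrame(
    {
        "LIMIT_BAL": rng.integers(10_000, 500_000, size=10_000),
        "AGE": rng.integers(21, 70, size=10_000),
        "payment_default": rng.integers(0, 2, size=10_000),
    }
)

n_obs = 8_000

# Target: first 8,000 values of payment_default.
y = df["payment_default"].iloc[:n_obs].to_numpy()

# Features: first 8,000 rows of every other column.
X = df.drop(columns="payment_default").iloc[:n_obs].to_numpy()

print(X.shape, y.shape)
```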

Q2.

Use an appropriate sklearn library we used in class to create y_train, y_test, X_train and X_test by splitting the data into 75% train and 25% test datasets
– Set random_state to 4 and stratify the subsamples so that train and test datasets have roughly equal proportions of the target's class labels
Standardise the data to mean zero and variance one using an appropriate sklearn library
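
The split and standardisation might be sketched as below, assuming `train_test_split` and `StandardScaler` are the tools used in class. The data come from `make_classification` as a stand-in for the real X and y, and the names `X_train_std` and `X_test_std` are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced stand-in for the real X and y.
X, y = make_classification(
    n_samples=800, n_features=10, weights=[0.78, 0.22], random_state=0
)

# 75% train / 25% test, stratified on the class labels.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=4, stratify=y
)

# Fit the scaler on the training data only, then apply it to both sets,
# so no information from the test set leaks into the scaling.
scaler = StandardScaler()
X_train_std = scaler.fit_transform(X_train)
X_test_std = scaler.transform(X_test)

print(X_train_std.shape, X_test_std.shape)
```

Fitting the scaler on the training set alone means the test features will have a mean and variance close to, but not exactly, zero and one.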

Problem 4. Support Vector Classifier and Accuracies

Q1.

Train a Support Vector Classifier on the standardised data
– Use the rbf kernel and set random_state to 3 (don't change any other parameters)
Compute and print training and test dataset accuracies
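
A minimal sketch of Q1 using `sklearn.svm.SVC` is shown below. It runs on synthetic data, so the accuracies it prints say nothing about the credit card dataset.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the standardised credit card features.
X, y = make_classification(
    n_samples=800, n_features=10, weights=[0.78, 0.22], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=4, stratify=y
)
scaler = StandardScaler()
X_train_std = scaler.fit_transform(X_train)
X_test_std = scaler.transform(X_test)

# RBF-kernel SVC; every other parameter is left at its default.
svc = SVC(kernel="rbf", random_state=3)
svc.fit(X_train_std, y_train)

# score() returns the mean accuracy on the given data.
train_acc = svc.score(X_train_std, y_train)
test_acc = svc.score(X_test_std, y_test)
print(f"train accuracy: {train_acc:.3f}")
print(f"test accuracy:  {test_acc:.3f}")
```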

Q2.

Extract 2 linear principal components from the standardised features using an appropriate sklearn library
Train a Support Vector Classifier on the 2 principal components computed above
– Use the rbf kernel and set random_state to 3 (don't change any other parameters)
Compute and print training and test dataset accuracies
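
Q2 might be sketched with `sklearn.decomposition.PCA`, assuming that is the library meant by "linear principal components". As before, the data are synthetic and the variable names are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the standardised credit card features.
X, y = make_classification(
    n_samples=800, n_features=10, weights=[0.78, 0.22], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=4, stratify=y
)
scaler = StandardScaler()
X_train_std = scaler.fit_transform(X_train)
X_test_std = scaler.transform(X_test)

# Learn the two components on the training set, then project both sets.
pca = PCA(n_components=2)
X_train_pca = pca.fit_transform(X_train_std)
X_test_pca = pca.transform(X_test_std)

# Same classifier settings as before, now on 2 features.
svc_pca = SVC(kernel="rbf", random_state=3)
svc_pca.fit(X_train_pca, y_train)

train_acc_pca = svc_pca.score(X_train_pca, y_train)
test_acc_pca = svc_pca.score(X_test_pca, y_test)
print(f"train accuracy (2 PCs): {train_acc_pca:.3f}")
print(f"test accuracy (2 PCs):  {test_acc_pca:.3f}")
```

`pca.explained_variance_ratio_` shows how much of the feature variance the two components retain, which can help when interpreting any change in accuracy.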

Q3.

Comment on the suitability of the two classifiers to predict credit card defaults by commenting on (and comparing) the computed accuracies from the last two questions.
Make comparisons both within and across the two questions

Copyright © Britain writers 2022. All rights reserved.