KL7010 Principles of Data Science.

Last Updated: 05-Sep-23

Assignment - The selection, application and evaluation of data science methods, tools and techniques.

Task Overview

In this assignment, you will be required to select, apply and evaluate a choice of data science methods, tools and techniques on a sizeable dataset of your choosing. You will justify the choice of dataset in terms of the problem being investigated, explore the dataset and describe and justify the methods that will be used in the investigation. After applying these methods, you will then discuss the findings that have been produced, and critically reflect upon the process and the outcomes. This documentation will take the form of a 4000-word report.

Task Scenario

You have been provided with access to three datasets; all are available on Blackboard. The data covers the following scenarios:

 Predicting if credit card clients will default on monies owed.
 Predicting the cancellation of hotel bookings.
 Predicting the occurrence of heart disease among patients.

More information on each scenario can be found in the appendix to this document.

You may choose any one of the above scenarios as your project. Your task is to produce a model that can predict the response variable within the dataset. Where possible, you will also provide insight into which features contribute most to the predictive capacity of your model.
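As an illustration only, one possible way to obtain feature importance is permutation importance, which measures how much a held-out score drops when a single feature is shuffled. The sketch below assumes Python with scikit-learn, uses synthetic data as a stand-in for the real dataset, and uses a random forest purely as an example model; the feature names are hypothetical, and none of these choices is prescribed by the assignment.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in for one of the datasets (assumption: binary response).
# shuffle=False keeps the 3 informative features in columns 0-2.
X, y = make_classification(
    n_samples=600,
    n_features=6,
    n_informative=3,
    n_redundant=0,
    shuffle=False,
    random_state=0,
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]  # hypothetical names

# Hold out a test set so importance is measured on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Example model only; any fitted scikit-learn estimator could be used here.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Permutation importance: mean drop in held-out accuracy when one
# feature's values are shuffled, averaged over n_repeats shuffles.
result = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=0
)

# Rank features from most to least important.
ranking = sorted(
    zip(feature_names, result.importances_mean),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Permutation importance is model-agnostic, so the same call would work with a different classifier; whether it is the most appropriate measure for your chosen method is something to justify in the report.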

All three datasets have been cleaned and are ready for use; however, you may still wish to conduct some data preparation and/or transformation so that the data is in an appropriate condition and format for the analysis methods that you wish to use. You may use any methods you wish to tackle the chosen problem, but you must justify your approach.

The key components of this task that you must complete are:

Explore the data so that you understand the structure, characteristics and limitations of the dataset.

Identify the forms of analysis that will be able to produce a successful outcome for the scenario. Ensure that the chosen method(s) are suitable for use on the dataset that you have chosen to use, and justify the use of your chosen approach.

Process the data into a condition suitable for model building, including selecting the features to be used within the model.

Build a model that predicts the response variable in the dataset.
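The components listed above could be sketched as follows. This is a minimal illustration, assuming Python with pandas and scikit-learn; the toy data, the column names (`lead_time`, `adults`, `deposit_type`, `is_canceled`) and the choice of logistic regression are all hypothetical and are not taken from the provided datasets or required by the assignment.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data standing in for a real dataset (invented columns).
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "lead_time": rng.integers(0, 300, size=n),
    "adults": rng.integers(1, 4, size=n),
    "deposit_type": rng.choice(["none", "refundable", "non_refund"], size=n),
})
# Make the response depend loosely on lead_time so there is some signal.
prob = 1 / (1 + np.exp(-(df["lead_time"].to_numpy() - 150) / 50))
df["is_canceled"] = (rng.random(n) < prob).astype(int)

# 1. Explore: structure, summary statistics, missing values, class balance.
print(df.dtypes)
print(df.describe())
print(df.isna().sum())
class_balance = df["is_canceled"].value_counts(normalize=True)
print(class_balance)

# 2. Process: select features, scale numeric ones, encode categorical ones.
numeric = ["lead_time", "adults"]
categorical = ["deposit_type"]
preprocess = ColumnTransformer([
    ("num", StandardScaler(), numeric),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])

# 3. Build: fit a model on a training split and check it on held-out data.
X = df[numeric + categorical]
y = df["is_canceled"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
model = Pipeline([
    ("prep", preprocess),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(X_train, y_train)

auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"Held-out ROC AUC: {auc:.3f}")
```

Wrapping the preprocessing and the classifier in a single pipeline means the scaling and encoding are fitted on the training split only, which avoids leaking information from the test data into the model.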