DALT7002 Data Science Foundations.

Published By: Admin
Last Updated: 08-Jul-23

DALT7002 Data Science Foundations - Oxford Brookes University

Learning Outcome 1. Demonstrate the ability to identify and integrate data of various types from traditional and alternative sources, and make informed judgements about their use in data science research

Learning Outcome 2. Critically evaluate the methodologies applied in data collection, data processing, data analysis & dissemination of research findings

Learning Outcome 3. Critically assess the strengths and limitations of methods and data, combined with the application of R and/or Python

Introduction

In this coursework you will prepare a data model that combines a range of data sets. We are primarily interested in the process you follow to build your data model, though you must also produce a final data set and model.

Scenario

Oxford Brookes University would like to offer a new service to staff to encourage the brightest and best staff to join us, in recognition of the fact that Oxford itself can be a very expensive place to live.

This new service is a town advice service that recommends towns in Oxfordshire based on certain key characteristics:

House prices
Broadband speed
Crime in the area over the last month

They would also like to consider other factors such as:

Nearby rights of way
Distance from Oxford vs size of the road
Availability of Allotments

There may be other factors, so you should also gather more information from a member of Oxford Brookes academic staff to find out about any other key issues that might affect a person's choice of location.

Tasks

You must use datasets that are published by the UK government, either centrally or through a public body, and that would be available to a member of the UK public. You should prepare a brief knowledge acquisition questionnaire and send it to a domain expert (in this case Dr. Younas) to gain an insight into any other data sources you may wish to query. Dr. Younas's email


Using this information, you should produce a unified data set and model that could be used to drive a recommendation system, documenting and explaining all the processes that you undertake to achieve this data set and model. You must ensure that:

All data used is normalised to at least 3NF

You must store the data on the MySQL server on SOTS or on another MySQL server. You should include your tables as part of the report.

Your model must use the three key characteristics

Your model may use the additional characteristic(s) suggested above or that arise from the knowledge acquisition session

The combined data set must be stored in a MySQL server

You should demonstrate that you can query the data set in R

You should have a simple recommendation system, written in R, that allows the user to specify a value in the range 0-10 for each of the three key characteristics, then produces a score for each town and displays the top 3 towns in order

The towns used are in Oxfordshire.

You may restrict the number of towns you look at to main towns, but you must justify your selection in your report
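
As an illustration only, the requirements above (normalised tables, a query over the combined data, and a 0-10 weighted score that returns the top 3 towns) could be sketched as follows. This is a minimal sketch under several assumptions, not a model answer: it is written in Python, which the learning outcomes mention alongside R, whereas the submitted recommendation system must be written in R; an in-memory SQLite database stands in for the MySQL server; the table and column names, the function names `scale` and `recommend`, the towns, and every figure are hypothetical placeholders; and min-max scaling with a weighted average is just one possible scoring choice that the brief does not prescribe.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the MySQL server.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: one table per source data set, each keyed by town.
conn.executescript("""
CREATE TABLE town (
    town_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE house_price (
    town_id          INTEGER PRIMARY KEY REFERENCES town(town_id),
    median_price_gbp REAL NOT NULL
);
CREATE TABLE broadband (
    town_id           INTEGER PRIMARY KEY REFERENCES town(town_id),
    avg_download_mbps REAL NOT NULL
);
CREATE TABLE crime (
    town_id           INTEGER PRIMARY KEY REFERENCES town(town_id),
    crimes_last_month INTEGER NOT NULL
);
""")

# Placeholder rows only: these are NOT real towns or real statistics.
conn.executemany("INSERT INTO town VALUES (?, ?)",
                 [(1, "Town A"), (2, "Town B"), (3, "Town C"), (4, "Town D")])
conn.executemany("INSERT INTO house_price VALUES (?, ?)",
                 [(1, 300000), (2, 400000), (3, 350000), (4, 500000)])
conn.executemany("INSERT INTO broadband VALUES (?, ?)",
                 [(1, 50), (2, 100), (3, 80), (4, 60)])
conn.executemany("INSERT INTO crime VALUES (?, ?)",
                 [(1, 120), (2, 40), (3, 80), (4, 200)])
conn.commit()


def scale(values, lower_is_better):
    """Min-max scale values to 0..1 so that 1 is always the most desirable."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero when all values are equal
    scaled = [(v - lo) / span for v in values]
    return [1.0 - s for s in scaled] if lower_is_better else scaled


def recommend(conn, w_price, w_broadband, w_crime, top_n=3):
    """Return the top_n (town, score) pairs for user weights in the range 0-10."""
    weights = (w_price, w_broadband, w_crime)
    if any(not 0 <= w <= 10 for w in weights):
        raise ValueError("each weight must be in the range 0-10")
    if sum(weights) == 0:
        raise ValueError("at least one weight must be greater than 0")

    # Join the normalised tables back into one row per town.
    rows = conn.execute("""
        SELECT t.name, h.median_price_gbp, b.avg_download_mbps, c.crimes_last_month
        FROM town t
        JOIN house_price h ON h.town_id = t.town_id
        JOIN broadband   b ON b.town_id = t.town_id
        JOIN crime       c ON c.town_id = t.town_id
        ORDER BY t.town_id
    """).fetchall()

    names = [r[0] for r in rows]
    price = scale([r[1] for r in rows], lower_is_better=True)
    speed = scale([r[2] for r in rows], lower_is_better=False)
    crime = scale([r[3] for r in rows], lower_is_better=True)

    # Weighted average of the scaled characteristics, so scores stay in 0..1.
    total = sum(weights)
    scored = [
        (name, (w_price * p + w_broadband * s + w_crime * c) / total)
        for name, p, s, c in zip(names, price, speed, crime)
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_n]


# Example call: equal importance for all three key characteristics.
for name, score in recommend(conn, w_price=5, w_broadband=5, w_crime=5):
    print(f"{name}: {score:.2f}")
```

House prices and crime are inverted during scaling because lower values are assumed to be more desirable, while higher broadband speed is assumed to be better. Whatever scoring approach you adopt, you should justify it in your report.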