Data Analysis and Linear Regression Assignment

Assignment Task

Description

Please note the following guidelines for your project submission:

  • The project consists of two parts
    • Part 1 – Building POC for AI projects
    • Part 2 – Data Modeling
  • Your submission should consist of two presentation files in PDF format: one for Part 1 and the other for Part 2.
  • Please go through the following details about the project.

Part 1 – Building POC for AI projects

Context:

Think of yourself as an intrapreneur or an entrepreneur who is trying to design an AI product. Any product design starts with the basic question – Who is the product designed for and what problem does it solve?

Once a leader has answered the above question comprehensively, they move on to the opportunities the product offers. Finally, once convinced that the product is viable as a business, they assess its technical feasibility.

As part of this project, you will be ideating various steps involved in conceptualizing an AI product.

Submission Guidelines for Part 1:

  1. You will NOT be collecting any data as part of this project. You will only be ideating what the product would look like and what its potential is.
  2. You are free to choose any domain & problem statement for this project.
  3. You are free to create your own presentation [PPT] and submit it in PDF format.
  4. Please go through the sample solution (given under the project module) in case you need any further understanding.

Part 2 – Data Modeling

Business Context:

An over-the-top (OTT) media service is a media service offered directly to viewers via the Internet. The term is most synonymous with subscription-based video-on-demand services that offer access to film and television content, including existing series acquired from other producers, as well as original content produced specifically for the service. They are typically accessed via websites on personal computers, apps on smartphones and tablets, or televisions with integrated Smart TV platforms.

Presently, OTT services are at a relatively nascent stage and are widely accepted as a trending technology across the globe. As customer behavior shifts each year from traditional subscriptions to broadcasting services toward OTT on-demand video and music subscriptions, OTT streaming is expected to grow at a very fast pace. The global OTT market size was valued at $141.2 billion in 2021 and is projected to reach $257.37 billion by 2025, growing at a CAGR of 16% from 2021 to 2025. The shift from television to OTT services for entertainment is driven by benefits such as on-demand services, ease of access, and access to better networks and digital connectivity.
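The growth figure quoted above can be sanity-checked with the standard compound annual growth rate formula. The short sketch below assumes the 2021 valuation is the base and 2025 is the end point (four compounding years); the function name `cagr` is illustrative only.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Return the compound annual growth rate as a fraction (0.16 means 16%)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Market size in billions of USD, taken from the paragraph above.
# Assumption: growth compounds over the four years from 2021 to 2025.
rate = cagr(141.2, 257.37, years=2025 - 2021)
print(f"Implied CAGR: {rate:.1%}")
```

Under that assumption the implied rate comes out at roughly 16%, in line with the stated figure.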

Amidst evolving viewer preferences, OTT services are dedicated to fulfilling the rising demand for entertainment. Several platforms have witnessed a notable surge of 46% in consumption and subscriber numbers as audiences actively seek diverse content. By embracing advancements and sophisticated changes, these platforms are striving to provide users with comprehensive access to a wide array of content seamlessly. This ongoing innovation is projected to draw an expanding number of subscribers to OTT platforms globally.

ShowTime is an OTT service provider that offers a wide variety of content (movies, web shows, etc.) to its users. It wants to determine the driver variables for first-day content viewership so that it can take the necessary measures to improve the viewership of the content on its platform. Possible reasons for a decline in content viewership include fewer people coming to the platform, decreased marketing spending, content timing clashes, and weekends and holidays.

Objective:

As Head of Analytics at ShowTime, you have to determine the driving factors for first-day viewership. This could be achieved by:

  1. Analyzing and visualizing the data to draw inferences about the influence of different factors on the first-day viewership.
  2. Evaluating a Linear Regression model and observing the coefficients of the model to determine the factors that influence the first-day viewership.
  3. Providing suggestions and recommendations to the business based on the analysis.
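As a rough illustration of step 2, the sketch below fits an ordinary least squares model and reads off its coefficients. It is not the expected solution: the data is synthetic, and the feature names (`visitors`, `ad_impressions`, `is_weekend`) and the relationship used to generate the target are hypothetical stand-ins for the real dataset. NumPy is used here for brevity; a statistics library that also reports p-values would be a natural choice for the actual analysis.

```python
import numpy as np

# Synthetic stand-in data. The feature names and the generating relationship
# below are hypothetical; they are NOT taken from the ShowTime dataset.
rng = np.random.default_rng(seed=0)
n = 1000
visitors = rng.normal(1.7, 0.2, n)        # hypothetical: recent platform visitors
ad_impressions = rng.normal(1.4, 0.3, n)  # hypothetical: marketing exposure
is_weekend = rng.integers(0, 2, n)        # hypothetical: 1 if released on a weekend

# Assumed "true" relationship, used only to generate a toy target.
views = (0.10 + 0.15 * visitors + 0.05 * ad_impressions
         + 0.03 * is_weekend + rng.normal(0, 0.01, n))

# Design matrix with an explicit intercept column.
X = np.column_stack([np.ones(n), visitors, ad_impressions, is_weekend])
names = ["intercept", "visitors", "ad_impressions", "is_weekend"]

# Ordinary least squares fit.
coef, *_ = np.linalg.lstsq(X, views, rcond=None)

# R-squared: share of the variance in the target explained by the model.
pred = X @ coef
ss_res = np.sum((views - pred) ** 2)
ss_tot = np.sum((views - views.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

for name, value in zip(names, coef):
    print(f"{name:>15}: {value: .4f}")
print(f"R-squared: {r2:.3f}")
```

Each coefficient estimates the change in first-day views for a one-unit increase in that factor with the others held fixed, which is what makes the coefficients usable as evidence for the recommendations in step 3.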

Data Description:

The dataset contains the view counts for content on its release day, along with other factors related to the content such as average visitors for the past week, the genre of the content, day of release, etc. It has 1,000 rows, each representing an individual piece of content, and 7 attributes that may affect the first-day viewership.
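Attributes such as genre and day of release are categorical, so they are usually summarized by group during exploration and converted to indicator columns before a linear regression is fitted. The sketch below shows one way to do both with pandas; the column names and the six toy rows are hypothetical placeholders, not the actual dataset.

```python
import pandas as pd

# Toy rows only: column names and values are hypothetical placeholders,
# not taken from the actual ShowTime dataset.
df = pd.DataFrame({
    "genre": ["Drama", "Comedy", "Drama", "Thriller", "Comedy", "Thriller"],
    "release_day": ["Friday", "Saturday", "Saturday",
                    "Wednesday", "Friday", "Saturday"],
    "views_first_day": [0.42, 0.55, 0.58, 0.39, 0.44, 0.60],
})

# Average first-day views per release day: a quick check for a day-of-week effect.
views_by_day = (
    df.groupby("release_day")["views_first_day"]
    .mean()
    .sort_values(ascending=False)
)

# Indicator (dummy) columns for the categorical attributes. drop_first=True
# removes one level per attribute so the indicators are not perfectly
# collinear with the intercept of a regression model.
encoded = pd.get_dummies(df, columns=["genre", "release_day"], drop_first=True)

print(views_by_day)
print(encoded.columns.tolist())
```

The dropped level of each attribute becomes the baseline, so each remaining indicator's regression coefficient is read as a difference relative to that baseline.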