Assignment on gSUNSolar Energy

Assignment Task

Question 1

Imagine that you work for an energy tech company, namely, gSUNSolar. The company is active in selling and installing solar panels for homeowners. The recent increase in the price of oil and gas has increased interest in new renewable energy sources; therefore, gSUNSolar wants to get the most out of this situation.

To this end, the CEO decided to run the following two types of ad campaign on Facebook:

1) ‘No targeting’: do not target any specific user on Facebook

2) ‘Targeting’: target users who are interested in one of the following topics (which is provided by Facebook):

“Vegan food”
“Saving on energy bills”
“Saving the planet”
“Luxury products”
“Green energy”

The CEO of gSUNSolar is now done with the campaign and is asking you to evaluate the results. In particular, the CEO would like to understand the following:

A) Did the ‘Targeting’ campaign generate more shares (compared to the ‘No targeting’ campaign)? (explain)

B) Which topic(s) should be considered for targeting users in future campaigns (that potentially create more shares)?

To this end, you are provided with a sample of Facebook campaign results, containing around 100,000 users (around 40% were in the ‘Targeted’ campaign and the rest in the ‘No targeting’ campaign; see ‘gSUNSolar_a.csv’ dataset). The dataset includes the following variables:

Variable Name	Description
id	ID of the user (anonymized by Facebook)
share	number of times the user shared the ad
target	“yes” if the user was in the ‘Targeted’ campaign and “no” otherwise
topic	the topic that the user was interested in: a = “Vegan food”, b = “Saving on energy bills”, c = “Saving the planet”, d = “Luxury products”, e = “Green energy”
device	the device type of the user
distance	the distance of the user from the location of the company (in miles; based on the user’s IP address)
browsing_time	time (in minutes) spent on the ad

Use the provided dataset (i.e., ‘gSUNSolar_a.csv’) and run a linear model that helps you to address the CEO’s questions.
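As a starting point for questions A and B, the following is a minimal sketch of one possible specification, not a model answer. Because the real file is not reproduced here, it builds a small synthetic stand-in with the documented column names; the values, the device labels ("mobile", "desktop"), and the object names (ads, fit_a, fit_b) are assumptions made purely for illustration. Which controls to include, and whether an interaction between target and topic is appropriate, are modelling choices you would need to justify.

```r
# Synthetic stand-in for 'gSUNSolar_a.csv' (made-up values, illustration only).
# With the real file, replace this block with: ads <- read.csv("gSUNSolar_a.csv")
set.seed(1)
n <- 500
ads <- data.frame(
  id            = seq_len(n),
  share         = rpois(n, lambda = 1),
  target        = sample(c("yes", "no"), n, replace = TRUE, prob = c(0.4, 0.6)),
  topic         = sample(c("a", "b", "c", "d", "e"), n, replace = TRUE),
  device        = sample(c("mobile", "desktop"), n, replace = TRUE),  # assumed labels
  distance      = runif(n, min = 0, max = 100),
  browsing_time = rexp(n, rate = 0.5)
)

# Categorical variables become factors; "no" is the reference level for target,
# so the coefficient 'targetyes' compares 'Targeted' with 'No targeting'.
ads$target <- relevel(factor(ads$target), ref = "no")
ads$topic  <- factor(ads$topic)
ads$device <- factor(ads$device)

# Question A: average difference in shares between the two campaigns,
# holding the other variables fixed.
fit_a <- lm(share ~ target + device + distance + browsing_time, data = ads)
print(summary(fit_a)$coefficients)

# Question B: let the targeting effect differ by topic via an interaction.
fit_b <- lm(share ~ target * topic + device + distance + browsing_time, data = ads)
print(summary(fit_b)$coefficients)
```

In fit_b, the terms named like targetyes:topicb measure how the targeting effect for that topic differs from the effect for the reference topic (a).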

In addition to answering the questions raised by the CEO, you conducted a similar analysis (as above) to investigate:

C) Did the ‘Targeted’ campaign generate more profit than the ‘No targeting’ campaign? Discuss ‘why’ based on your results (10 points).

To investigate the question in part C, you use the same dataset as above and replace the variable ‘share’ with two other variables: ‘profit’ (i.e., expected profit in £ realized from those users that shared the ad) and ‘ad_cost’ (i.e., the money in £ that you paid to Facebook to display the ad to a user).
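One way to approach part C is sketched below, under the assumption that ‘profit’ is not already net of ‘ad_cost’ (the brief does not say either way). The data frame is synthetic with made-up values, and the names camp, net, fit_profit, fit_cost, and fit_net are hypothetical. Modelling profit and ad cost separately, as well as their difference, can help explain why one campaign is more or less profitable.

```r
# Synthetic stand-in for the part C data (made-up values, illustration only).
set.seed(2)
n <- 400
camp <- data.frame(
  target  = factor(sample(c("no", "yes"), n, replace = TRUE), levels = c("no", "yes")),
  profit  = rgamma(n, shape = 2, rate = 1),   # expected profit in GBP
  ad_cost = runif(n, min = 0.1, max = 0.5)    # cost paid to show the ad, in GBP
)

# Assumed definition: net profit per user = expected profit minus ad cost.
camp$net <- camp$profit - camp$ad_cost

fit_profit <- lm(profit  ~ target, data = camp)
fit_cost   <- lm(ad_cost ~ target, data = camp)
fit_net    <- lm(net     ~ target, data = camp)

# With identical regressors, the net-profit coefficients equal the
# profit coefficients minus the ad-cost coefficients.
print(rbind(profit  = coef(fit_profit),
            ad_cost = coef(fit_cost),
            net     = coef(fit_net)))
```

Reading the three rows together shows whether any difference in net profit comes from the revenue side, the cost side, or both.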

Notes that you should consider in your answer:

Include your R code and its respective results in your solution.
Make sure you clearly explain, justify, and detail all the assumptions and steps in your solution. These might include data cleaning (e.g., dropping variable(s), observation(s), changing type of variable(s), etc.) or any other assumptions or steps.
Carefully and completely interpret your results (including all your coefficients).
Critically evaluate the implications (based on all your results) for gSUNSolar. Make sure that you use specific and concrete examples in your solution.
The only part not counted in your word limit is your output from the RStudio’s console. Everything else (e.g., your code, tables, and words in your figures) counts.

Question 2

An apparel retailer would like to understand its current customers better. To this end, the retailer is asking you to identify and group its existing customers into meaningful clusters, such that individuals within a cluster are similar to each other but different from individuals in other clusters.

To this end, you put together a dataset (i.e., ‘Retailer.csv’) including the following list of variables:

Variable Name	Description
ID	ID of the customer
service.sat	expressed satisfaction with the retailer (0 – 100, where 0 is extremely dissatisfied)
sustainable	the number of previous ‘sustainable products’ that the customer purchased
male	“yes” if the customer is male, “no” otherwise
rent	“no” if the customer owns a house, “yes” otherwise
income	the income of the respective customer (in £)
child	number of children in the customer’s family
referral	the number of other customers that the focal customer referred to the retailer

Notes that you should consider in your answer:

Based on the structure and the information in the dataset, apply your suggested method using R. Include your R code and its respective results in your solution.
Make sure you clearly explain, justify, and detail all the assumptions and steps in your solution. These might include data cleaning (e.g., dropping variable(s), observation(s), changing type of variable(s), etc.), your decision (and justification why!) on the number of clusters, or any other assumptions or steps.
Carefully and completely interpret your results. Your answer should cover but not be limited to explaining why the final solution is appropriate, describing the characteristics of the clusters, and discussing managerial implications.
Critically evaluate the implications (based on your results). Make sure that you use specific and concrete examples in your solution.
The only part not counted in your word limit is your output from the RStudio’s console. Everything else (e.g., your code, tables, and words in your figures) counts.
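The clustering steps described for this question could be sketched as follows. This is only one possible approach: it uses k-means from base R on standardized variables, with the yes/no columns recoded to 0/1, whereas other methods (for example, hierarchical clustering on a mixed-type distance) may suit the data better. The data frame is a synthetic stand-in for ‘Retailer.csv’ with made-up values, the names cust, feat, feat_z, wss, and km are hypothetical, and k = 3 is a placeholder rather than a recommendation.

```r
# Synthetic stand-in for 'Retailer.csv' (made-up values, illustration only).
# With the real file, replace this block with: cust <- read.csv("Retailer.csv")
set.seed(3)
n <- 300
cust <- data.frame(
  ID          = seq_len(n),
  service.sat = round(runif(n, min = 0, max = 100)),
  sustainable = rpois(n, lambda = 2),
  male        = sample(c("yes", "no"), n, replace = TRUE),
  rent        = sample(c("yes", "no"), n, replace = TRUE),
  income      = round(rlnorm(n, meanlog = 10.3, sdlog = 0.4)),
  child       = rpois(n, lambda = 1),
  referral    = rpois(n, lambda = 1)
)

# Drop the identifier and recode the yes/no variables as 0/1.
feat <- cust[, setdiff(names(cust), "ID")]
feat$male <- as.numeric(feat$male == "yes")
feat$rent <- as.numeric(feat$rent == "yes")

# Standardize so that income (in GBP) does not dominate the distances.
feat_z <- scale(feat)

# Total within-cluster sum of squares for k = 1..8, for an elbow plot.
wss <- sapply(1:8, function(k) {
  kmeans(feat_z, centers = k, nstart = 20, iter.max = 50)$tot.withinss
})
plot(1:8, wss, type = "b", xlab = "Number of clusters k",
     ylab = "Total within-cluster sum of squares")

# Fit the chosen solution (k = 3 is a placeholder) and profile the clusters
# on the original, unstandardized scale.
km <- kmeans(feat_z, centers = 3, nstart = 20, iter.max = 50)
cust$cluster <- km$cluster
print(aggregate(feat, by = list(cluster = cust$cluster), FUN = mean))
```

The cluster means printed at the end are the usual basis for describing each segment and for discussing managerial implications.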

Question 3

Assume you are the CEO of a real estate company. You would like to gain more insights about the market.

Variable Name	Description
ID	ID of the property
price	price of the property in £
type	type of the property
bedrooms	number of bedrooms the property has
bathrooms	number of bathrooms the property has
area_a	area of the property (in m²)
area_b	area of the property (in ft²)
furnished	the furnishing situation of the property
level	the level on which the property is located
price_b	price of the property converted from £ into $
payment_option	the payment option under which the property is available for purchase
delivery_term	the delivery situation of the property

Based on the structure and the information in the dataset:

A) Find the correlation between ‘price’ and ‘area_a’ and interpret it. Find the correlation between ‘area_a’ and ‘area_b’ and interpret it.

B) Suggest a tree-based model that allows you to understand which house features affect the property’s price (in £). Apply your suggested method (using R) and explain your results.

C) Evaluate the performance of your model in B (note that you are not required to split your dataset into train and test;
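The three tasks above can be sketched as follows, assuming the rpart package (a recommended package shipped with standard R installations) is available. The data frame props is synthetic with made-up values, the category labels and the exchange rate are invented, and only a subset of the documented variables is simulated. Since the two area columns differ only by unit (1 m² = 10.7639 ft²) and price_b is price converted into $, the sketch keeps one area column and excludes price_b and ID from the predictors; that is a modelling assumption you would need to justify.

```r
library(rpart)  # recursive partitioning trees

# Synthetic stand-in for the property data (made-up values, illustration only).
set.seed(4)
n <- 400
props <- data.frame(
  ID        = seq_len(n),
  type      = factor(sample(c("flat", "house"), n, replace = TRUE)),  # assumed labels
  bedrooms  = sample(1:5, n, replace = TRUE),
  bathrooms = sample(1:3, n, replace = TRUE),
  area_a    = round(runif(n, min = 40, max = 250)),                   # m^2
  furnished = factor(sample(c("yes", "no"), n, replace = TRUE))       # assumed labels
)
props$area_b  <- props$area_a * 10.7639          # same area in ft^2
props$price   <- 2000 * props$area_a + 15000 * props$bedrooms + rnorm(n, sd = 30000)
props$price_b <- props$price * 1.25              # made-up exchange rate

# Correlations: price vs. area, and the two area columns against each other.
cor_price_area <- cor(props$price, props$area_a)
cor_areas      <- cor(props$area_a, props$area_b)
print(c(price_area = cor_price_area, area_a_area_b = cor_areas))

# Regression tree for price, leaving out ID, area_b, and price_b.
tree <- rpart(price ~ type + bedrooms + bathrooms + area_a + furnished,
              data = props, method = "anova")
print(tree$variable.importance)

# In-sample performance (no train/test split, as the task permits).
pred <- predict(tree, newdata = props)
rmse <- sqrt(mean((props$price - pred)^2))
r2   <- 1 - sum((props$price - pred)^2) / sum((props$price - mean(props$price))^2)
print(c(RMSE = rmse, R2 = r2))
```

Because these metrics are computed on the same data used to fit the tree, they will tend to look more favorable than out-of-sample performance would.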

Notes that you should consider in your answer:

Include your R code and its respective results in your solution.
Make sure you clearly explain, justify, and detail all the assumptions and steps in your solution. These might include data cleaning (e.g., dropping variable(s), observation(s), changing type of variable(s), etc.) or any other assumptions or steps.
Carefully and completely interpret your results.
Critically evaluate the implications (based on all your results). Make sure that you use specific and concrete examples in your solution.
The only part not counted in your word limit is your output from the RStudio’s