- 18th Jul 2024
- 18:13
This assessment uses data obtained from spotifycharts.com, which contains the daily top 200 songs streamed on Spotify New Zealand in 2021 (January to October). On Canvas, you are provided with monthly files, from “Jan21.csv” through “Oct21.csv”. Each file contains information such as the song title, the artist name, and the total number of streams, along with audio features for each song such as loudness, danceability, acousticness, and valence.
Task 1: Write a loop to concatenate (vertically) each of the monthly files to form one big file. Call this big file ‘full_data’. Using this data, compute the following statistics for the year 2021 and report them as Table 1:
- total number of observations,
- total unique songs,
- total streams, and
- number of streams per unique song.
Task 2: Using ‘full_data’, apply groupby to calculate the total daily streams over the year 2021 and then plot them as Figure 1 using matplotlib.
Task 3: Using ‘full_data’, calculate the total streams over the year 2021 by grouping the data by “Track.Name”. Sort the songs based on the total streams and select the top 5 songs streamed for the year. Next, calculate the total streams over the sample period by grouping the data by “Artist”. Sort the artists based on the total streams and select the top 5 artists streamed. In total, you should have the 5 most listened songs and the 5 most listened artists. Report them as Table 2.
Task 4: Valence measures a song’s musical positivity. It ranges from 0 to 1. Songs with high valence (closer to 1) sound more positive (e.g., happy, cheerful, euphoric), while songs with low valence (closer to 0) sound more negative (e.g., sad, depressed, angry). Using the column “valence” in ‘full_data’, apply groupby and calculate the average valence for each day. Check the trend for valence over time, i.e. plot the average valence from 1 January 2021 to 31 October 2021 as Figure 2.
Free Assignment Solution - Programming for Business Analytics
"""__Importing Basic Libraries:__"""
# Commented out IPython magic to ensure Python compatibility.
import pandas as pd
import numpy as np
import os
import seaborn as sns
sns.set(style="white")
import matplotlib.pyplot as plt
# %matplotlib inline
import warnings
warnings.simplefilter("ignore")
"""__Task 1:__ Write a loop to concatenate (vertically) each of the monthly files to form one big file.
Call this big file ‘full_data’. Using this data, compute the following statistics for the year
2021 and report them as Table 1:
"""
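data_dir = "/content"  # folder that holds the monthly CSV files
frames = []
for csv in sorted(os.listdir(data_dir)):
    if csv.endswith(".csv"):
        print(csv)
        # Join with the folder so the read works from any working directory
        frames.append(pd.read_csv(os.path.join(data_dir, csv)))
# Stack the monthly frames vertically and rebuild the row index
full_data = pd.concat(frames, ignore_index=True)
full_data.head()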
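The loop above reads every CSV it finds in the folder. As a hedged alternative, the sketch below builds the file names from an explicit month list, so that unrelated CSV files are skipped and the months are stacked in calendar order. The helper name `load_monthly_files` is hypothetical, and the intermediate file names (“Feb21.csv” and so on) are assumed from the “Jan21.csv” to “Oct21.csv” pattern in the brief. The demo writes tiny made-up files to a temporary folder only so the example can run on its own.

```python
import os
import tempfile

import pandas as pd

# Assumption: the files follow the pattern <Mon>21.csv, as in Jan21.csv ... Oct21.csv.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct"]


def load_monthly_files(data_dir, months=MONTHS, suffix="21.csv"):
    """Read one CSV per month and stack them vertically (hypothetical helper)."""
    frames = []
    for month in months:
        path = os.path.join(data_dir, month + suffix)
        frames.append(pd.read_csv(path))
    # ignore_index=True gives the combined frame a clean 0..n-1 row index
    return pd.concat(frames, ignore_index=True)


# Demo with made-up data: two monthly files plus one unrelated CSV.
with tempfile.TemporaryDirectory() as tmp:
    pd.DataFrame({"track_name": ["Song A"], "streams": [100]}).to_csv(
        os.path.join(tmp, "Jan21.csv"), index=False
    )
    pd.DataFrame({"track_name": ["Song B"], "streams": [50]}).to_csv(
        os.path.join(tmp, "Feb21.csv"), index=False
    )
    pd.DataFrame({"x": [1]}).to_csv(os.path.join(tmp, "notes.csv"), index=False)
    demo = load_monthly_files(tmp, months=["Jan", "Feb"])

print(demo)
```

With the real data, the call would presumably be `full_data = load_monthly_files("/content")`, assuming all ten files sit in that folder.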
""">(a) total number of observations:"""
print("Total number of observations: {} rows and {} columns. \n".format(full_data.shape[0], full_data.shape[1]))
""">(b) total unique songs:"""
print("Total unique songs are", full_data["track_name"].nunique())
""">(c) total streams:"""
print("Total streams are",full_data["streams"].sum())
"""> (d) number of streams per unique song:"""
full_data.groupby(["track_name"])["streams"].sum().to_dict()
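The brief asks for these four statistics to be reported as Table 1. The sketch below shows one possible way to gather them into a single small table. It runs on a toy frame with made-up values rather than the real `full_data`, and it reads item (d) as the average number of streams per unique song (total streams divided by unique songs), which is one interpretation of the task and differs from the per-song totals computed above.

```python
import pandas as pd

# Toy stand-in for full_data; the values are made up for illustration only.
toy = pd.DataFrame({
    "track_name": ["Song A", "Song B", "Song A", "Song C"],
    "streams": [100, 50, 150, 100],
})

n_obs = len(toy)                              # (a) total number of observations
n_songs = toy["track_name"].nunique()         # (b) total unique songs
total_streams = toy["streams"].sum()          # (c) total streams
streams_per_song = total_streams / n_songs    # (d) assumed to mean an average per unique song

# Collect the four statistics into one table for reporting.
table1 = pd.DataFrame({
    "Statistic": [
        "Total observations",
        "Total unique songs",
        "Total streams",
        "Streams per unique song",
    ],
    "Value": [n_obs, n_songs, total_streams, streams_per_song],
})
print(table1.to_string(index=False))
```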
"""__Task 2:__ Using ‘full_data’, apply groupby to calculate the total daily streams over the year 2021
and then plot them as Figure 1 using matplotlib.
"""
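fig, ax = plt.subplots(figsize=(25, 7))
# Sum streams across all songs for each date, then draw the series on this axis
full_data.groupby("date")["streams"].sum().plot(ax=ax)
ax.set_title("Total daily streams over the year 2021")
ax.set_xlabel("Date")
ax.set_ylabel("Total streams")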
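If the “date” column is read in as text, the grouped series is ordered as strings, which may not be chronological depending on the date format in the files. A minimal sketch of converting the column to real dates before grouping is shown below. It uses a toy frame with made-up values, and it assumes the dates can be parsed by `pd.to_datetime` with its default settings; the real files may need an explicit `format=` argument.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Toy stand-in for full_data with made-up values; rows are deliberately out of order.
toy = pd.DataFrame({
    "date": ["2021-02-01", "2021-01-01", "2021-01-01", "2021-02-01", "2021-01-02"],
    "streams": [30, 10, 20, 40, 5],
})

# Assumption: the default parser understands the date strings.
toy["date"] = pd.to_datetime(toy["date"])

# Total streams per day, sorted chronologically by the datetime index.
daily_streams = toy.groupby("date")["streams"].sum().sort_index()

fig, ax = plt.subplots(figsize=(10, 4))
daily_streams.plot(ax=ax)
ax.set_title("Total daily streams (toy data)")
ax.set_xlabel("Date")
ax.set_ylabel("Total streams")
```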
"""__Task 3:__ Using ‘full_data’, calculate the total streams over the year 2021 by grouping the data
by “Track.Name”. Sort the songs based on the total streams and select the top 5 songs streamed
for the year. Next, calculate the total streams over the sample period by grouping the data by
“Artist”. Sort the artists based on the total streams and select the top 5 artists streamed. In total,
you should have 5 most listened songs and 5 most listened artists. Report them as Table 2.
"""
print("5 most listened songs")
full_data.groupby(["track_name"],as_index=False)["streams"].sum().sort_values("streams",ascending=False).head(5)
print("5 most listened artists")
full_data.groupby(["artist_names"],as_index=False)["streams"].sum().sort_values("streams",ascending=False).head(5)
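Since the brief asks for both rankings to be reported together as Table 2, the two results can be placed side by side. The sketch below is one possible layout, run on a toy frame with made-up values. The helper `top_n` is hypothetical, and the column names `track_name` and `artist_names` follow the code above rather than the “Track.Name” and “Artist” labels in the brief.

```python
import pandas as pd

# Toy stand-in for full_data; the values are made up for illustration only.
toy = pd.DataFrame({
    "track_name": ["Song A", "Song B", "Song A", "Song C"],
    "artist_names": ["Artist X", "Artist Y", "Artist X", "Artist Y"],
    "streams": [100, 50, 150, 120],
})


def top_n(df, key, n=5):
    """Return the n largest stream totals for a grouping column (hypothetical helper)."""
    return df.groupby(key)["streams"].sum().nlargest(n).reset_index()


top_songs = top_n(toy, "track_name")
top_songs.columns = ["song", "song_streams"]
top_artists = top_n(toy, "artist_names")
top_artists.columns = ["artist", "artist_streams"]

# Place the two rankings side by side, one row per rank (1 = most streamed).
table2 = pd.concat([top_songs, top_artists], axis=1)
table2.index = table2.index + 1
table2.index.name = "rank"
print(table2)
```

The toy data has fewer than five songs and artists, so the shorter ranking is padded with missing values; with the real data both columns should contain five rows.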
"""__Task 4:__ Valence measures a song’s musical positivity. It ranges from 0 to 1. Songs with high
valence (closer to 1) sound more positive (e.g., happy, cheerful, euphoric), while songs with
low valence (closer to 0) sound more negative (e.g., sad, depressed, angry).
Using the column “valence” in the file ‘full_data’, apply groupby and calculate the average
valence for each day. Check the trend for valence over time, i.e. plot the average valence from
1 January 2021 to 31 October 2021 as Figure 2.
"""
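fig, ax = plt.subplots(figsize=(25, 7))
# Average valence across the charted songs for each date
full_data.groupby("date")["valence"].mean().plot(ax=ax)
ax.set_title("Average valence from 1 January 2021 to 31 October 2021")
ax.set_xlabel("Date")
ax.set_ylabel("Average valence")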
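Daily averages can be noisy, which makes the trend the task asks about harder to see. One optional aid, not required by the brief, is to overlay a rolling mean on the daily series. The sketch below uses a toy frame with made-up values and a 3-day window only because the toy data is tiny; a 7-day window would be a natural choice for the real daily data, though that is an assumption rather than part of the assignment.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Toy stand-in for full_data; the values are made up for illustration only.
toy = pd.DataFrame({
    "date": pd.to_datetime(
        ["2021-01-01", "2021-01-01", "2021-01-02", "2021-01-03", "2021-01-04"]
    ),
    "valence": [0.2, 0.4, 0.6, 0.9, 0.3],
})

# Average valence per day, as in the task.
daily_valence = toy.groupby("date")["valence"].mean()

# Rolling mean over the last 3 days; min_periods=1 keeps the first days instead of NaN.
smoothed = daily_valence.rolling(window=3, min_periods=1).mean()

fig, ax = plt.subplots(figsize=(10, 4))
daily_valence.plot(ax=ax, label="Daily average")
smoothed.plot(ax=ax, label="3-day rolling mean")
ax.set_title("Average valence (toy data)")
ax.set_ylabel("Valence")
ax.legend()
```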
About The Author - Matthew Grey
Matthew Grey is a seasoned data analyst specializing in music streaming analytics. Proficient in Python programming, he excels in aggregating and analyzing Spotify data, focusing on trends and insights across different music genres. His expertise in data visualization and statistical analysis ensures comprehensive reporting and strategic decision-making in the realm of digital music consumption.