Hotness Prediction

The Echo Nest provides a hotness parameter for each track in its catalogue, measuring how the track is performing in terms of play counts, social commentary and editorial volume. We want to investigate whether it is possible to predict the hotness score from the audio summary of the track, using mainly methods taught in the course. The hotness parameter of a track is not static and varies over time, so our solution might only hold for the snapshot in time we are currently looking at.

Initial data analysis

Due to the large catalogue of music content in the Echo Nest, we can only consider a small subset of the available data. We have therefore chosen to limit ourselves to the top 200 hottest artists within a genre. The tracks associated with a hot artist will individually have different scores, which will hopefully be due to differences in the measures given in the audio summary and not just to the release date.

Since the Echo Nest covers a large diversity of music genres, we cannot consider a sample spanning many different genres: the hotness parameter would not be distributed equally, and tracks from less familiar genres would have lower hotness scores than similar tracks from a more mainstream genre. We will therefore only try to predict the hotness score within a single genre. A natural extension would of course be to also consider similar genres.

If we consider the genre with the highest familiarity score, namely rock, the dataset consists of 200 artists and 48854 tracks. The hotness score for each track is shown in the left histogram below.

Histograms of the hotness scores for the dataset of rock songs. The right histogram shows the scores after those equal to zero have been removed.

We remove the large number of tracks having a hotness score of zero. Because this fraction is so large, we expect it would have a negative impact on a prediction of the hotness score. After removing all tracks with a hotness score of zero, we obtain the score distribution shown in the right histogram. The distribution now looks approximately Gaussian (though skewed slightly to the left) with mean 0.22 and standard deviation 0.12.
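As a small sketch of this filtering step (the array name Y_score and the sample values here are assumptions for illustration), the zero-score tracks can be dropped and the summary statistics computed with NumPy:

```python
import numpy as np

# Hypothetical hotness scores, one entry per track (illustrative values)
Y_score = np.array([0.0, 0.35, 0.22, 0.0, 0.18, 0.41, 0.0, 0.09])

# Keep only the tracks with a non-zero hotness score
nonzero_scores = Y_score[Y_score != 0]

print("removed %d zero-score tracks" % (len(Y_score) - len(nonzero_scores)))
print("mean %.2f, std %.2f" % (nonzero_scores.mean(), nonzero_scores.std()))
```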

Based on the audio summary of each track we have collected 13 features; each of them is explained under data collection. Even though the dimensionality of the feature space is low, we can investigate whether it can be reduced further. This can be done using singular value decomposition, checking whether any hidden lower-dimensional structure explains a sufficient amount of the variation. Looking at the results of the decomposition, we note that none of the components explains a large fraction of the variation, so we cannot reduce the dimensionality of the feature space based on the SVD analysis. This might indicate either that all features can contribute to a prediction, or that the chosen features have little or no influence on the hotness score and will thus only act as noise in a prediction.

from sklearn.decomposition import PCA

# Fit PCA on the tracks with a non-zero hotness score and inspect
# how much of the variance each principal component explains
pca = PCA()
pca.fit(X_information[Y_score != 0, :])
print(pca.explained_variance_ratio_)
[ 0.18764303  0.14408325  0.09093245  0.08635227  0.08251364  0.07224769
  0.06641602  0.06461839  0.06333539  0.05839676  0.04055396  0.03034258
  0.01256458]

From the analysis it follows that we have to use all features when predicting, and we will do this in the following test.

The data we consider was collected on May 11th, 2015.

Predict hotness score

The hotness score is defined as a numerical value between 0 and 1. We have chosen to partition the scores into 5 classes, shown in the table below.


Class         1          2          3          4          5
Score levels  ]0.0-0.2]  ]0.2-0.4]  ]0.4-0.6]  ]0.6-0.8]  ]0.8-1.0]
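A sketch of this binning, assuming the scores lie in ]0, 1] (the function name is our own; np.digitize with right=True gives the half-open ]a, b] intervals of the table):

```python
import numpy as np

def score_to_class(score):
    """Map a hotness score in ]0, 1] to one of the five classes in the table."""
    edges = [0.2, 0.4, 0.6, 0.8]
    # right=True makes each interval open on the left and closed on the right
    return int(np.digitize(score, edges, right=True)) + 1

print(score_to_class(0.20))  # class 1: 0.20 lies in ]0.0-0.2]
print(score_to_class(0.22))  # class 2: the mean score falls here
```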

We can then use a multi-class classifier to predict the hotness class a track belongs to. Using 10-fold cross-validation, the fraction of correct classifications is measured for a Naive Bayes and a Random Forest classifier. We find that the Naive Bayes classifier predicts the correct class for 19.3 % of the tracks; for the Random Forest classifier this number is 53.9 %.
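A minimal sketch of this evaluation with scikit-learn (the random stand-in data, variable names and default hyperparameters are assumptions; the report does not state the exact settings used):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

# Hypothetical stand-in data: 200 tracks, 13 audio features, 5 classes
rng = np.random.RandomState(0)
X_information = rng.rand(200, 13)
Y_class = rng.randint(1, 6, size=200)

# 10-fold cross-validated classification accuracy for both models
for clf in (GaussianNB(), RandomForestClassifier(random_state=0)):
    scores = cross_val_score(clf, X_information, Y_class, cv=10)
    print(type(clf).__name__, "mean accuracy: %.3f" % scores.mean())
```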
The classification rates are not particularly good, especially when taking the distribution of hotness scores into consideration, as the scores are mostly concentrated in two of the classes. It therefore appears that we are not able to predict the hotness score for rock tracks based on audio features alone. One reason could be that tracks from an artist that are similar from an audio perspective receive different scores due to some external factor. We would therefore need much more information, e.g. about the artist, the producer and the lyrics of a track, to perform a better classification.