Since we’ve never worked with audio data or audio classification, we wanted to try working with data structured this way. We ask the question: how do the audio features of songs, specifically Spotify tracks, compare to each other?
Is there a relationship between some of these features, such as tempo correlating with danceability, energy, or liveness, and if so, how are they correlated? Additionally, once the audio of each song has been converted to numeric features, how can we use those features to cluster songs?
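As a minimal sketch of how the correlation question could be explored, the snippet below computes pairwise Pearson correlations with pandas. The data is randomly generated purely for illustration (it is not real Spotify output), and the column names `tempo`, `energy`, and `liveness` are assumed to match the features mentioned above.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data: randomly generated for illustration only,
# NOT real Spotify measurements.
rng = np.random.default_rng(0)
n_tracks = 200
tempo = rng.uniform(60, 180, n_tracks)

features = pd.DataFrame({
    "tempo": tempo,
    # energy is deliberately constructed to rise with tempo, plus noise
    "energy": np.clip((tempo - 60) / 120 + rng.normal(0, 0.15, n_tracks), 0, 1),
    # liveness is generated independently of tempo
    "liveness": rng.uniform(0, 1, n_tracks),
})

# Pearson correlation between every pair of feature columns
corr_matrix = features.corr(method="pearson")

# Correlation of each remaining feature with tempo
tempo_corr = corr_matrix["tempo"].drop("tempo")
print(tempo_corr)
```

With real tracks, the same `corr` call would show which features move together with tempo and in which direction; the strength seen here only reflects how the synthetic columns were built.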
We expect the distributions of certain audio features to differ significantly between genres. These differences will allow us to apply unsupervised learning to cluster the songs into different groups / listening personas.
For example, the mean "tempo" of songs by pop artists will be higher than that of songs by ballad singers, since pop songs tend to be more upbeat and fast.
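One way such a difference in means could be checked is with Welch's t-test, which does not assume equal variances between the two groups. The sketch below computes only the t statistic and the approximate degrees of freedom using NumPy; turning these into a p-value would need a t-distribution CDF (for example from SciPy, which is not in our dependency list). The helper name `welch_t` is hypothetical, and the tempo values are synthetic, chosen only to illustrate the calculation.

```python
import numpy as np

def welch_t(a, b):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Squared standard error of each sample mean
    va = a.var(ddof=1) / a.size
    vb = b.var(ddof=1) / b.size
    t_stat = (a.mean() - b.mean()) / np.sqrt(va + vb)
    dof = (va + vb) ** 2 / (va ** 2 / (a.size - 1) + vb ** 2 / (b.size - 1))
    return t_stat, dof

# Synthetic tempos (beats per minute); the group means are assumptions,
# not measured values from the dataset.
rng = np.random.default_rng(1)
pop_tempo = rng.normal(120, 15, 100)
ballad_tempo = rng.normal(80, 15, 100)

t_stat, dof = welch_t(pop_tempo, ballad_tempo)
print(f"t = {t_stat:.2f}, degrees of freedom = {dof:.1f}")
```

A large positive t statistic would be consistent with pop tempos being higher on average; whether that holds for the real tracks is exactly what the analysis has to establish.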
If numerical data is extracted from the songs, models can be trained to cluster or classify the songs into different groups, provided that certain features differ enough between those groups. This approach of comparing audio features between two groups can then be applied to other projects, such as comparing two kinds of living beings or objects in order to classify them.
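To illustrate the clustering idea, here is a small sketch using scikit-learn's `KMeans` on standardized features. K-means and the choice of two clusters are assumptions made for the example, not necessarily the algorithm or settings used in this project, and the two groups of `[tempo, energy]` points are synthetic.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Two synthetic groups of songs described by [tempo, energy].
# The values are generated for illustration, not taken from Spotify.
rng = np.random.default_rng(2)
group_a = np.column_stack([rng.normal(125, 8, 50), rng.normal(0.8, 0.05, 50)])
group_b = np.column_stack([rng.normal(75, 8, 50), rng.normal(0.3, 0.05, 50)])
X = np.vstack([group_a, group_b])

# Standardize so tempo (tens of BPM) does not dominate energy (0 to 1)
X_scaled = StandardScaler().fit_transform(X)

# Fit k-means with an assumed number of clusters
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_scaled)

print(np.bincount(labels))  # number of songs assigned to each cluster
```

Scaling matters here because k-means relies on Euclidean distance; without it, the feature with the largest numeric range would drive the grouping.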
This project consists of the following components:
- Audio Feature Extraction: We extract audio features for top-charting songs using Spotify's API.
- Model Selection: We apply different supervised and unsupervised learning algorithms.
- Evaluation: We evaluate the supervised model's performance using accuracy, precision, recall, and F1-score metrics.
- Adaptability: The project is designed to be easily adaptable for other audio classification / clustering tasks.
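The evaluation step listed above can be sketched with scikit-learn's metric functions. The genre labels and predictions below are made up for illustration, and macro averaging is an assumption; the averaging mode used in the project may differ.

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Hypothetical true and predicted genre labels for six songs
y_true = ["pop", "pop", "ballad", "ballad", "pop", "ballad"]
y_pred = ["pop", "ballad", "ballad", "ballad", "pop", "pop"]

accuracy = accuracy_score(y_true, y_pred)

# Macro averaging weights each class equally, regardless of its size
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro", zero_division=0
)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

Reporting precision and recall alongside accuracy helps when genres are imbalanced, since accuracy alone can look good while a small class is mostly misclassified.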
- Python 3.8+
- librosa
- TensorFlow
- Keras
- NumPy
- pandas
- scikit-learn
- Optional: CUDA (if training on an NVIDIA GPU)
To install the project and its dependencies, follow these steps:
- Clone the repository:
git clone https://github.com/COGS108/Group_Sp23_The_group_chat
- Change to the project directory:
cd Group_Sp23_The_group_chat
- Install the required dependencies:
pip install -r requirements.txt
This project can be easily adapted to other audio classification tasks, such as:
- Identifying speakers in a conversation
- Detecting emotions in speech
- Recognizing animal sounds or bird calls
To adapt the project, simply prepare your data and follow the usage instructions outlined above.
We appreciate all contributions to this project. If you would like to contribute, please open an issue or submit a pull request on the project's GitHub repository.