-
Notifications
You must be signed in to change notification settings - Fork 18
Easily access song lyrics from Genius in a tibble.
License
Unknown, MIT licenses found
Licenses found
Unknown
LICENSE
MIT
LICENSE.md
JosiahParry/genius
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
--- title: "Quickstart: geniusR" author: "Josiah Parry" date: "2/12/2018" --- # Overview This package was created to provide an easy method to access lyrics as text data using the website [Genius](genius.com). ## Installation This package must be installed from GitHub. ```{r, eval = F} devtools::install_github("josiahparry/geniusR") ``` Load the package: ```{r} library(geniusR) library(tidyverse) # For manipulation ``` # Getting Lyrics ## Whole Albums `genius_album()` allows you to download the lyrics for an entire album in a `tidy` format. There are two arguments `artists` and `album`. Supply the quoted name of artist and the album (if it gives you issues check that you have the album name and artists as specified on [Genius](https://genius.com)). This returns a tidy data frame with three columns: - `title`: track name - `track_n`: track number - `text`: lyrics ```{r} emotions_math <- genius_album(artist = "Margaret Glaspy", album = "Emotions and Math") emotions_math ``` ## Multiple Albums If you wish to download multiple albums from multiple artists, try and keep it tidy and avoid binding rows if you can. We can achieve this in a tidy workflow by creating a tibble with two columns: `artist` and `album` where each row is an artist and their album. We can then iterate over those columns with `purrr:map2()`. In this example I will extract 3 albums from Kendrick Lamar and Sara Bareilles (two of my favotire musicians). The first step is to create the tibble with artists and album titles. ```{r} albums <- tibble( artist = c( rep("Kendrick Lamar", 3), rep("Sara Bareilles", 3) ), album = c( "Section 80", "Good Kid, M.A.A.D City", "DAMN.", "The Blessed Unrest", "Kaleidoscope Heart", "Little Voice" ) ) albums ``` No we can iterate over each row using the `map2` function. This allows us to feed each value from the `artist` and `album` columns to the `genius_album()` function. Utilizing a `map` call within a `dplyr::mutate()` function creates a list column where each value is a `tibble` with the data frame from `genius_album()`. We will later unnest this. ```{r} ## We will have an additional artist column that will have to be dropped album_lyrics <- albums %>% mutate(tracks = map2(artist, album, genius_album)) album_lyrics ``` Now when you view this you will see that each value within the `tracks` column is `<tibble>`. This means that that value is infact another `tibble`. We expand this using `tidyr::unnest()`. ```{r} # Unnest the lyrics to expand lyrics <- album_lyrics %>% unnest(tracks) %>% # Expanding the lyrics arrange(desc(artist)) # Arranging by artist name head(lyrics) ``` ## Song Lyrics ### `genius_lyrics()` If you want only a single song, you can use `genius_lyrics()`. Supply an artist and a song title as character strings, and voila. ```{r} memory_street <- genius_lyrics(artist = "Margaret Glaspy", song = "Memory Street") memory_street ``` This returns a `tibble` with three columns `title`, `text`, and `line`. However, you can specifiy additional arguments to control the amount of information to be returned using the `info` argument. - `info = "title"` (default): Return the lyrics, line number, and song title. - `info = "simple"`: Return just the lyrics and line number. - `info = "artist"`: Return the lyrics, line number, and artist. - `info = "all"`: Return lyrics, line number, song title, artist. ## Tracklists `genius_tracklist()`, given an `artist` and an `album` will return a barebones `tibble` with the track title, track number, and the url to the lyrics. ```{r} genius_tracklist(artist = "Basement", album = "Colourmeinkindness") ``` ## Nitty Gritty `genius_lyrics()` generates a url to Genius which is fed to `genius_url()`, the function that does the heavy lifting of actually fetching lyrics. I have not figured out all of the patterns that are used for generating the Genius.com urls, so errors are bound to happen. If `genius_lyrics()` returns an error. Try utilizing `genius_tracklist()` and `genius_url()` together to get the song lyrics. For example, say "(No One Knows Me) Like the Piano" by _Sampha_ wasn't working in a standard `genius_lyrics()` call. ```{r, eval = FALSE} piano <- genius_lyrics("Sampha", "(No One Knows Me) Like the Piano") ``` We could grab the tracklist for the album _Process_ which the song is from. We could then isolate the url for _(No One Knows Me) Like the Piano_ and feed that into `genius_url(). ```{r} # Get the tracklist for process <- genius_tracklist("Sampha", "Process") # Filter down to find the individual song piano_info <- process %>% filter(title == "(No One Knows Me) Like the Piano") # Filter song using string detection # process %>% # filter(stringr::str_detect(title, coll("Like the piano", ignore_case = TRUE))) piano_url <- piano_info$track_url ``` Now that we have the url, feed it into `genius_url()`. ```{r} genius_url(piano_url, info = "simple") ``` ___________ # On the Internals ## Generative functions This package works almost entirely on pattern detection. The urls from _Genius_ are (mostly) easily reproducible (shout out to [Angela Li](https://twitter.com/CivicAngela) for pointing this out). The two functions that generate urls are `gen_song_url()` and `gen_album_url()`. To see how the functions work, try feeding an artist and song title to `gen_song_url()` and an artist and album title to `gen_album_url()`. ```{r} gen_song_url("Laura Marling", "Soothing") ``` ```{r} gen_album_url("Daniel Caesar", "Freudian") ``` `genius_lyrics()` calls `gen_song_url()` and feeds the output to `genius_url()` which preforms the scraping. Getting lyrics for albums is slightly more involved. It first calls `genius_tracklist()` which first calls `gen_album_url()` then using the handy package `rvest` scrapes the song titles, track numbers, and song lyric urls. Next, the song urls from the output are iterated over and fed to `genius_url()`. To make this more clear, take a look inside of `genius_album()` ```{r} genius_album <- function(artist = NULL, album = NULL, info = "simple") { # Obtain tracklist from genius_tracklist album <- genius_tracklist(artist, album) %>% # Iterate over the url to the song title mutate(lyrics = map(track_url, genius_url, info)) %>% # Unnest the tibble with lyrics unnest(lyrics) %>% # Deselect the track url select(-track_url) return(album) } ``` ### Notes: As this is my first _"package"_ there will be many issues. Please submit an issue and I will do my best to attend to it. There are already issues of which I am present (the lack of error handling). If you would like to take those on, please go ahead and make a pull request. Please contact me on [Twitter](twitter.com/josiahparry).
About
Easily access song lyrics from Genius in a tibble.
Topics
Resources
License
Unknown, MIT licenses found
Licenses found
Unknown
LICENSE
MIT
LICENSE.md
Code of conduct
Stars
Watchers
Forks
Packages 0
No packages published