THE ROLLER COASTER STORY
Goal
Build an article about the new roller coaster that is set to surpass all others in the world in the three main categories: speed, length, and height.
This story was created as a coursework project at Columbia Journalism School, with a primary focus on the workflow: how to develop a data-driven idea, find and clean the data, visualize it, and build a webpage within a limited timeframe. Since this process was new to me, I chose a topic with straightforward data that still holds journalistic significance. I wanted to ensure I had enough time for each step, as managing a data-driven workflow is a bit different from writing a traditional text-only story.
Tech stack used:
HTML / CSS / Numbers / Datawrapper
Process summary
- Getting the data
I found the database through the source links on the roller coaster Wikipedia page. It turned out that ChatGPT could easily convert screenshots from the Roller Coaster DataBase into CSV files, eliminating the need for scraping. Once I had the CSV files, I cleaned the data in Numbers to fit my needs. With this kind of simple data, using Numbers was the most convenient approach.
The Roller Coaster DataBase: https://rcdb.com/rhr.htm
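I did the cleaning by hand in Numbers, but the same steps could probably be scripted. Below is a minimal Python sketch of that idea, not what I actually ran. The column names (`name`, `height_ft`) and the sample rows are made up for illustration; the real CSV files may look different.

```python
import csv
import io


def clean_rows(csv_text, value_column):
    """Parse CSV text, tidy the cells, and sort by one numeric column.

    Rows whose value_column cannot be read as a number are dropped.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    cleaned = []
    for row in reader:
        # Strip stray whitespace from headers and cells; ignore overflow cells.
        row = {
            key.strip(): (value or "").strip()
            for key, value in row.items()
            if key is not None
        }
        # Remove thousands separators such as "1,200" before converting.
        raw = row.get(value_column, "").replace(",", "")
        try:
            row[value_column] = float(raw)
        except ValueError:
            continue  # skip rows like "n/a" or empty cells
        cleaned.append(row)
    # Largest value first, which is the order a ranking bar chart needs.
    cleaned.sort(key=lambda r: r[value_column], reverse=True)
    return cleaned


if __name__ == "__main__":
    # Hypothetical sample data, only to show the shape of the input.
    sample = "name, height_ft\nExample Coaster A, 100\nExample Coaster B, n/a\n"
    for row in clean_rows(sample, "height_ft"):
        print(row["name"], row["height_ft"])
```

For three small tables, a spreadsheet was still faster; a script like this would mainly pay off if the data had to be refreshed regularly.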
- Visualizing the data
I used Datawrapper to create the charts. I experimented with different approaches, but in the end I decided that simplicity is best. I used classic bar charts for all three visualizations, giving the article a clean and consistent look.
Here are the links for the charts:
Height: https://datawrapper.dwcdn.net/GVy0S/10/
Length: https://datawrapper.dwcdn.net/hHof7/5/
Speed: https://datawrapper.dwcdn.net/hka2l/6/
- Building the webpage
I started by mixing two different templates, found a great photo on www.pexels.com, and finally embedded my Datawrapper charts into the page.
Code in this GitHub repo: https://github.com/rosakettumaki/roller_coaster_story/blob/main/index.html
Link to the page: https://rosakettumaki.github.io/roller_coaster_story/
What did I learn? How could this be improved?
This is just the beginning of my journey as a data journalist, and this project was an incredibly valuable learning experience. I didn’t face any major issues, but there were plenty of small, frustrating challenges, especially with styling and setting up a properly functioning GitHub repository.
With more time, I’d like to create more innovative charts and possibly include additional information about the manufacturers and costs of these roller coasters.
On a more technical level, with some adjustments to the framing of the text, an auto-updating scraper could be developed to pull real-time data from the Roller Coaster Database.
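As a starting point for that idea, here is a minimal sketch of the parsing half of such a scraper, using only Python's standard library. It assumes the records are published as a plain HTML table; I have not checked the actual markup of the Roller Coaster DataBase pages, and the sample HTML below is invented. Fetching the page and scheduling the updates are left out.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text inside nested tags (for example links) also lands here.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_table(html_text):
    """Return all table rows in html_text as lists of cell strings."""
    parser = TableRowParser()
    parser.feed(html_text)
    parser.close()
    return parser.rows


if __name__ == "__main__":
    # Invented sample markup; the real page structure is an assumption.
    sample = (
        "<table>"
        "<tr><th>Rank</th><th>Name</th><th>Height</th></tr>"
        "<tr><td>1</td><td><a href='#'>Example Coaster A</a></td><td>100 m</td></tr>"
        "</table>"
    )
    for row in parse_table(sample):
        print(row)
```

The parsed rows could then be written to CSV and fed into the same Datawrapper charts.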