Code for cleaning data for my project at Digital Academy: Data
The main topic of my student project is how COVID-19 affected the distribution of medication to the Czech Republic. I compared data from January 2019 to March 2020. To answer this question as precisely as possible, I needed the amount of active substance per package and the package size for every record in the database. Since every pharmaceutical company labels that information differently, I wrote this code to parse it from thousands of different kinds of labeling.
Unfortunately, some records don't contain all the information I needed, so those values could not be parsed.
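
To illustrate the kind of parsing described above, here is a minimal sketch in Python. It is not the code from this repository: the label format (for example `500MG TBL FLM 30`), the list of units, and the names `parse_label` and `ParsedLabel` are assumptions made for the example. It pulls the amount of active substance and the package size out of a label string and returns `None` for any part that is missing.

```python
import re
from typing import NamedTuple, Optional


class ParsedLabel(NamedTuple):
    """Hypothetical result type: any field may be None if the label lacks it."""
    strength: Optional[float]    # amount of active substance
    unit: Optional[str]          # unit of the amount, e.g. "MG"
    package_size: Optional[int]  # number of units in the package


# Assumed format: a number (decimal comma or point allowed) followed by a unit.
STRENGTH_RE = re.compile(r"(\d+(?:[.,]\d+)?)\s*(MG|G|MCG|IU)\b", re.IGNORECASE)
# Assumed format: the package size is a bare integer at the end of the label.
PACKAGE_RE = re.compile(r"\b(\d+)\s*$")


def parse_label(label: str) -> ParsedLabel:
    """Extract strength, unit and package size from one label string."""
    strength_match = STRENGTH_RE.search(label)
    package_match = PACKAGE_RE.search(label)

    strength = unit = None
    if strength_match:
        # Normalise a decimal comma ("2,5") to a decimal point before converting.
        strength = float(strength_match.group(1).replace(",", "."))
        unit = strength_match.group(2).upper()

    package_size = int(package_match.group(1)) if package_match else None
    return ParsedLabel(strength, unit, package_size)


if __name__ == "__main__":
    for example in ["500MG TBL FLM 30", "2,5MG TBL NOB 100", "TBL FLM 30"]:
        print(example, "->", parse_label(example))
```

Real labels vary far more than these two patterns cover, so in practice a parser like this would need many more rules, plus a way to flag the records it could not handle.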