🔥 Predictor of the political bias of German texts 🇩🇪

axenov/politik-news

Political Bias Classification of German Media

This project is the first attempt at classifying the political bias of German news.

We crawled data from various German news sites using the news-please library, then manually cleaned it and labeled it using Medienkompass. The data is organised as a HuggingFace nlp library dataset.

Due to copyright issues we cannot publish the data, but we provide a list of URLs that you can use to build the dataset on your own. To download all the data, run:

from newsplease import NewsPlease
articles = NewsPlease.from_file('data/urls.txt')
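The download step can be sketched a little more fully. This is only a minimal illustration, assuming `data/urls.txt` holds one URL per line. The helper names (`clean_urls`, `download_articles`), the output folder and the one-JSON-file-per-article layout are hypothetical and not part of this repository.

```python
import json
from pathlib import Path


def clean_urls(lines):
    """Strip whitespace, drop empty lines and duplicates, keep the original order."""
    seen = set()
    urls = []
    for line in lines:
        url = line.strip()
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


def download_articles(url_file, out_dir):
    """Download all articles and write one JSON file per URL (assumed layout)."""
    # Imported lazily so the helper above works without news-please installed.
    from newsplease import NewsPlease

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    urls = clean_urls(Path(url_file).read_text(encoding="utf-8").splitlines())
    articles = NewsPlease.from_urls(urls)  # maps each URL to an extracted article
    for i, (url, article) in enumerate(articles.items()):
        # Failed downloads may not carry these attributes, hence getattr.
        record = {
            "url": url,
            "title": getattr(article, "title", None),
            "text": getattr(article, "maintext", None),
        }
        path = out / f"{i:06d}.json"
        path.write_text(json.dumps(record, ensure_ascii=False), encoding="utf-8")
```

Calling `download_articles('data/urls.txt', 'data/raw')` would then fill the (assumed) `data/raw` folder. Articles with an empty `text` field are worth filtering out before preprocessing.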

Then run (under development):

python preprocess.py -data_folder='path/to/your/downloaded/data'

Our system fine-tunes German BERT from the HuggingFace Transformers library as its pre-trained model.
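As a rough illustration of what loading German BERT for classification can look like, here is a minimal sketch. It assumes the `bert-base-german-cased` checkpoint from the HuggingFace hub, which may differ from the checkpoint used in `train.py`. The function names and the label list in the usage note are placeholders, not the labels derived from Medienkompass.

```python
def make_label_maps(labels):
    """Build the id-to-label and label-to-id dictionaries for a classification head."""
    id2label = dict(enumerate(labels))
    label2id = {label: idx for idx, label in id2label.items()}
    return id2label, label2id


def load_german_bert(labels, checkpoint="bert-base-german-cased"):
    """Load tokenizer and model with a fresh classification head (assumed checkpoint)."""
    # Imported lazily; requires the transformers and torch packages.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    id2label, label2id = make_label_maps(labels)
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(
        checkpoint,
        num_labels=len(labels),
        id2label=id2label,
        label2id=label2id,
    )
    return tokenizer, model
```

For example, `tokenizer, model = load_german_bert(["left", "center", "right"])` would give a model with three output classes. The three label names here are purely illustrative.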

To train the model, run:

python train.py -data_folder="data" -model_folder="model" -batch_size=8 -num_epochs=2

To test the model, run:

python test.py -data_folder="data" -model_folder="model"

t-SNE on SVD of BOW representation of the dataset
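A projection of this kind could be computed roughly as follows. This is a sketch using scikit-learn, not necessarily the code that produced the figure. The function name `embed_2d` and the defaults for `svd_dim` and `perplexity` are assumptions.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.manifold import TSNE


def embed_2d(texts, svd_dim=50, perplexity=30.0, seed=0):
    """Project texts to 2-D: bag-of-words counts, then truncated SVD, then t-SNE."""
    bow = CountVectorizer().fit_transform(texts)  # sparse document-term matrix
    n_samples, n_features = bow.shape
    # Keep the SVD rank below the vocabulary size (relevant only for tiny corpora).
    svd = TruncatedSVD(n_components=min(svd_dim, n_features - 1), random_state=seed)
    reduced = svd.fit_transform(bow)
    # t-SNE requires the perplexity to be smaller than the number of samples.
    tsne = TSNE(
        n_components=2,
        perplexity=min(perplexity, n_samples - 1),
        init="random",
        random_state=seed,
    )
    return tsne.fit_transform(reduced)
```

The returned array has one row per text and two columns, so it can be passed straight to a scatter plot and coloured by source or label.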



Effect of Covid-19 on German news



The web demo will be released soon.
