File handler of 立委投票指南
postgresql-9.5
pandas >= 0.18 (sudo apt-get install python-pandas)
(1) Create the database
(2) common/db_settings.py: set the database config yourself
(3) Update the git submodules
$ git submodule init
$ git submodule update
(4) Create/update the legislator data
$ python -m legislator.legislator
At the beginning of ad=9, the source did not provide legislator uids, so we maintain them ourselves for temporary use.
$ python -m legislator.legislator_uid_by_ourself
$ mv merged_uid_by_ourself.json to_where_you_want
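For step (2), the layout of common/db_settings.py is not shown in this README. As a minimal sketch only, assuming the file just exposes PostgreSQL connection parameters, the settings could be read from the environment and turned into a libpq-style connection string. The names `DB_SETTINGS`, `build_dsn`, and the `VOTE_DB_*` environment variables are hypothetical and not taken from the repository.

```python
import os

# Hypothetical layout; the real common/db_settings.py may look different.
DB_SETTINGS = {
    "dbname": os.environ.get("VOTE_DB_NAME", "your_db_name"),
    "user": os.environ.get("VOTE_DB_USER", "your_db_user"),
    "password": os.environ.get("VOTE_DB_PASSWORD", ""),
    "host": os.environ.get("VOTE_DB_HOST", "localhost"),
    "port": os.environ.get("VOTE_DB_PORT", "5432"),
}


def _quote(value):
    """Quote a value following libpq rules for 'key=value' connection strings."""
    value = str(value)
    if value == "" or any(c in value for c in " '\\"):
        # Empty values and values with spaces, quotes, or backslashes
        # must be single-quoted, with ' and \ escaped by a backslash.
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"
    return value


def build_dsn(settings):
    """Return a 'key=value key=value' connection string, keys sorted."""
    return " ".join(
        "{}={}".format(key, _quote(settings[key])) for key in sorted(settings)
    )
```

The resulting string is the kind of value a PostgreSQL driver accepts as a connection string; whether the project's modules consume it in that form is an assumption.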
Pass the ad (屆期, the legislative term) to the crawler. If the output file already exists, remove it manually first. For example, with ad=9:
bill/crawler$ rm -f bills_9.json
bill/crawler$ scrapy crawl lis_by_ad -a ad=9 -o bills_9.json -t json -s FEED_EXPORT_ENCODING=utf-8
$ python -m bill.parser_lis '{"ad": 9}'
$ python -m bill.law
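The reason for removing the old file is that Scrapy's `-o` option appends to an existing output file, which for JSON leaves an invalid document. The remove-then-crawl steps above could be wrapped roughly as follows. This is a sketch, not part of the repository: `crawl_command` and `run_crawl` are hypothetical helper names, and the `bill/crawler` working directory is taken from the prompt shown in the commands above.

```python
import os
import subprocess


def crawl_command(ad):
    """Build the output file name and scrapy command for a given term (ad)."""
    output = "bills_{}.json".format(ad)
    cmd = [
        "scrapy", "crawl", "lis_by_ad",
        "-a", "ad={}".format(ad),
        "-o", output,
        "-t", "json",
        "-s", "FEED_EXPORT_ENCODING=utf-8",
    ]
    return output, cmd


def run_crawl(ad, cwd="bill/crawler"):
    """Delete a stale output file, then run the crawler inside cwd."""
    output, cmd = crawl_command(ad)
    path = os.path.join(cwd, output)
    if os.path.exists(path):
        os.remove(path)  # same effect as `rm -f bills_<ad>.json`
    subprocess.check_call(cmd, cwd=cwd)
```

Calling `run_crawl(9)` from the repository root would correspond to the two `bill/crawler$` commands above.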
The 9 in vote_9 is the Legislative Yuan term (ad).
vote$ rm minutes.json
vote$ scrapy runspider meeting_minutes_crawler.py -o minutes.json -s FEED_EXPORT_ENCODING=utf-8
$ python -m vote.vote_9
$ python -m vote.vote_8
$ python -m vote.vote_7
$ python -m vote.vote_6
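Since the vote modules differ only in the term number, the four commands above can be generated in a loop. The sketch below keeps the order in which they are listed; whether that order matters is not stated here. `VOTE_TERMS`, `vote_commands`, and `run_all` are hypothetical names, and `sys.executable` is used in place of `python` so the same interpreter runs each module.

```python
import subprocess
import sys

# Terms in the same order as the commands listed above.
VOTE_TERMS = (9, 8, 7, 6)


def vote_commands(terms=VOTE_TERMS):
    """Return one `python -m vote.vote_<ad>` command per term."""
    return [[sys.executable, "-m", "vote.vote_{}".format(ad)] for ad in terms]


def run_all(commands):
    """Run the commands in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.check_call(cmd)
```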
The 8 in candidates_8 is the Legislative Yuan term (ad). candidates_9 must be executed after cec_api because it needs each candidate's drawno.
$ python -m candidates.candidates_8
$ python -m candidates.cec_api
$ python -m candidates.candidates_cross_with_councilor
$ python -m candidates.candidates_9
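The ordering constraint between cec_api and candidates_9 is easy to break when the steps are scripted. One way to guard against that, as a sketch only, is to check the step list before running anything. `CANDIDATE_STEPS` mirrors the commands above; `check_order` is a hypothetical helper, not something in the repository.

```python
# Module names in the order listed above.
CANDIDATE_STEPS = [
    "candidates.candidates_8",
    "candidates.cec_api",
    "candidates.candidates_cross_with_councilor",
    "candidates.candidates_9",
]


def check_order(steps):
    """Raise ValueError if candidates_9 is not preceded by cec_api."""
    target = "candidates.candidates_9"
    required = "candidates.cec_api"
    if target in steps and required not in steps[:steps.index(target)]:
        raise ValueError("{} must run after {}".format(target, required))
    return steps
```

Each validated entry can then be run with `python -m <module>`, as in the commands above.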
candidates/political_contribution$ python political_contribution.py
Only for ad=8; for ad>8, use the CEC API for platform data.
legislator/platform$ python platform.py
CC0 1.0 Universal
This work is published from Taiwan.