This repository has been archived by the owner on Oct 10, 2024. It is now read-only.

An internal client library to access the new Mediacloud news archive search.


mediacloud/mediacloud-news-client


Wayback Machine News Archive Client

🚧 under construction 🚧

A simple client library to access the Wayback Machine news archive search.

Installation

pip install wayback-news-search

Basic Usage

Counting matching stories:

from waybacknews.searchapi import SearchApiClient
import datetime as dt

api = SearchApiClient("mediacloud")
api.count("coronavirus", dt.datetime(2022, 3, 1), dt.datetime(2022, 4, 1))

Paging over all matching results:

from waybacknews.searchapi import SearchApiClient
import datetime as dt

api = SearchApiClient("mediacloud")
for page in api.all_articles("coronavirus", dt.datetime(2022, 3, 1), dt.datetime(2022, 4, 1)):
    # do_something is a placeholder; replace it with your own handler for each page of results
    do_something(page)
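
The loop above hands you one page at a time. If a single flat list is more convenient, the pages can be flattened. The sketch below assumes each page is a list of article dicts, which this README does not confirm; the helper name `collect_articles` and the `title` key in the sample data are hypothetical and not part of the library.

```python
from itertools import islice
from typing import Any, Dict, Iterable, List, Optional


def collect_articles(pages: Iterable[List[Dict[str, Any]]],
                     limit: Optional[int] = None) -> List[Dict[str, Any]]:
    """Flatten an iterable of pages into one list, optionally capped at `limit`."""
    # Walk the pages lazily so that no further pages are consumed
    # once `limit` articles have been collected.
    articles = (article for page in pages for article in page)
    return list(islice(articles, limit))


# Stand-in pages for illustration only (assumed shape: list of dicts per page).
# In real use, pass the generator returned by api.all_articles(...) instead.
fake_pages = [[{"title": "a"}, {"title": "b"}], [{"title": "c"}]]
first_two = collect_articles(fake_pages, limit=2)
```

Capping with `limit` may be useful for large result sets, since a broad query over a long date range could otherwise hold every matching article in memory.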

Dev Installation

Install the dependencies for dev: pip install -e .[dev]

Distribution

  1. Run pytest to make sure all the tests pass
  2. Update the version number in waybacknews/__init__.py
  3. Make a brief note in the version history section below about the changes
  4. Commit the changes
  5. Tag the commit with a semantic version number - 'v*.*.*'
  6. Push the repo to GitHub
  7. Run python setup.py sdist to create an installation package
  8. Run twine upload --repository-url https://test.pypi.org/legacy/ dist/* to upload it to PyPI's test platform
  9. Run twine upload dist/* to upload it to PyPI
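
Steps 2 and 5 above only line up if the git tag and the package version agree. A minimal sketch of a pre-release check follows, assuming the version is stored as a plain `X.Y.Z` string and the tag is that string prefixed with `v`. The function name `tag_matches_version` is hypothetical and not part of this repository.

```python
import re

# Assumed tag shape: 'v' followed by three dot-separated integers, e.g. v1.0.3
_TAG_PATTERN = re.compile(r"v\d+\.\d+\.\d+")


def tag_matches_version(tag: str, version: str) -> bool:
    """Return True when `tag` is a well-formed 'vX.Y.Z' tag naming `version`."""
    # Reject tags that are not exactly 'v' plus three numeric components.
    if _TAG_PATTERN.fullmatch(tag) is None:
        return False
    # Strip the leading 'v' and compare against the package version string.
    return tag[1:] == version
```

In practice `version` would be whatever string is set in waybacknews/__init__.py, and `tag` the name you are about to pass to `git tag`.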

Version History

  • v1.0.3 - add 30 sec timeout, remove extra params mcproviders library might be adding
  • v1.0.2 - fix to article endpoint
  • v1.0.1 - automatically escape '/' in query strings, test case for url field search
  • v1.0.0 - update to public API endpoint
  • v0.1.5 - simpler return for top terms
  • v0.1.4 - better error handling
  • v0.1.3 - allow overriding base api URL
  • v0.1.2 - fix article endpoint, test case for fetching content (snippet) via article_url property
  • v0.1.1 - more consistent method names
  • v0.1.0 - initial test-only release
