    Repositories list

    • es-tools

      Public
      Elasticsearch tools developed by the Media Cloud project
      Python
      Apache License 2.0
      1000 Updated Dec 26, 2024
    • Code that drives the public web-based tools for the Media Cloud Online News Archive and Directory.
      JavaScript
      Apache License 2.0
      1510480 Updated Dec 23, 2024
    • devops tools
      Python
      Apache License 2.0
      1010 Updated Dec 22, 2024
    • The core pipeline used to ingest online news stories in the Media Cloud archive.
      Python
      Apache License 2.0
      53424 Updated Dec 22, 2024
    • How Media Cloud approaches extracting metadata from online news stories
      Python
      Apache License 2.0
      51250 Updated Dec 22, 2024
    • UNDER CONSTRUCTION - A package containing a library of issue validators in a flexibly deployable wrapper.
      Jupyter Notebook
      1060 Updated Dec 16, 2024
    • Public client for consuming content from the Media Cloud Online News Archive & Directory.
      Python
      Apache License 2.0
      297231 Updated Dec 10, 2024
    • Intelligently fetch lists of URLs from a large collection of RSS Feeds as part of the Media Cloud Directory.
      Python
      Apache License 2.0
      66111 Updated Dec 5, 2024
    • Internal library to allow querying multiple media platforms with a consistent API.
      Python
      2150 Updated Nov 4, 2024
    • Internal API server that offers search access to the Media Cloud Online News Archive (in Elasticsearch).
      Python
      GNU Affero General Public License v3.0
      31100 Updated Oct 25, 2024
    • sous-chef

      Public
      Configurable Data Analytics Pipeline
      Python
      0190 Updated Oct 21, 2024
    • An internal client library to access the new Media Cloud news archive search.
      Python
      Apache License 2.0
      2031 Updated Oct 10, 2024
    • Find RSS, Atom, XML, and RDF feeds on webpages
      Python
      MIT License
      133041 Updated Oct 10, 2024
    • Simple toolkit for consuming sitemaps
      Python
      Apache License 2.0
      1420 Updated Oct 9, 2024
    • mc-manage

      Public
      Python
      0000 Updated Oct 8, 2024
    • sc-buffet

      Public
      Sous-chef buffet - Self-service data access for sous-chef.
      Python
      0050 Updated Oct 3, 2024
    • Daily performance metrics for the mediacloud application
      Python
      0010 Updated Sep 20, 2024
    • A Python client for the CLIFF geoparsing tool
      Python
      MIT License
      5501 Updated May 21, 2024
    • A client library to access the Wayback Machine news archive search.
      Python
      Apache License 2.0
      2410 Updated Dec 15, 2023
    • web-tools

      Public archive
      The shared repository for Media Cloud web apps (Explorer, Source Manager, Topic Mapper)
      JavaScript
      Apache License 2.0
      3064314 Updated Dec 14, 2023
    • A set of Jupyter notebooks demonstrating how to use the Media Cloud API.
      Jupyter Notebook
      143500 Updated Dec 13, 2023
    • backend

      Public archive
      Media Cloud is an open source, open data platform that allows researchers to answer quantitative questions about the content of online media.
      Python
      GNU Affero General Public License v3.0
      8728213125 Updated Nov 20, 2023
    • Dokku app that serves a static HTML catch-all page, displayed for bad domains
      HTML
      0000 Updated Oct 25, 2023
    • A simple homepage for the CLIFF project
      HTML
      MIT License
      1100 Updated May 30, 2023
    • Tag news stories based on models trained on the NYT corpus.
      Python
      Apache License 2.0
      134216 Updated Mar 1, 2023
    • Text to sentence splitter using heuristic algorithm by Philipp Koehn and Josh Schroeder.
      Python
      Other
      2923322 Updated Nov 7, 2022
    • glimpse

      Public archive
      Get a glimpse of attention to a topic on social media.
      Python
      Apache License 2.0
      2280 Updated Sep 19, 2022
    • Helpful micro-service to return results from word2vec models
      Python
      MIT License
      4200 Updated Jul 29, 2022
    • cliff-annotator

      Public archive
      A lightweight server to allow HTTP requests to the Stanford Named Entity Recognizer and a heavily modified CLAVIN geoparser.
      Java
      Apache License 2.0
      35119910 Updated May 20, 2022
    • Notebook demonstrating how to create and update a Media Cloud collection.
      Jupyter Notebook
      0000 Updated Mar 30, 2022