A daemon to build and keep fedmsg statistics.
Motivation: we have a neat service called datagrepper with which you can query the history of the fedmsg bus. It is cool, but insufficient for some more advanced reporting and analysis that we would like to do. Take, for example, the releng-dash. In order to render the page, it has to make a dozen or more requests to datagrepper to try to find the 'latest' events in large, awkward pages of results. Consequently, it takes a long time to load.
Enter statscache. It is a plugin to the fedmsg-hub that sits listening in our infrastructure. When new messages arrive, it passes them off to plugins that calculate and store various statistics. If we want a new kind of statistic to be kept, we write a new plugin for it. It comes with a tiny flask frontend, much like datagrepper, that allows you to query for this or that stat in this or that format (csv, json, maybe html or svg too, but that might be overkill). The idea is that we can then build neater, smarter frontends that render fedmsg-based activity very quickly, and perhaps later drill down into the details kept in datagrepper.
It is kind of like a data mart.
Create a virtualenv, then git clone this repo and the statscache_plugins repo. Run `python setup.py develop` in the statscache directory first, and then again in statscache_plugins.
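For reference, the whole sequence looks something like this (assuming the two repositories live under the fedora-infra organization on GitHub; adjust the URLs if your copies live elsewhere):

```console
$ virtualenv statscache-env && source statscache-env/bin/activate
$ git clone https://github.com/fedora-infra/statscache.git
$ git clone https://github.com/fedora-infra/statscache_plugins.git
$ cd statscache && python setup.py develop && cd ..
$ cd statscache_plugins && python setup.py develop && cd ..
```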
In the main statscache repo directory, run `fedmsg-hub` to start the daemon. You should see lots of fun stats being logged to stdout. To launch the web interface (which currently only serves JSON and CSV responses), run `python statscache/app.py` in the same directory. You can now view a list of the available plugins in JSON by visiting localhost:5000/api/, and you can retrieve the statistics recorded by a given plugin by appending its identifier to that same URL.
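For example, you can poke at the API from Python along these lines (a sketch; the plugin identifier below is a placeholder, and the exact response shape depends on the plugin):

```python
import requests

BASE = 'http://localhost:5000/api/'

# List the available plugins (served as JSON).
print(requests.get(BASE).json())

# Retrieve a given plugin's statistics by appending its identifier to
# the same URL. 'some.plugin.identifier' is a placeholder; use one of
# the identifiers from the listing above.
print(requests.get(BASE + 'some.plugin.identifier').json())
```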
You can run the tests with `python setup.py test`.
When a message arrives, a fedmsg consumer receives it and hands a copy to each loaded plugin for processing. Each plugin internally caches the results of this message processing until a polling producer instructs it to update its database model and empty its cache. The frequency at which the polling producer does so is configurable at the application level and is set to one second by default.
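In code, that cache-then-flush pattern looks roughly like the sketch below. The class, method names, and signatures here are assumptions for illustration, not the literal statscache plugin interface:

```python
from collections import Counter

class MessageCountPlugin(object):
    """ Toy plugin: counts fedmsg messages per topic.

    NOTE: 'process' and 'update' are illustrative names, not the
    literal statscache plugin API.
    """

    def __init__(self):
        # Results are cached in memory between flushes.
        self._cache = Counter()

    def process(self, message):
        # Called by the fedmsg consumer for each incoming message.
        self._cache[message.get('topic', 'unknown')] += 1

    def update(self, store):
        # Called by the polling producer (every second by default):
        # persist the cached results, then empty the cache.
        for topic, count in self._cache.items():
            store[topic] = store.get(topic, 0) + count
        self._cache.clear()

# Minimal usage, with a dict standing in for the database model:
plugin = MessageCountPlugin()
plugin.process({'topic': 'org.fedoraproject.prod.buildsys.build.state.change'})
db = {}
plugin.update(db)
print(db)
```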
There are base sqlalchemy models that each of the plugins should use to store their results (and we can add more types of base models as we discover new needs). But the important thing to know about the base models is that they are responsible for knowing how to serialize themselves to different formats for the REST API (e.g., `.to_csv()` and `.to_json()`).
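A model in that spirit might look roughly like this sketch (the table and column names are invented for the example; the `.to_csv()`/`.to_json()` responsibility is the part that comes from the actual design):

```python
import datetime
import json

from sqlalchemy import Column, DateTime, Integer, String
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class TopicCountRecord(Base):
    """ Illustrative model: one row of per-topic message counts. """
    __tablename__ = 'topic_count_records'

    id = Column(Integer, primary_key=True)
    timestamp = Column(DateTime, nullable=False)
    topic = Column(String, nullable=False)
    count = Column(Integer, nullable=False)

    def to_json(self):
        # Serialize this row for the REST API's JSON responses.
        return json.dumps({
            'timestamp': self.timestamp.isoformat(),
            'topic': self.topic,
            'count': self.count,
        })

    def to_csv(self):
        # Serialize this row as a single CSV line.
        return '%s,%s,%d' % (self.timestamp.isoformat(), self.topic, self.count)

# Example (no database session is needed just to serialize):
record = TopicCountRecord(
    timestamp=datetime.datetime.utcnow(),
    topic='buildsys.build.state.change', count=3)
print(record.to_json())
print(record.to_csv())
```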
Even though statscache is intended to be a long-running service, the occasional reboot is inevitable. However, gaps in the processed history of fedmsg data may lead some plugins to produce inaccurate statistics. Luckily, statscache comes with a built-in mechanism to handle this transparently: on start-up, it checks the timestamp of each plugin's most recent database update and queries datagrepper for the fedmsg data needed to fill in any gap. If a plugin specifically does not need a continuous view of the fedmsg history, it may instead declare a "backlog delta," the maximum backlog of fedmsg data that would be useful to it.
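Sketched in code, the start-up catch-up decision might look like this (the attribute and function names are assumptions for illustration; only the behavior, resuming from the last database update but never reaching back further than a declared backlog delta, comes from the description above):

```python
import datetime

class FakePlugin(object):
    """ Stand-in plugin that only wants an hour of backlog (illustrative). """
    backlog_delta = datetime.timedelta(hours=1)

    def latest_update(self):
        # Timestamp of this plugin's most recent database update.
        return datetime.datetime.utcnow() - datetime.timedelta(days=2)

def catchup_window(plugin, now=None):
    """ Decide how far back to query datagrepper for a plugin on start-up. """
    now = now or datetime.datetime.utcnow()
    start = plugin.latest_update()
    # A declared backlog delta caps how much history is worth fetching.
    delta = getattr(plugin, 'backlog_delta', None)
    if delta is not None:
        start = max(start, now - delta)
    return start, now

print(catchup_window(FakePlugin()))
```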