PyMR

PyMR is a small Python wrapper for MapReduce programming in Hadoop streaming mode. It emulates many operations of the Java MR framework. To get code running in streaming mode, simply override the map() or reduce() method.
https://github.com/kn45/PyMR

Features

There is no need to handle input reading or grouping by key; PyMR takes care of both. Just override the following methods:

Mapper:
_setup(self), _map(self, inl), _cleanup(self)
Reducer:
_setup(self), _reduce(self, key, values), _cleanup(self)
self.delim, self.key_fields
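
The hooks above can be pictured with a minimal, self-contained sketch. This is not PyMR's actual code: the `Mapper` and `Reducer` base classes below are hypothetical stand-ins written for illustration, the `run()` method and the word-count subclasses are assumptions, and `key_fields` is assumed to mean "the number of leading fields that form the key". Only the hook names and the `delim` / `key_fields` attributes come from this README.

```python
import sys
from itertools import groupby


class Mapper:
    """Hypothetical base class (assumption, not PyMR's implementation)."""

    def __init__(self, delim="\t"):
        self.delim = delim

    def _setup(self):
        pass  # called once before the first input line

    def _map(self, inl):
        raise NotImplementedError  # called once per input line

    def _cleanup(self):
        pass  # called once after the last input line

    def run(self, stream=None):
        stream = sys.stdin if stream is None else stream
        self._setup()
        for line in stream:
            self._map(line.rstrip("\n"))
        self._cleanup()


class Reducer:
    """Hypothetical base class that groups sorted input by key."""

    def __init__(self, delim="\t", key_fields=1):
        self.delim = delim
        self.key_fields = key_fields  # assumed: number of leading key fields

    def _setup(self):
        pass

    def _reduce(self, key, values):
        raise NotImplementedError  # called once per distinct key

    def _cleanup(self):
        pass

    def _split(self, line):
        fields = line.rstrip("\n").split(self.delim)
        key = self.delim.join(fields[:self.key_fields])
        value = self.delim.join(fields[self.key_fields:])
        return key, value

    def run(self, stream=None):
        stream = sys.stdin if stream is None else stream
        self._setup()
        pairs = (self._split(line) for line in stream)
        # groupby only merges adjacent equal keys, so input must be sorted.
        for key, group in groupby(pairs, key=lambda kv: kv[0]):
            self._reduce(key, (value for _, value in group))
        self._cleanup()


class WordCountMapper(Mapper):
    def _map(self, inl):
        for word in inl.split():
            print(word + self.delim + "1")


class WordCountReducer(Reducer):
    def _reduce(self, key, values):
        print(key + self.delim + str(sum(int(v) for v in values)))
```

In this sketch, calling `WordCountMapper().run()` or `WordCountReducer().run()` with no argument reads from standard input, which is how a streaming job would feed the script.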

Example

cat test_in | python test.py m | sort | python test.py r
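
In the pipeline above, `sort` stands in for Hadoop's shuffle phase: it places identical keys on adjacent lines so the reducer can group them. The following sketch emulates the same map, sort, reduce flow in a single process. It does not use `test.py` or `pymr.py`; the function names `map_words`, `reduce_counts`, and `run_local` are hypothetical, and word counting is chosen only as a familiar example.

```python
from itertools import groupby


def map_words(line):
    """Map step: emit one (word, 1) pair per word in the line."""
    return [(word, 1) for word in line.split()]


def reduce_counts(key, values):
    """Reduce step: sum all counts seen for one key."""
    return key, sum(values)


def run_local(lines):
    """Emulate `map | sort | reduce` in memory."""
    mapped = [pair for line in lines for pair in map_words(line)]
    # Sorting by key plays the role of `sort` in the shell pipeline.
    shuffled = sorted(mapped, key=lambda kv: kv[0])
    return [
        reduce_counts(key, [value for _, value in group])
        for key, group in groupby(shuffled, key=lambda kv: kv[0])
    ]
```

Without the sorting step, `groupby` would treat non-adjacent occurrences of the same key as separate groups, which is why the shell example pipes through `sort` before the reduce stage.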

Todo

Lots. Add more functions, such as counter, env_getter, etc.
