hide path file storage from user/OS #259
Labels: Contribution opportunity, enhancement
Purpose of use
Hide storage of path files from operating system and from user.
Is your feature request related to a problem? Please describe.
#253 introduces path file saving during assignment. For each traffic_class, iteration, and origin it writes two feather files: one holds the list of paths from that origin to all destinations, and the other holds an index into that list so that we can retrieve the path for a specific destination. This produces a very large number of files on the file system. The following order of magnitude is pretty common: 2 files * 3 classes * 200 iterations * 5000 zones = 6 million files, which makes results hard to share and slow to delete.
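To make the "long list plus index" layout concrete, here is a minimal sketch in plain Python. The exact layout used in #253 is not spelled out here, so the cumulative end offsets, the function names, and the example node ids are all assumptions for illustration only.

```python
from typing import List, Tuple


def build_path_storage(paths_by_destination: List[List[int]]) -> Tuple[List[int], List[int]]:
    """Concatenate variable-length paths into one flat list plus an offsets index.

    Assumption: index[d] holds the position just past the end of the path to
    destination d, so the path to d is flat[index[d - 1]:index[d]].
    """
    flat: List[int] = []
    index: List[int] = []
    for path in paths_by_destination:
        flat.extend(path)
        index.append(len(flat))
    return flat, index


def get_path(flat: List[int], index: List[int], destination: int) -> List[int]:
    """Retrieve the path to a single destination from the flat list."""
    start = 0 if destination == 0 else index[destination - 1]
    return flat[start:index[destination]]


# Hypothetical paths from one origin to destinations 0..3 (destination 0 is unreachable).
paths = [[], [10, 11], [10, 12, 13], [14]]
flat, index = build_path_storage(paths)
path_to_2 = get_path(flat, index, 2)
```

Both `flat` and `index` are plain one-dimensional columns, which is what makes them a natural fit for a columnar format, at the cost of needing two objects per origin.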
Describe the solution you'd like
Pedro and I discussed using a single SQLite database with blob columns into which we dump the feather bytes, so that all path data lives in one file. See e.g. https://stackoverflow.com/questions/50761777/convert-pandas-dataframe-to-from-in-memory-feather/58980160#58980160 and http://sqlite.1065341.n5.nabble.com/Effect-of-blobs-on-performance-td19559.html#a19571.
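A rough sketch of what this could look like, using only the standard-library `sqlite3` module. The table name, column names, and helper functions are hypothetical and not part of AequilibraE. The blobs are treated as opaque bytes; in practice they could come from writing a DataFrame to an in-memory buffer, for example `df.to_feather(buf)` with `buf = io.BytesIO()`, as discussed in the linked Stack Overflow answer.

```python
import sqlite3

# Hypothetical schema: one row per (traffic_class, iteration, origin),
# holding the two feather payloads that are currently separate files.
SCHEMA = """
CREATE TABLE IF NOT EXISTS path_files (
    traffic_class TEXT NOT NULL,
    iteration INTEGER NOT NULL,
    origin INTEGER NOT NULL,
    paths BLOB NOT NULL,
    path_index BLOB NOT NULL,
    PRIMARY KEY (traffic_class, iteration, origin)
)
"""


def open_store(db_path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the single database file that replaces the many path files."""
    conn = sqlite3.connect(db_path)
    conn.execute(SCHEMA)
    return conn


def save_paths(conn, traffic_class, iteration, origin, paths_bytes, index_bytes):
    """Store both payloads for one origin; an existing row for the same key is replaced."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO path_files VALUES (?, ?, ?, ?, ?)",
            (traffic_class, iteration, origin, paths_bytes, index_bytes),
        )


def load_paths(conn, traffic_class, iteration, origin):
    """Return (paths_bytes, index_bytes) for one origin, or raise KeyError if absent."""
    row = conn.execute(
        "SELECT paths, path_index FROM path_files "
        "WHERE traffic_class = ? AND iteration = ? AND origin = ?",
        (traffic_class, iteration, origin),
    ).fetchone()
    if row is None:
        raise KeyError((traffic_class, iteration, origin))
    return bytes(row[0]), bytes(row[1])


# Placeholder payloads stand in for real feather bytes in this sketch.
conn = open_store()
save_paths(conn, "car", 1, 42, b"feather-bytes-paths", b"feather-bytes-index")
paths_blob, index_blob = load_paths(conn, "car", 1, 42)
```

With this layout, sharing or deleting results means handling one database file, and a lookup for a given class, iteration, and origin goes through the primary key rather than the file system.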
Describe alternatives you've considered
HDF5 instead of feather. My experience with HDF5 is that it is cumbersome to deal with (to be honest, this impression is tainted by having used it from C++) and that it is not ideal compression-wise. That is why I decided on arrow and feather in the first place. I have not, however, tried to store paths directly by origin and destination. Our long list and the index into it exist because arrow is columnar and paths are variable length, so HDF5 may still be useful for that.