This Python library implements connecting to Swarm (BZZ) distributed storage network.
Learn more about MiPasa: https://www.mipasa.com/
Learn more about Swarm: https://www.ethswarm.org/
mipasa_swarm_connector[pandas]
is required if you wish to read files as CSV or Parquet.mipasa_swarm_connector[parquet]
is required if you wish to read files as Parquet. More specific versions of this dependency exist:mipasa_swarm_connector[parquet-pyarrow]
mipasa_swarm_connector[parquet-fastparquet]
In order to work with Swarm, you will need to specify a Bee node address.
For production environments, we advise doing so via setting the environment variable BEE_GATEWAY_URL
to the HTTP endpoint of the chosen node.
For development, you can also specify the node address directly:
from mipasa_swarm_connector import SwarmConnection
content = SwarmConnection('http://localhost:1633').read_file('36f0830b0ece6273a50cc0ff58c4597883e8e41ea9c8ddabb0d87d6b0ca95a1a')
print(repr(content))
from mipasa_swarm_connector import SwarmConnection, SwarmAPIError
try:
content = SwarmConnection().read_file('<your_swarm_file_hash_here>')
print(repr(content))
except SwarmAPIError as e:
print('Failed to fetch, status: %d' % e.status_code)
This code will read the contents of the specified file on Swarm.
By default, the exact file type will be autodetected:
- If the file has a known file type (which is
application/json
,text/csv
,application/vnd.apache.parquet
andapplication/parquet
) it will be reintepreted as a JSON object or a Pandas DataFrame. - Otherwise, the file will be downloaded as bytes.
Any use of read_file
can raise SwarmAPIError
, which, generally, means that a file is not found (status code 404
).
from mipasa_swarm_connector import SwarmConnection, SwarmTypeError
try:
dataframe = SwarmConnection().read_file('<your_swarm_file_hash_here>', as_type='csv', verify_type=True)
print(repr(dataframe))
except SwarmTypeError as e:
print('Expected type %s, got type %s' % (e.expected_type, e.actual_type))
This code will read the contents of the specified file on Swarm as a Pandas DataFrame, expecting it to be in CSV format.
verify_type=True
can be passed (but not required) in order to check whether file's MIME type or filename matches CSV.
from mipasa_swarm_connector import SwarmConnection
dataframe = SwarmConnection().read_file('<your_swarm_file_hash_here>', as_type='parquet')
print(repr(dataframe))
When type-checking for Parquet, two MIME types are recognized:
application/vnd.apache.parquet
application/parquet
from mipasa_swarm_connector import SwarmConnection
dataframe = SwarmConnection().read_file('<your_swarm_file_hash_here>', as_type='json')
print(repr(dataframe))
from mipasa_swarm_connector import SwarmConnection, SwarmAPIError
try:
swarm_hash = SwarmConnection().write_file(b'binary-content')
except SwarmAPIError as e:
print('Failed to upload, status: %d' % e.status_code)
This code will attempt uploading the specified bytes to Swarm.
By default, the exact type of the uploaded file will be autodetected:
- If
bytes
orbytearray
is passed, then the uploaded file type will beapplication/octet-stream
. - If a string is passed, it will be encoded as UTF-8 and the file type will be
text/plain
. - If a Pandas DataFrame is passed, then the uploaded file type will be
text/csv
and the DataFrame will be formatted as CSV. - If any other object is passed, JSON encoding will be attempted. If it succeeds, then the uploaded file type will be
application/json
.
Additionally, if your content is a Pandas DataFrame, you may specify as_type='parquet'
in order to automatically format this file as Parquet (the file type will be application/vnd.apache.parquet
).