Home

Welcome to the hadoopcryptoledger wiki!

hadoopcryptoledger is a library for processing crypto ledgers, such as the Bitcoin and Ethereum blockchains, on Hadoop and ecosystem components (e.g. Spark/Hive). It lets you analyse them and combine them with other data, such as stock markets, criminal evidence or weather patterns.

It contains the following components:

Hadoop File Format that enables any MapReduce/Tez/Spark application to read blocks and transactions from files containing crypto ledger data in HDFS. The format supports both the original MapReduce API (mapred.*) and the newer MapReduce API (mapreduce.*).
- Javadoc, Unit Test Results, Mutation Test Results (deactivated until PITest supports JUnit5), Security: OWASP Dependency Check (Note: all results related to Hadoop libraries depend on the version used by your Hadoop distribution!)
Hive Serde for making blocks and transactions from files containing crypto ledger data in HDFS available as tables in Hive
- Javadoc, Unit Test Results
Hive UDF providing crypto-ledger-specific functionality that makes it easier to work with crypto ledgers in Hive
- Javadoc, Unit Test Results
Spark Datasource to use the HadoopCryptoLedger library via the Spark DataSource API
Flink Datasource to use the HadoopCryptoLedger library in Apache Flink
- Javadoc, Unit Test Results
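
To make the job of the file format component more concrete, the sketch below shows the record framing found in Bitcoin Core's blk*.dat files: each record is a 4-byte network magic (F9BEB4D9 on Bitcoin mainnet), a 4-byte little-endian block size, and then the serialized block. This is a minimal, library-independent illustration in Python. It does not use or reproduce the hadoopcryptoledger API, and the names `iter_raw_blocks`, `count_blocks` and `frame` are hypothetical helpers introduced only for this example.

```python
import io
import struct

# Network magic of Bitcoin mainnet; altcoins use different values.
BITCOIN_MAINNET_MAGIC = bytes.fromhex("F9BEB4D9")


def iter_raw_blocks(stream, magic=BITCOIN_MAINNET_MAGIC):
    """Yield the raw bytes of each block in a blk*.dat-style stream."""
    while True:
        header = stream.read(8)
        if len(header) < 8:
            return  # end of stream or truncated record header
        if header[:4] != magic:
            return  # padding or foreign data: this simple sketch just stops
        (size,) = struct.unpack("<I", header[4:])  # little-endian uint32
        block = stream.read(size)
        if len(block) < size:
            return  # truncated block at the end of the stream
        yield block


def count_blocks(stream, magic=BITCOIN_MAINNET_MAGIC):
    """Count the complete blocks in the stream."""
    return sum(1 for _ in iter_raw_blocks(stream, magic))


def frame(payload, magic=BITCOIN_MAINNET_MAGIC):
    """Wrap a payload in magic + length, used here to build synthetic test data."""
    return magic + struct.pack("<I", len(payload)) + payload


if __name__ == "__main__":
    # Two synthetic "blocks" (dummy payloads, not valid Bitcoin blocks).
    sample = io.BytesIO(frame(b"a" * 80) + frame(b"b" * 100))
    print(count_blocks(sample))
```

A distributed reader additionally has to cope with blocks that straddle HDFS split boundaries, which this single-stream sketch deliberately ignores.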

Currently supported JDKs: 7 and 8.

Here are some how-to guides:

Bitcoin and Altcoins
- MapReduce: Count the number of transactions from files containing Bitcoin Blockchain data
- MapReduce: Count the total number of inputs of all transactions from files containing Bitcoin Blockchain data
- Spark: Use Spark to count the number of transactions from files containing Bitcoin Blockchain data
- Spark: Use Spark and Scala with Bitcoin Blockchain data
- Hive: Using Hive to analyze Bitcoin Blockchain data
- Hive: Use the HadoopCryptoLedger UDF to ease processing of Bitcoin specific data in Hive
- Spark: Using Spark+Scala+Graphx to analyze the Bitcoin transaction graph
- Spark: Use the HadoopCryptoLedger library as a Spark datasource
- Flink: Analyzing the Bitcoin Blockchain with Apache Flink
- Spark: Analyse Litecoin data using Apache Spark
- Spark: Analyse Namecoin data using Apache Spark
- Hive: Use Hive to analyse Namecoin data
- Support for Altcoins based on Bitcoin (e.g. Litecoin, Namecoin)
Ethereum and Altcoins
Fetching Blockchain data - fetch blockchain data for analysis: Bitcoin, Ethereum, Namecoin, Litecoin etc.
Useful Utility functions - for analysing Blockchains
Recommended practice: ELT Hive process for analyzing blockchains

Find here the status from the continuous integration (CI) platform:

https://travis-ci.org/ZuInnoTe/hadoopcryptoledger

Find here the status from the static code analyzer platform:

Sonarqube: https://sonarcloud.io/dashboard?id=ZuInnoTe%3Ahadoopcryptoledger
Codacy (includes also Scala): https://www.codacy.com/app/jornfranke/hadoopcryptoledger

Find here the OpenHub report.

Find here some release notes

Join us on Gitter.im
