The `blockchain_address_label_repository` is a resource primarily for blockchain address label datasets and APIs. It provides a structured collection of labelled address data, along with access instructions for the APIs and datasets. Labels of blockchain addresses can substantially help with sense-making, as they link real-world entities to on-chain activity. The repository aims to serve as a comprehensive resource for researchers, developers, and analysts who need a wide range of address data to support their research or to study address behaviour. For data on a specific project, the best practice is still to refer to the project's documentation or contact its team directly.
- `README.md`: Provides an overview of the repository, including its purpose, how to navigate the datasets, and instructions for using the APIs.
- `DATASETS.md`: Lists all the static datasets available with descriptions, metadata, and download links.
- `API_DOCS.md`: Detailed documentation for all the API endpoints, including usage examples, rate limits, and access requirements.
- `CONTRIBUTING.md`: Outlines guidelines for contributing to the repository, including data submission, code additions, and documentation updates.
- `LICENSE`: Describes the usage rights and restrictions associated with the repository's content.

- Curated List of Datasets: Enumerates datasets with labelled Ethereum addresses, providing details on size, scope, format, and other relevant metadata.
- API Integration: Offers guidelines for accessing real-time data through APIs, complete with sample queries and response formats.
- Contribution Guidelines: Encourages community contributions and outlines the process for submitting new data or updating existing entries.
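
As a rough illustration of how a labelled-address dataset might be used, the sketch below loads a small CSV of labels and attaches them to a list of transfers. The column names (`address`, `label`, `category`), the sample addresses, and the helper functions are all assumptions made for this example; the datasets listed in this repository each have their own format, so check the relevant entry before adapting it.

```python
import csv
import io

# Placeholder addresses, built programmatically: "0x" followed by 40 hex characters.
ADDR_A = "0x" + "0" * 38 + "AA"   # mixed case, as checksummed addresses often are
ADDR_B = "0x" + "0" * 38 + "bb"
ADDR_UNKNOWN = "0x" + "0" * 38 + "cc"

# Hypothetical label file; real datasets will use different columns.
SAMPLE_LABELS_CSV = (
    "address,label,category\n"
    f"{ADDR_A},Example Exchange: Hot Wallet,exchange\n"
    f"{ADDR_B},Example DeFi Protocol,defi\n"
)


def normalise(address: str) -> str:
    """Lower-case a hex address so lookups ignore checksum casing."""
    return address.strip().lower()


def load_labels(handle) -> dict:
    """Build an address -> row mapping from a CSV file-like object."""
    return {normalise(row["address"]): row for row in csv.DictReader(handle)}


def annotate(transfers: list, labels: dict) -> list:
    """Return copies of the transfers with sender/receiver labels (or None)."""
    annotated = []
    for transfer in transfers:
        sender = labels.get(normalise(transfer["from"]))
        receiver = labels.get(normalise(transfer["to"]))
        annotated.append({
            **transfer,
            "from_label": sender["label"] if sender else None,
            "to_label": receiver["label"] if receiver else None,
        })
    return annotated


labels = load_labels(io.StringIO(SAMPLE_LABELS_CSV))
transfers = [{"from": ADDR_A.lower(), "to": ADDR_UNKNOWN, "value": 1.5}]
annotated = annotate(transfers, labels)
print(annotated[0]["from_label"], annotated[0]["to_label"])
```

Normalising addresses to lower case before lookup is a design choice for this sketch: label sources may publish the same address in checksummed or lower-case form, and a case-sensitive join would silently miss matches.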
Researchers and analysts can use this repository to access datasets and APIs for studies on address and wallet labels. Selected academic publications that use one or more of the data sources listed in the repository are shown below:
| Reference | Dataset/API |
|---|---|
| Victor, Friedhelm. "Address Clustering Heuristics for Ethereum". In Financial Cryptography and Data Security, edited by Joseph Bonneau and Nadia Heninger, Lecture Notes in Computer Science, vol. 12059, pp. 617-633. Cham: Springer International Publishing, 2020. Access via: https://doi.org/10.1007/978-3-030-51280-4_33. | Etherscan labels used to identify centralised exchange addresses as ground truth for developing clustering heuristics for Ethereum addresses |
| Nadler, Matthias, and Fabian Schär. "Decentralized Finance, Centralized Ownership? An Iterative Mapping Process to Measure Protocol Token Distribution". arXiv, 16 December 2020. Access via: http://arxiv.org/abs/2012.09306. | Refinement of study scope using Nansen and Etherscan labels |
| Béres, Ferenc, István András Seres, András A. Benczúr, and Mikerah Quintyne-Collins. "Blockchain Is Watching You: Profiling and Deanonymizing Ethereum Users". arXiv, 13 October 2020. Access via: http://arxiv.org/abs/2005.14051. | The study quantitatively evaluates graph representation learning algorithms and user profiling techniques based on Ethereum Name Service identifiers, transaction times, and fees. |
For support, inquiries, or contributions, please open an issue on the repository or submit a pull request as per the guidelines outlined in `CONTRIBUTING.md`. Collaboration is highly encouraged, and we welcome new sources of data or improvements to the repository.
The datasets and APIs within this repository are intended for academic and research purposes only. Labelled addresses are linked to entities or public figures and should not be used to identify or target individuals. Users must respect privacy and comply with all applicable laws and ethical guidelines when using and interpreting this data. Misuse of this information is strictly prohibited. Any dataset that conflicts with these principles will not be added to this repository.