Hash-worker is a library for fast calculation of file chunk hashes. It is based on hash-wasm and uses Web Workers for parallel computation, which speeds up the processing of file blocks. Hash-worker supports three hash algorithms: md5, crc32, and xxHash64. Both the browser and Node.js are supported.
> **Warning**
>
> The merkleHash computed by Hash-worker is the root hash of a MerkleTree built from the file chunk hashes. Note that this is not directly equivalent to a hash of the file itself.
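For intuition, here is a minimal sketch, not Hash-worker's actual implementation, of how such a root is derived from chunk hashes; `hashPair` is a hypothetical placeholder for a real pairwise hash:

```typescript
// Sketch only: derive a Merkle root from a list of chunk hashes.
// `hashPair` stands in for a real hash (e.g. md5) over concatenated children.
function hashPair(left: string, right: string): string {
  return `h(${left}|${right})` // placeholder, not a real hash
}

function merkleRoot(chunkHashes: string[]): string {
  if (chunkHashes.length === 0) throw new Error('no chunk hashes')
  let level = chunkHashes
  while (level.length > 1) {
    const next: string[] = []
    for (let i = 0; i < level.length; i += 2) {
      // Promote an odd trailing node unchanged (one common convention).
      next.push(i + 1 < level.length ? hashPair(level[i], level[i + 1]) : level[i])
    }
    level = next
  }
  // The root depends on the chunking, so it differs from hashing the raw bytes.
  return level[0]
}
```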
```bash
$ pnpm install hash-worker
```
```html
<script src="./global.js"></script>
<script src="./worker/hash.worker.mjs"></script>
<script>
  HashWorker.getFileHashChunks()
</script>
```
`global.js` and `hash.worker.mjs` are the build artifacts produced by running `build:core` in `package.json`; they are located in the `packages/core/dist` directory.
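As a fuller sketch of the global build in use, assuming an `<input id="file" type="file">` on the page; the `HashWorker` global's exact type shape is simplified here for illustration:

```typescript
// Sketch: wire the global build to a file input. Assumes global.js and
// hash.worker.mjs (see above) have already been loaded on the page.
declare const HashWorker: {
  getFileHashChunks: (param: { file: File }) => Promise<{ merkleHash: string }>
}

const input = document.querySelector<HTMLInputElement>('#file')!
input.addEventListener('change', () => {
  const file = input.files?.[0]
  if (!file) return
  HashWorker.getFileHashChunks({ file }).then((res) => {
    console.log('merkleHash', res.merkleHash)
  })
})
```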
```typescript
import {
  getFileHashChunks,
  destroyWorkerPool,
  HashChksRes,
  HashChksParam,
  Strategy,
} from 'hash-worker'

function handleGetHash(file: File) {
  const param: HashChksParam = {
    file: file,
    config: {
      workerCount: 8,
      strategy: Strategy.md5,
    },
  }

  getFileHashChunks(param).then((data: HashChksRes) => {
    console.log('chunksHash', data.chunksHash)
  })
}

/**
 * Destroy the worker threads.
 */
function handleDestroyWorkerPool() {
  destroyWorkerPool()
}
```
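In Node.js, pass a `filePath` instead of a `file` (see `HashChksParam` below). A minimal sketch, with `./demo.bin` as a placeholder path:

```typescript
import { getFileHashChunks, destroyWorkerPool, HashChksRes } from 'hash-worker'

// Minimal Node.js sketch: hash a file by path instead of passing a File.
async function hashLocalFile() {
  const res: HashChksRes = await getFileHashChunks({
    filePath: './demo.bin', // placeholder path for illustration
    config: { workerCount: 4 },
  })
  console.log('merkleHash', res.merkleHash)
  destroyWorkerPool() // release the worker threads when done
}

hashLocalFile().catch(console.error)
```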
> **Warning**
>
> If you are using Vite as your build tool, you need to add some configuration to your `vite.config.js` to exclude hash-worker from `optimizeDeps`.
```javascript
// vite.config.js
import { defineConfig } from 'vite'
import vue from '@vitejs/plugin-vue'

export default defineConfig({
  plugins: [vue()],
  // other configurations ...
  optimizeDeps: {
    exclude: ['hash-worker'], // newly added
  },
})
```
> **Warning**
>
> If you are using Webpack as your build tool, you need to add some configuration to your `webpack.config.js` to exclude Node-related modules from parsing.
```javascript
// webpack.config.js
module.exports = {
  // newly added
  resolve: {
    fallback: {
      fs: false,
      path: false,
      'fs/promises': false,
      worker_threads: false,
    },
  },
  // newly added
  externals: {
    fs: 'commonjs fs',
    path: 'commonjs path',
    'fs/promises': 'commonjs fs/promises',
    worker_threads: 'commonjs worker_threads',
  },
}
```
### HashChksParam

HashChksParam configures the parameters needed to calculate the hash.

| field | type | default | description |
| --- | --- | --- | --- |
| file | File | / | The file whose hash is to be calculated (required in browser environments) |
| filePath | string | / | Path to the file whose hash is to be calculated (required in Node.js environments) |
| config | Config | Config | Parameters for calculating the hash |
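For clarity, the two environment-specific parameter shapes, with `config` omitted so the defaults apply; `./data.bin` is a placeholder path:

```typescript
import { HashChksParam } from 'hash-worker'

declare const file: File // e.g. obtained from an <input type="file">

// Browser: pass a File object.
const browserParam: HashChksParam = { file }

// Node.js: pass a file path instead.
const nodeParam: HashChksParam = { filePath: './data.bin' }
```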
### Config

| field | type | default | description |
| --- | --- | --- | --- |
| chunkSize | number | 10 (MB) | Size of each file slice |
| workerCount | number | 8 | Number of workers running concurrently while the hash is calculated |
| strategy | Strategy | Strategy.mixed | Hash computation strategy |
| borderCount | number | 100 | Chunk-count cutoff for the hash algorithm rule in 'mixed' mode |
| isCloseWorkerImmediately | boolean | true | Whether to destroy the worker threads immediately once the calculation completes |
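Putting the options together, a fully specified config might look like this; the values shown are simply the documented defaults:

```typescript
import { HashChksParam, Strategy } from 'hash-worker'

declare const file: File // e.g. obtained from an <input type="file">

// Every Config field spelled out with its documented default.
const param: HashChksParam = {
  file,
  config: {
    chunkSize: 10,                  // MB per slice
    workerCount: 8,                 // workers running in parallel
    strategy: Strategy.mixed,       // md5 | crc32 | xxHash64 | mixed
    borderCount: 100,               // md5-vs-crc32 cutoff in mixed mode
    isCloseWorkerImmediately: true, // tear workers down after the run
  },
}
```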
```typescript
// strategy.ts
export enum Strategy {
  md5 = 'md5',
  crc32 = 'crc32',
  xxHash64 = 'xxHash64',
  mixed = 'mixed',
}
```
When Strategy.mixed is used, the chunk hashes for building the MerkleTree are computed with md5 if the number of file chunks is less than borderCount; otherwise, crc32 is used instead.
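In other words, the documented mixed-mode rule roughly amounts to the following sketch (not the library's internal code):

```typescript
import { Strategy } from 'hash-worker'

// Fewer chunks than borderCount selects md5; otherwise crc32.
function resolveMixedStrategy(chunkCount: number, borderCount: number): Strategy {
  return chunkCount < borderCount ? Strategy.md5 : Strategy.crc32
}
```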
### HashChksRes

HashChksRes is the result returned after the hash calculation.

| field | type | description |
| --- | --- | --- |
| chunksBlob | Blob[] | The Blob[] of the file slices; returned only in browser environments |
| chunksHash | string[] | Hashes of the file slices |
| merkleHash | string | The merkleHash of the file |
| metadata | FileMetaInfo | The metadata of the file |
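A small example of consuming the result; note that `chunksBlob` is only present in browser environments:

```typescript
import { getFileHashChunks, HashChksRes } from 'hash-worker'

declare const file: File // e.g. obtained from an <input type="file">

getFileHashChunks({ file }).then((res: HashChksRes) => {
  console.log('chunk hashes:', res.chunksHash)
  console.log('merkle root:', res.merkleHash)
  console.log('file name:', res.metadata.name)
  // chunksBlob is only returned in browser environments.
  if (res.chunksBlob) {
    console.log('chunk blobs:', res.chunksBlob.length)
  }
})
```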
### FileMetaInfo

| field | type | description |
| --- | --- | --- |
| name | string | The name of the file used to calculate the hash |
| size | number | File size in KB |
| lastModified | number | Timestamp of the last modification of the file |
| type | string | The file extension |
| Worker Count | Speed |
| --- | --- |
| 1 | 229 MB/s |
| 4 | 632 MB/s |
| 8 | 886 MB/s |
| 12 | 1037 MB/s |
The above data was measured on Chrome v131 with an AMD Ryzen 9 5950X CPU, using md5 to calculate the hashes.
Contributions are welcome! If you find a bug or want to add a new feature, please open an issue or submit a pull request.