Even if you do not plan to contribute to Apache Arrow itself or Arrow integrations in other projects, we'd be happy to have you involved:
- Join the mailing list: send an email to [email protected]. Share your ideas and use cases for the project
- Follow our activity on GitHub issues
- Learn the format
- Contribute code to one of the reference implementations
We prefer to receive contributions in the form of GitHub pull requests. Please send pull requests against the github.com/apache/arrow repository.
If you are looking for some ideas on what to contribute, check out the GitHub issues for the Apache Arrow project. Comment on the issue and/or contact [email protected] with your questions and ideas.
If you’d like to report a bug but don’t have time to fix it, you can still post it on GitHub issues, or email the mailing list at [email protected].
We use yarn to install dependencies and run scripts.
- `yarn clean` - cleans targets
- `yarn build` - cleans and compiles all targets
- `yarn test` - executes tests against built targets
These scripts accept argument lists of targets × modules:
- Available `targets` are `es5`, `es2015`, `esnext`, `ts`, and `all` (default: `all`)
- Available `modules` are `cjs`, `esm`, `umd`, and `all` (default: `all`)
Examples:
- `yarn build` -- builds all ES targets in all module formats
- `yarn build -t es5 -m all` -- builds the ES5 target in all module formats
- `yarn build -t all -m cjs` -- builds all ES targets in the CommonJS module format
- `yarn build -t es5 -t es2015 -m all` -- builds the ES5 and ES2015 targets in all module formats
- `yarn build -t es5 -m cjs -m esm` -- builds the ES5 target in CommonJS and ESModules module formats
This argument configuration also applies to the `clean` and `test` scripts.
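Because `-t` and `-m` can each be repeated, a single invocation covers every target/module pair in the two lists. The following dry-run sketch prints the single-pair commands that such an invocation corresponds to, which can help when deciding what to build or test. `print_build_matrix` is a hypothetical helper for illustration only; it is not part of the Arrow build scripts and does not invoke yarn.

```shell
#!/bin/sh
# Dry-run sketch: print one `yarn build` command per target/module pair.
# print_build_matrix is a hypothetical helper, not part of the Arrow build scripts.
print_build_matrix() {
  targets=$1   # space-separated list, e.g. "es5 es2015"
  modules=$2   # space-separated list, e.g. "cjs esm"
  for t in $targets; do
    for m in $modules; do
      echo "yarn build -t $t -m $m"
    done
  done
}

# Pairs covered by `yarn build -t es5 -t es2015 -m cjs -m esm`
print_build_matrix "es5 es2015" "cjs esm"
```

The output is four lines, one per pair, starting with `yarn build -t es5 -m cjs`.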
To run tests on the bundles, you need to build them first.
To run tests directly on the sources without bundling, use the `src` target (e.g. `yarn test -t src`).
`yarn doc` compiles the documentation with Typedoc. Use `yarn doc --watch` to automatically rebuild when the docs change.
You can run the benchmarks with `yarn perf`. To print the results to stderr as JSON, add the `--json` flag (e.g. `yarn perf --json 2> perf.json`).
You can change the target you want to test by changing the imports in `perf/index.ts`. Note that you need to compile the bundles with `yarn build` before you can import them.
The bundles use `apache-arrow`, so make sure to build it with `yarn build -t apache-arrow`. To bundle with a variety of bundlers, run `yarn test:bundle` or `yarn gulp bundle`.
Run `yarn gulp bundle:webpack:analyze` to open Webpack Bundle Analyzer.
- Once generated, the flatbuffers format code needs to be adjusted for our build scripts (assumes `gnu-sed`):

      cd $ARROW_HOME
      # Create a tmpdir to store modified flatbuffers schemas
      tmp_format_dir=$(mktemp -d)
      cp ./format/*.fbs $tmp_format_dir
      # Remove namespaces from the flatbuffers schemas
      sed -i '+s+namespace org.apache.arrow.flatbuf;++ig' $tmp_format_dir/*.fbs
      sed -i '+s+org.apache.arrow.flatbuf.++ig' $tmp_format_dir/*.fbs
      # Generate TS source from the modified Arrow flatbuffers schemas
      flatc --ts -o ./js/src/fb $tmp_format_dir/{File,Schema,Message,Tensor,SparseTensor}.fbs
      # Remove the tmpdir
      rm -rf $tmp_format_dir

- Manually fix the unused imports and add `// @ts-ignore` for other errors.
- Add `.js` to the imports. In VSCode, you can search for `^(import [^';]* from '(\./|(\.\./)+)[^';.]*)';` and replace with `$1.js';`.
- Execute `yarn lint` from the `js` directory to fix the linting errors.
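
Outside VSCode, the search-and-replace from the `.js` import step can be approximated with sed. This is a minimal sketch, assuming a sed that supports `-E` (extended regular expressions). `add_js_ext` is a hypothetical helper name, not part of the Arrow build scripts, and it filters stdin rather than editing files in place, so the result can be inspected before anything is overwritten.

```shell
#!/bin/sh
# Hypothetical helper: appends ".js" to relative import paths that have no extension.
# Uses the same pattern as the VSCode search above, with \1 in place of $1.
add_js_ext() {
  sed -E "s#^(import [^';]* from '(\./|(\.\./)+)[^';.]*)';#\1.js';#"
}

# Relative import without an extension: rewritten
printf '%s\n' "import { Schema } from './Schema';" | add_js_ext
# Package import: left unchanged
printf '%s\n' "import * as flatbuffers from 'flatbuffers';" | add_js_ext
```

Only imports that start with `./` or `../` and contain no `.` after that prefix are rewritten, so package imports and paths that already end in `.js` pass through untouched.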