Fork the ARFlow repository into your account: https://github.com/cake-lab/ARFlow/fork
ARFlow uses poetry for dependency management. Install it here.
Clone the forked repository:

```bash
git clone https://github.com/{your-account}/ARFlow.git
cd ARFlow/python
poetry install
```
ARFlow uses ruff for linting and formatting. We also use pyright for type checking. Make sure you have the appropriate extensions or corresponding language servers installed in your editor.

These tools should run automatically in your editor. If you want to run them manually, use the following commands:
```bash
poetry run ruff check        # check for linting errors
poetry run ruff check --fix  # check for linting errors and fix them
poetry run ruff format      # format the code
```
All of these quality checks are run automatically before every commit using pre-commit. To install the pre-commit hooks, run:

```bash
poetry run pre-commit install
```
To manually invoke the pre-commit checks, run:

```bash
poetry run pre-commit run --all-files
```
Library authors are encouraged to prioritize bringing their public API to 100% type coverage. Although this is very hard in ARFlow's case due to our dependency on gRPC, we should still strive to achieve this goal. To check for type completeness, run:

```bash
poetry run pyright --ignoreexternal --verifytypes arflow
```
To read more about formalizing libraries' public APIs, please refer to this excellent blog post by Dagster.
ARFlow uses pytest. Make sure you are in the `python` directory and then run tests with:

```bash
poetry run pytest
```
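If you are adding a new test, a minimal sketch might look like the following. The fixture and assertions here are hypothetical and are not part of ARFlow's actual test suite:

```python
# tests/test_example.py -- hypothetical names, for illustration only
import pytest


@pytest.fixture
def sample_frame() -> dict[str, int]:
    # Stand-in for a decoded client frame.
    return {"width": 640, "height": 480}


def test_frame_has_positive_dimensions(sample_frame: dict[str, int]) -> None:
    assert sample_frame["width"] > 0
    assert sample_frame["height"] > 0
```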
- Log key events for debugging and tracking.
- Avoid logging sensitive information (e.g., user data).
- Initialize a logger in each module using `logger = logging.getLogger(__name__)`. This enables granular logging and gives users control over logs from specific parts of the library.
- Use appropriate log levels:
| Level | Usage |
|---|---|
| `debug()` | Detailed internal state info |
| `info()` | General operational events |
| `warning()` | Unexpected events, non-fatal |
| `error()` | Errors, exceptions |
Example:

```python
import logging

logger = logging.getLogger(__name__)
logger.debug("Processing request: %s", request_id)
```
ARFlow uses GitHub Actions for continuous integration. The CI pipeline runs the following checks:
```bash
poetry run ruff check      # linting
poetry run pyright arflow  # type checking
poetry run pytest          # testing
```
Python dependency management.
ARFlow uses poetry to manage dependencies and run commands. Commands can be found in the `pyproject.toml` file in the `[tool.poetry.scripts]` section and can be run via `poetry run <command>`.
A language-neutral, platform-neutral, extensible mechanism for serializing structured data.
ARFlow uses protobuf to define the communication protocol between the server and the client. The protocol is defined in `service.proto` and can be compiled using `compile.sh`.
Implements binary protocols for serializing and deserializing Python objects. Pickling is the same as serialization, marshalling, or flattening in other languages. The inverse operation is called unpickling.
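For example, round-tripping an object through the standard library looks like this (the payload is illustrative):

```python
import pickle

data = {"pose": [0.0, 1.0, 2.0]}    # illustrative payload
blob = pickle.dumps(data)           # pickling: object -> bytes
restored = pickle.loads(blob)       # unpickling: bytes -> object
assert restored == data
```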
A library to write concurrent code using the `async` and `await` syntax. Perfect for writing IO-bound and high-level structured network code.
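A minimal sketch of the pattern (illustrative only, not ARFlow code):

```python
import asyncio


async def fetch(name: str, delay: float) -> str:
    await asyncio.sleep(delay)  # stand-in for network IO
    return f"{name} done"


async def main() -> None:
    # Run both IO-bound tasks concurrently instead of sequentially.
    results = await asyncio.gather(fetch("a", 0.1), fetch("b", 0.2))
    print(results)


asyncio.run(main())
```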
A tool to build time-aware visualizations of multimodal data.
ARFlow uses the Rerun Python SDK to visualize the data collected by the ARFlow server.
ARFlow uses pdoc. You can refer to their documentation for more information on how to generate documentation. To preview the documentation locally, run:

```bash
poetry run pdoc arflow examples  # or replace with the module name you want to preview
```
The ARFlow server and client communicate through gRPC. Here are some best practices to keep in mind when working with gRPC:
All fields in proto3 are optional, so you'll need to validate that they're all set. If you leave one unset, it'll default to zero for numeric types or to an empty string for strings.
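For example, a handler might check incoming fields explicitly before using them. This is a minimal sketch with hypothetical field names, not ARFlow's actual schema:

```python
def validate_frame(request) -> None:
    # proto3 scalars silently default to "" or 0 when unset,
    # so check for those sentinel values explicitly.
    if not request.device_id:
        raise ValueError("device_id must be set")
    if request.width <= 0 or request.height <= 0:
        raise ValueError("width and height must be positive")
```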
gRPC is built on top of HTTP/2, and its status codes play a role similar to standard HTTP status codes. This allows clients to take different actions based on the code they receive. Proper error handling also allows middleware, like monitoring systems, to log how many requests have errors.
ARFlow uses the grpc_interceptor library to handle exceptions. This library provides a way to raise exceptions in your service handlers and have them automatically converted to gRPC status codes. Check out an example usage here.
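A minimal sketch of the pattern, using `ExceptionToStatusInterceptor` from grpc_interceptor (the handler and service names below are made up):

```python
from concurrent import futures

import grpc
from grpc_interceptor import ExceptionToStatusInterceptor
from grpc_interceptor.exceptions import NotFound

# Register the interceptor when building the server.
server = grpc.server(
    futures.ThreadPoolExecutor(max_workers=10),
    interceptors=[ExceptionToStatusInterceptor()],
)


# Inside a service handler, raising a grpc_interceptor exception is
# automatically translated into the matching gRPC status code
# (here, NOT_FOUND) instead of crashing the RPC with UNKNOWN.
def get_session(session_id: str):
    raise NotFound(f"session {session_id} does not exist")
```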
grpc_interceptor also provides a testing framework to run a gRPC service with interceptors. You can check out the example usage here.
To achieve backward compatibility, you should never remove a field from a message. Instead, mark it as deprecated and add a new field with the new name. This way, clients that use the old field will still work.
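For illustration, server code can bridge the two fields during the migration window. The field names below are hypothetical, not part of ARFlow's proto:

```python
def get_device_name(request) -> str:
    # Prefer the new field; fall back to the deprecated one so that
    # older clients keep working.
    return request.device_name or request.legacy_device_name
```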
We use buf to lint our protobuf files. You can install it by following the instructions here.
We use pyright and grpc-stubs to type check our Protobuf-generated code.
When the server is shutting down, it should wait for all in-flight requests to complete before shutting down. This is to prevent data loss or corruption. We have done this in the ARFlow server.
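The usual grpcio pattern looks like this (a sketch, not the actual ARFlow server code; the port is a placeholder):

```python
from concurrent import futures

import grpc

server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
server.add_insecure_port("[::]:50051")
server.start()
try:
    server.wait_for_termination()
except KeyboardInterrupt:
    # stop() with a grace period lets in-flight RPCs finish; the returned
    # event's wait() blocks until they complete or the grace period expires.
    server.stop(grace=30).wait()
```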
gRPC supports TLS encryption out of the box. We have not implemented this in the ARFlow server yet. If you are interested in working on this, please let us know.
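If you pick this up, the grpcio API looks roughly like this (an untested sketch; the file paths and port are placeholders):

```python
import grpc

with open("server.key", "rb") as f:
    private_key = f.read()
with open("server.crt", "rb") as f:
    certificate_chain = f.read()

credentials = grpc.ssl_server_credentials([(private_key, certificate_chain)])
# Bind a TLS-encrypted port instead of using add_insecure_port().
server.add_secure_port("[::]:50051", credentials)
```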
VSCode may forcibly change the locale to `en_US.UTF-8` for git commit hooks. To fix this, run:

```bash
sudo locale-gen en_US.UTF-8
```
Please refer to their documentation.