Now also saves bias layers #193

opfromthestart · 2023-02-22T16:12:56Z

What does this PR accomplish?

It saves bias layers along with ordinary weight layers, allowing models to be saved and loaded from disk.

🩹 Bug Fix

Closes #188.

Changes proposed by this PR:

I added another name to the list of learnable-weight names within a layer, so that the bias weights are named and saved as well.
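To illustrate why a missing name matters, here is a minimal sketch. It assumes the writer pairs names with tensors positionally (for example with `zip`); `ToyLinear`, `names_before_fix`, `names_after_fix`, `save` and the `-weight`/`-bias` naming are hypothetical stand-ins, not the actual juice code.

```rust
/// Hypothetical stand-in for a layer that owns a weight matrix and a bias vector.
struct ToyLinear {
    name: String,
    weight: Vec<f32>,
    bias: Vec<f32>,
}

impl ToyLinear {
    /// Every learnable tensor of the layer, in a fixed order.
    fn learnable_weights_data(&self) -> Vec<&[f32]> {
        vec![self.weight.as_slice(), self.bias.as_slice()]
    }

    /// Assumed behaviour before the fix: only the weight tensor gets a name.
    fn names_before_fix(&self) -> Vec<String> {
        vec![format!("{}-weight", self.name)]
    }

    /// After the fix: one name per learnable tensor, bias included.
    fn names_after_fix(&self) -> Vec<String> {
        vec![
            format!("{}-weight", self.name),
            format!("{}-bias", self.name),
        ]
    }
}

/// Pairs names with tensors positionally. `zip` stops at the shorter input,
/// so a missing name silently drops the trailing tensor.
fn save(names: &[String], data: &[&[f32]]) -> Vec<(String, Vec<f32>)> {
    names
        .iter()
        .zip(data.iter())
        .map(|(name, tensor)| (name.clone(), tensor.to_vec()))
        .collect()
}

fn main() {
    let layer = ToyLinear {
        name: "linear1".to_string(),
        weight: vec![0.1, 0.2, 0.3, 0.4],
        bias: vec![0.5, 0.6],
    };
    let data = layer.learnable_weights_data();

    let before = save(&layer.names_before_fix(), &data);
    let after = save(&layer.names_after_fix(), &data);

    println!("saved before the fix: {} tensor(s)", before.len());
    println!("saved after the fix: {} tensor(s)", after.len());
}
```

Under these assumptions the bias is lost without any error, which is the kind of failure a length check on the two lists would surface early.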

Notes to reviewer:

📜 Checklist

Test coverage is excellent
All unit tests pass
The juice-examples run just fine
Documentation is thorough, extensive and explicit

drahnr · 2023-02-22T16:46:01Z

juice/src/layer.rs

@@ -925,6 +933,8 @@ impl<'a, B: IBackend> CapnpWrite<'a> for Layer<B> {
            let names = self.learnable_weights_names();
            let weights_data = self.learnable_weights_data();

+            assert_eq!(names.len(), weights_data.len(), "Not all layers are named");


Suggested change

-            assert_eq!(names.len(), weights_data.len(), "Not all layers are named");
+            assert_eq!(names.len(), weights_data.len(), "All layers are named. qed");

opfromthestart · 2023-02-22

Changed to "All layers must be named".
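The wording matters because the third argument of `assert_eq!` is only shown when the check fails. The sketch below captures that failure message for inspection; `check_names` and `failure_message` are hypothetical helpers written for this illustration and are not part of juice.

```rust
use std::panic;

/// Hypothetical check mirroring the assertion in the diff: one name per tensor.
fn check_names(names: &[&str], weights_data: &[Vec<f32>]) {
    assert_eq!(names.len(), weights_data.len(), "All layers must be named");
}

/// Runs the check and returns the panic message if the assertion fails.
fn failure_message(names: Vec<&'static str>, weights_data: Vec<Vec<f32>>) -> Option<String> {
    let result = panic::catch_unwind(move || check_names(&names, &weights_data));
    match result {
        Ok(()) => None,
        // A formatted panic carries a `String` payload; a plain literal carries `&str`.
        Err(payload) => payload
            .downcast_ref::<String>()
            .cloned()
            .or_else(|| payload.downcast_ref::<&str>().map(|s| s.to_string())),
    }
}

fn main() {
    let weights_data = vec![vec![0.1, 0.2], vec![0.3]];
    // One name for two tensors: the assertion fails and reports the custom message.
    if let Some(message) = failure_message(vec!["linear1-weight"], weights_data) {
        println!("{message}");
    }
}
```

The default panic hook also prints the message to stderr before `catch_unwind` returns, so a message that states the violated requirement is what a user sees first.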

drahnr · 2023-02-22T16:48:26Z

Thank you!

drahnr · 2023-02-22T16:58:59Z

The only thing left to check: does this work with the store/load unit test added in #190?

* Fix coaster UI tests (rustc error messages changed in 1.62 (#172)
* Fix Linear layer bias gradient computation; add size checks to CUDA functions (#170)
* Assert the correct tensor sizes in copy() and gemm(); fix related Linear logic
* Check output matrix dims in GEMM; fix corresponding Linear layer logic
* Update coaster-blas/src/frameworks/cuda/helper.rs
* Fix merge mistake in commit 6952a49 (#173)
* doc: clarify remote test (#175)
* bump rust-bindgen to 0.60.1, bump cargo lock file (#174)
* build(deps): bump capnp from 0.14.9 to 0.14.11 (#179)
  Bumps [capnp](https://github.com/capnproto/capnproto-rust) from 0.14.9 to 0.14.11.
  - [Release notes](https://github.com/capnproto/capnproto-rust/releases)
  - [Commits](capnproto/capnproto-rust@capnp-v0.14.9...capnp-v0.14.11)
  ---
  updated-dependencies:
  - dependency-name: capnp
    dependency-type: direct:production
  ...
* build(deps): bump tokio from 1.21.0 to 1.23.1 (#183)
  Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.21.0 to 1.23.1.
  - [Release notes](https://github.com/tokio-rs/tokio/releases)
  - [Commits](tokio-rs/tokio@tokio-1.21.0...tokio-1.23.1)
  ---
  updated-dependencies:
  - dependency-name: tokio
    dependency-type: direct:production
  ...
* build(deps): bump bumpalo from 3.11.0 to 3.12.0 (#187)
  Bumps [bumpalo](https://github.com/fitzgen/bumpalo) from 3.11.0 to 3.12.0.
  - [Release notes](https://github.com/fitzgen/bumpalo/releases)
  - [Changelog](https://github.com/fitzgen/bumpalo/blob/main/CHANGELOG.md)
  - [Commits](fitzgen/bumpalo@3.11.0...3.12.0)
  ---
  updated-dependencies:
  - dependency-name: bumpalo
    dependency-type: indirect
  ...
* build(deps): bump tokio from 1.23.1 to 1.24.2 (#191)
  Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.23.1 to 1.24.2.
  - [Release notes](https://github.com/tokio-rs/tokio/releases)
  - [Commits](https://github.com/tokio-rs/tokio/commits)
  ---
  updated-dependencies:
  - dependency-name: tokio
    dependency-type: direct:production
  ...
* Now also saves bias layers (#193)
* build(deps): bump openssl from 0.10.41 to 0.10.48
  Bumps [openssl](https://github.com/sfackler/rust-openssl) from 0.10.41 to 0.10.48.
  - [Release notes](https://github.com/sfackler/rust-openssl/releases)
  - [Commits](sfackler/rust-openssl@openssl-v0.10.41...openssl-v0.10.48)
  updated-dependencies:
  - dependency-name: openssl
    dependency-type: indirect
  ...
  Signed-off-by: dependabot[bot] <[email protected]>
* Do not pass batch_size to cudnnGetRNNParamsSize().
* Add a feature for deterministic (pseudo)randomizing.
* New network architecture pieces: Layer, Descriptor, Context, Network (#165)
* New network architecture pieces: Layer, Descriptor, Context, Network
* Update juice/src/net/descriptor.rs
* Implement Sequential layer for the new architecture (#168)
* Implement Sequential layer
* Fix coaster UI tests (rustc error messages changed in 1.62 (#172)
* Fix Linear layer bias gradient computation; add size checks to CUDA functions (#170)
* Assert the correct tensor sizes in copy() and gemm(); fix related Linear logic
* Check output matrix dims in GEMM; fix corresponding Linear layer logic
* Update coaster-blas/src/frameworks/cuda/helper.rs
* More ergonomic net creation and fallible Sequential constructor
* Fix merge mistake in commit 6952a49
* Add a few more layers to the new architecture (#176)
* Add trainer subsystem with SGD and Adam optimizers (#177)
* Coaster convolution API cleanup (#178)
* Move Convolution workspace into context
* Implement Convolution, Dropout and Pooling layers (#180)
* Move Convolution workspace into context
* Formatting fixes
* Fixed unit tests
* Partial implementation of the Convolution layer
* Implement the remaining parts for Convolution layer
* Implement dropout and pooling layers
* Fix CUDA tensor descriptor size error and adjust layer testing infra
* Extended debug output for layers with custom Debug impl
* Add softmax layers and convert MNIST example (#184)
* Move Convolution workspace into context
* Formatting fixes
* Fixed unit tests
* Partial implementation of the Convolution layer
* Implement the remaining parts for Convolution layer
* Implement dropout and pooling layers
* Fix CUDA tensor descriptor size error and adjust layer testing infra
* Extended debug output for layers with custom Debug impl
* Changed mnist example to the new architecture
* Plumbed the momentum arg in the mnist example
* Implemented softmax and logsoftmax layers
* Remove unnecessary NLL parameter and fix mnist example
* Fix native backend softmax and logsoftmax grad computation
* Changed slicing syntax in native backend softmax functions
* Convert juice benchtests to Criterion (#192)
* Convert Juice benchmarks to Criterion
* Add newline at the end of Cargo.toml
* Made Layer operations return a Result (#186)
* Made Layer operations return a Result
* Change LayerError to contain Boxes
* Update benchmarks for new layer API
* Simplify new_rnn_config()

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Mikhail Balakhno <{ID}+{username}@users.noreply.github.com>
Co-authored-by: Bernhard Schuster <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: opfromthestart <[email protected]>

Added saving of bias layers

31cd306

drahnr reviewed Feb 22, 2023


drahnr approved these changes Feb 22, 2023


Better error message

76f6b4a

drahnr merged commit 5316773 into fff-rs:master Feb 22, 2023

opfromthestart deleted the real-save branch February 22, 2023 17:40
