From 39dbf2044716c04b954c24b89db0bab6ec650caa Mon Sep 17 00:00:00 2001
From: simonpcouch
Date: Mon, 26 Aug 2024 08:39:14 -0500
Subject: [PATCH] add brief notes on package design

---
 .Rbuildignore | 1 +
 R/README.md   | 9 +++++++++
 README.Rmd    | 2 ++
 README.md     | 2 ++
 4 files changed, 14 insertions(+)
 create mode 100644 R/README.md

diff --git a/.Rbuildignore b/.Rbuildignore
index e0da05a..cbe0310 100644
--- a/.Rbuildignore
+++ b/.Rbuildignore
@@ -2,6 +2,7 @@
 ^\.Rproj\.user$
 ^LICENSE\.md$
 ^README\.Rmd$
+R/README.md
 ^cran-comments\.md$
 ^src/\.cargo$
 ^vignettes/articles$
diff --git a/R/README.md b/R/README.md
new file mode 100644
index 0000000..7255c5c
--- /dev/null
+++ b/R/README.md
@@ -0,0 +1,9 @@
+## Design of rinfa
+
+rinfa is an R interface to the Rust machine learning library linfa.
+
+The linfa crate is composed of several different modules, each implementing support for a given kind of model. For module `model1`, the code that "bridges" R and that linfa module is in `src/rust/src/model1.rs`. The `model1.rs` module supplies (non-exported) fit and predict R functions for the given model type, named `fit_model_a()` and `predict_model_a()`, where `model_a` is the name of the "model type" in tidymodels that corresponds to the kind of model implemented in the linfa module. (There's not always a 1-to-1 relationship between `model1` and `model_a`.) `fit_model_a()` takes in a vector of length `nrow(X) * ncol(X)` that is `c(X)` for some model matrix `X`, and the Rust code in `model1.rs` takes care of reshaping the vector into an array. `fit_model_a()` is the lowest-level R interface to `model1`, is not exported, and is not intended for use by end-users.
+
+`fit_model_a()` is wrapped by an exported but `@keywords internal` function `.linfa_model_a()`, which has an "XY" interface (numeric model matrix X, vector outcome Y). `.linfa_model_a()` takes care of reshaping the data to the format expected by `fit_model_a()` and putting together a classed R object. While the XY interface is exported, it should not be considered stable.
+
+The "public" interface to rinfa models is via tidymodels (or, more specifically, parsnip). To use rinfa (and thus `.linfa_model_a()`) as a modeling engine, use the code `model_a(engine = "linfa")`. Models can be fitted either with an XY interface (which will result in no `model.matrix()` overhead) or the formula interface.
diff --git a/README.Rmd b/README.Rmd
index d0d0ef2..0a0d300 100644
--- a/README.Rmd
+++ b/README.Rmd
@@ -93,3 +93,5 @@ dplyr::anti_join(
 ) %>%
 knitr::kable()
 ```
+
+To read more about the design of rinfa, see [`R/README.md`](https://github.com/simonpcouch/rinfa/tree/main/R/README.md).
diff --git a/README.md b/README.md
index 0966bba..635ec57 100644
--- a/README.md
+++ b/README.md
@@ -78,3 +78,5 @@ in the following table:
 | multinom_reg | linfa | classification |
 | naive_Bayes | linfa | classification |
 | svm_linear | linfa | classification |
+
+To read more about the design of rinfa, see [`R/README.md`](https://github.com/simonpcouch/rinfa/tree/main/R/README.md).