section updates
topepo committed Mar 27, 2019
1 parent 4d165a6 commit 7616d9b
Showing 5 changed files with 33 additions and 20 deletions.
4 changes: 2 additions & 2 deletions bookdown/18-Filters.Rmd
@@ -18,7 +18,7 @@ Contents

- [Univariate Filters](#filter)
- [Basic Syntax](#syntax)
- [The Example](#example)
- [The Example](#fexample)

<div id="filter"></div>

@@ -77,7 +77,7 @@ calculate model performance on held-out samples. The `pred` function is used to
<div id="example">
</div>

## The Example
## The Example {#fexample}

Returning to the example from (Friedman, 1991), we can fit another random forest model with the predictors pre-filtered using the generalized additive model approach described previously.

19 changes: 14 additions & 5 deletions bookdown/19-RFE.Rmd
@@ -17,6 +17,10 @@ Contents
- [Feature Selection Using Search Algorithms](#search)
- [Resampling and External Validation](#resamp)
- [Recursive Feature Elimination via `caret`](#rfe)
- [An Example](#rfeexample)
- [Helper Functions](#rfehelpers)
- [The Example](#rfeexample2)
- [Using a Recipe](##rferecipes)

## Backwards Selection

@@ -62,9 +66,9 @@ In [`caret`](http://cran.r-project.org/web/packages/caret/index.html), Algorithm

For a specific model, a set of functions must be specified in `rfeControl$functions`. Sections below has descriptions of these sub-functions. There are a number of pre-defined sets of functions for several models, including: linear regression (in the object `lmFuncs`), random forests (`rfFuncs`), naive Bayes (`nbFuncs`), bagged trees (`treebagFuncs`) and functions that can be used with [`caret`](http://cran.r-project.org/web/packages/caret/index.html)'s `train` function (`caretFuncs`). The latter is useful if the model has tuning parameters that must be determined at each iteration.

<div id="example"></div>
<div id="rfeexample"></div>

## An Example
## An Example {#rfeexample}

```{r rfe_load_lib}
library(caret)
@@ -141,7 +145,9 @@ plot(lmProfile, type = c("g", "o"))

Also the resampling results are stored in the sub-object `lmProfile$resample` and can be used with several lattice functions. Univariate lattice functions (`densityplot`, `histogram`) can be used to plot the resampling distribution while bivariate functions (`xyplot`, `stripplot`) can be used to plot the distributions for different subset sizes. In the latter case, the option `returnResamp`` = "all"` in `rfeControl` can be used to save all the resampling results. Example images are shown below for the random forest model.

## Helper Functions
<div id="rfehelpers"></div>

## Helper Functions {#rfehelpers}

To use feature elimination for an arbitrary model, a set of functions must be passed to `rfe` for each of the steps in Algorithm 2.

@@ -304,7 +310,9 @@ rfRFE$selectVar

Note that if the predictor rankings are recomputed at each iteration (line 2.11) the user will need to write their own selection function to use the other ranks.

## The Example
<div id="rfeexample2"></div>

## The Example {#rfeexample2}

For random forest, we fit the same series of model sizes as the linear model. The option to save all the resampling results across subset sizes was changed for this model and are used to show the lattice plot function capabilities in the figures below.

@@ -341,8 +349,9 @@ print(plot1, split=c(1,1,1,2), more=TRUE)
print(plot2, split=c(1,2,1,2))
```

<div id="rferecipes"></div>

## Using a Recipe
## Using a Recipe {#rferecipes}

A recipe can be used to specify the model terms and any preprocessing that may be needed. Instead of using

6 changes: 3 additions & 3 deletions bookdown/20-GA.Rmd
@@ -20,7 +20,7 @@ Contents
- [Example](#gaexample)
- [Customizing the Search](#custom)
- [The Example Revisited](#example2)
- [Using Recipes](#recipes)
- [Using Recipes](#garecipes)

<div id="ga"></div>

@@ -94,7 +94,7 @@ The GA implementation in [`caret`](http://cran.r-project.org/web/packages/caret/

<div id="gaexample"></div>

## Example
## Genetic Algorithm Example {#gaexample}


Using the example from the [previous page](recursive-feature-elimination.html#example) where there are five real predictors and 40 noise predictors:
@@ -358,7 +358,7 @@ plot(rf_ga_d) + theme_bw()
The final GA found `r I(length(rf_ga_d$optVariables))` that were selected: `r paste(rf_ga_d$optVariables, sep = "", collapse = ", ")`. During resampling, the average number of predictors selected was `r round(mean(unlist(lapply(rf_ga_d$resampled_vars, length))), 1)`, indicating that the penalty on the number of predictors was effective.


<div id="recipes"></div>
<div id="garecipes"></div>

## Using Recipes

10 changes: 5 additions & 5 deletions bookdown/21-SA.Rmd
@@ -19,9 +19,9 @@ Contents
- [Simulated Annealing](#sa)
- [Internal and External Performance Estimates](#performance)
- [Basic Syntax](#syntax)
- [Example](#example)
- [Example](#saexample)
- [Customizing the Search](#custom)
- [Using Recipes](#recipes)
- [Using Recipes](#sarecipes)

<div id="sa"></div>

@@ -87,9 +87,9 @@ Some important options to `safsControl` are:

There are a few built-in sets of functions to use with `safs`: `caretSA`, `rfSA`, and `treebagSA`. The first is a simple interface to `train`. When using this, as shown above, arguments can be passed to `train` using the `...` structure and the resampling estimates of performance can be used as the internal fitness value. The functions provided by `rfSA` and `treebagSA` avoid using `train` and their internal estimates of fitness come from using the out-of-bag estimates generated from the model.

<div id="example"></div>
<div id="saexample"></div>

## Example
## Simulated Annealing Example {#saexample}


Using the example from the [previous page](recursive-feature-elimination.html#example) where there are five real predictors and 40 noise predictors.
@@ -279,7 +279,7 @@ ggplot(grid, aes(x = iter, y = prob, color = Difference)) +
While this is the default, any user-written function can be used to assign probabilities.


<div id="recipes"></div>
<div id="sarecipes"></div>

## Using Recipes

14 changes: 9 additions & 5 deletions bookdown/make_files.R
@@ -1,9 +1,13 @@
library(bookdown)
library(parallel)
library(doParallel)
cl <- makeForkCluster(parallel::detectCores(logical = TRUE))
registerDoParallel(cl)
# library(parallel)
# library(doParallel)
# cl <- makeForkCluster(parallel::detectCores(logical = TRUE))
# registerDoParallel(cl)

library(doMC)
registerDoMC(cores = parallel::detectCores(logical = TRUE))

render_book("index.Rmd", "bookdown::gitbook")

if(!interactive()) q("no")
if (!interactive())
  q("no")
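
Note on the `make_files.R` change: the switch from a fork cluster to `registerDoMC` keeps the worker count tied to `parallel::detectCores(logical = TRUE)`, which is documented to return `NA` when the count cannot be determined. The following is a minimal sketch of a more defensive registration, not code from this commit; the helper names `safe_cores` and `register_backend` are hypothetical, and the fallback behaviour is an assumption about what a build script might want.

```r
# Hypothetical helper (not part of the commit): choose a worker count safely.
# parallel::detectCores() may return NA when the core count is unknown.
safe_cores <- function(logical = TRUE) {
  n <- parallel::detectCores(logical = logical)
  if (is.na(n) || n < 1L) 1L else as.integer(n)
}

# Hypothetical helper: register doMC only when it is installed and the
# platform supports forking (i.e. not Windows). Otherwise nothing is
# registered and the function reports a sequential fallback.
register_backend <- function(cores = safe_cores()) {
  can_fork <- .Platform$OS.type != "windows"
  if (can_fork && requireNamespace("doMC", quietly = TRUE)) {
    doMC::registerDoMC(cores = cores)
    "doMC"
  } else {
    "sequential"
  }
}

backend <- register_backend()
```

If something like this were used, `render_book()` would be called afterwards exactly as in the script above; the returned string only records which path was taken.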
