DOC Add note on bias induced by dropping categories in OneHotE… (scik…
ogrisel authored Mar 13, 2020
1 parent bd36ada commit 6ee1597
Showing 1 changed file with 4 additions and 0 deletions.
4 changes: 4 additions & 0 deletions sklearn/preprocessing/_encoders.py
@@ -193,6 +193,10 @@ class OneHotEncoder(_BaseEncoder):
features cause problems, such as when feeding the resulting data
into a neural network or an unregularized regression.
However, dropping one category breaks the symmetry of the original
representation and can therefore induce a bias in downstream models,
for instance for penalized linear classification or regression models.
- None : retain all features (the default).
- 'first' : drop the first category in each feature. If only one
category is present, the feature will be dropped entirely.
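
The bias described in the added note can be checked on a toy problem. The sketch below is not part of the commit. It assumes a single hypothetical feature with three balanced categories, a noise-free target, and `Ridge(alpha=1.0)` as a stand-in for "penalized linear regression"; the helper name `ridge_predictions` is made up for illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.preprocessing import OneHotEncoder

# Toy data (an assumption for illustration): one categorical feature with
# three balanced categories and a target that is symmetric around "b".
X = np.array([["a"], ["a"], ["b"], ["b"], ["c"], ["c"]])
y = np.array([0.0, 0.0, 1.0, 1.0, 2.0, 2.0])


def ridge_predictions(drop):
    """Fit Ridge on the one-hot encoded feature; return one prediction per category."""
    encoder = OneHotEncoder(drop=drop)
    X_enc = encoder.fit_transform(X).toarray()  # densify the default sparse output
    model = Ridge(alpha=1.0).fit(X_enc, y)
    return model.predict(X_enc)[::2]  # rows 0, 2, 4 correspond to "a", "b", "c"


pred_full = ridge_predictions(None)       # keep all three columns
pred_drop_a = ridge_predictions("first")  # drop category "a"
pred_drop_c = ridge_predictions(["c"])    # drop category "c" instead

print("no drop :", pred_full)
print("drop 'a':", pred_drop_a)
print("drop 'c':", pred_drop_c)
```

With all categories retained, the penalty shrinks every category toward the target mean in the same way, so the prediction for "b" stays at the mean. Once a category is dropped, it becomes the unpenalized baseline absorbed by the intercept, and the fitted predictions depend on which category was dropped. An unpenalized least-squares fit would not show this dependence on these data.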