Python code in this repo targets Python 3.9.12. Run a script with:
pipenv run python xxx.py
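If the virtualenv has not been created yet, a minimal setup sketch looks like the following; the exact dependency list is an assumption, check the repo's Pipfile:
pipenv --python 3.9            # create the virtualenv with the expected interpreter
pipenv install numpy lightgbm  # assumed dependencies for the snippets below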
import numpy as np
import lightgbm as lgb

data = np.random.rand(500, 10)  # 500 entities, each contains 10 features
label = np.random.randint(2, size=500)  # binary target
train_data = lgb.Dataset(data, label=label, feature_name=[f'c{i}' for i in range(1, 11)], categorical_feature=['c3'])
LightGBM can use categorical features as input directly; they do not need to be converted to one-hot encoding, and using them directly is much faster than one-hot encoding (about 8x speed-up).
Note: convert your categorical features to int type before constructing the Dataset.
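For example, a string-valued column can be mapped to integer codes before the Dataset is built. The sketch below uses pandas for the conversion; the column names and values ('x1', 'x2', 'color') are made up for illustration:
import numpy as np
import pandas as pd
import lightgbm as lgb

df = pd.DataFrame({
    'x1': np.random.rand(500),
    'x2': np.random.rand(500),
    'color': np.random.choice(['red', 'green', 'blue'], size=500),  # raw string categories
})
df['color'] = df['color'].astype('category').cat.codes  # encode as non-negative ints
label = np.random.randint(2, size=500)
train_data = lgb.Dataset(df, label=label, categorical_feature=['color'])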
Weights can also be assigned to each training instance, either when the Dataset is constructed:
w = np.random.rand(500)  # one weight per entity
train_data = lgb.Dataset(data, label=label, weight=w)
or after construction, via set_weight():
train_data = lgb.Dataset(data, label=label)
w = np.random.rand(500)
train_data.set_weight(w)
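Once the Dataset is ready, training follows with lgb.train(). The sketch below is a minimal binary-classification run; the parameter values are illustrative assumptions, not tuned settings:
params = {'objective': 'binary', 'metric': 'binary_logloss', 'verbosity': -1}
bst = lgb.train(params, train_data, num_boost_round=10)
pred = bst.predict(np.random.rand(5, 10))  # predicted probabilities for 5 new entities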