GitHub - vishalbelsare/compose at 811d0844a1986ff145bd75b25904b1e51370db10

"Build better training examples in a fraction of the time."

Compose is a machine learning tool for automated prediction engineering. It allows you to structure prediction problems and generate labels for supervised learning. An end user defines an outcome of interest by writing a labeling function, then runs a search to automatically extract training examples from historical data. The result is then provided to Featuretools for automated feature engineering and subsequently to EvalML for automated machine learning. The workflow of an applied machine learning engineer then becomes prediction engineering with Compose, feature engineering with Featuretools, and machine learning with EvalML.

By automating the early stage of the machine learning pipeline, our end user can easily define a task and solve it. See the documentation for more information.

Install

Compose is available on PyPI and Conda-forge for Python 3.6 or later.

`pip`

To install from PyPI, run the command:

pip install composeml

`conda`

To install from Conda-forge, run the command:

conda install -c conda-forge composeml

Example

Will a customer spend more than 300 in the next hour of transactions?

In this example, we automatically generate new training examples from a historical dataset of transactions.

import composeml as cp
df = cp.demos.load_transactions()
df = df[df.columns[:7]]
df.head()

transaction_id	session_id	transaction_time	product_id	amount	customer_id	device
298	1	2014-01-01 00:00:00	5	127.64	2	desktop
10	1	2014-01-01 00:09:45	5	57.39	2	desktop
495	1	2014-01-01 00:14:05	5	69.45	2	desktop
460	10	2014-01-01 02:33:50	5	123.19	2	tablet
302	10	2014-01-01 02:37:05	5	64.47	2	tablet

First, we represent the prediction problem with a labeling function and a label maker.

def total_spent(ds):
    return ds['amount'].sum()

label_maker = cp.LabelMaker(
    target_entity="customer_id",
    time_index="transaction_time",
    labeling_function=total_spent,
    window_size="1h",
)
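
To make the roles of the window and the labeling function concrete, here is a minimal plain-Python sketch that does not use Compose. It takes the five sample rows shown above, selects one customer's transactions inside a single one-hour window, and sums the amounts. The helper `slice_window` and the half-open window boundary (start included, end excluded) are assumptions made for illustration, not Compose's documented behavior.

```python
from datetime import datetime, timedelta

# The five sample rows shown above, reduced to the columns this sketch needs.
rows = [
    {"customer_id": 2, "transaction_time": datetime(2014, 1, 1, 0, 0, 0), "amount": 127.64},
    {"customer_id": 2, "transaction_time": datetime(2014, 1, 1, 0, 9, 45), "amount": 57.39},
    {"customer_id": 2, "transaction_time": datetime(2014, 1, 1, 0, 14, 5), "amount": 69.45},
    {"customer_id": 2, "transaction_time": datetime(2014, 1, 1, 2, 33, 50), "amount": 123.19},
    {"customer_id": 2, "transaction_time": datetime(2014, 1, 1, 2, 37, 5), "amount": 64.47},
]


def total_spent(window_rows):
    """Plain-Python counterpart of the labeling function: sum the amounts."""
    return sum(row["amount"] for row in window_rows)


def slice_window(all_rows, customer_id, start, size):
    """Hypothetical helper: one customer's rows with start <= time < start + size."""
    end = start + size
    return [
        row
        for row in all_rows
        if row["customer_id"] == customer_id and start <= row["transaction_time"] < end
    ]


# One-hour window for customer 2, starting at midnight on 2014-01-01.
window = slice_window(rows, 2, datetime(2014, 1, 1), timedelta(hours=1))
total = total_spent(window)
print(len(window), round(total, 2))
```

Only the first three sample rows fall inside this window. The full demo dataset contains more transactions than the five rows listed, so the total here is not necessarily the value Compose computes for the same window.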

Then, we run a search to automatically generate the training examples.

label_times = label_maker.search(
    df.sort_values('transaction_time'),
    num_examples_per_instance=2,
    minimum_data='2014-01-01',
    drop_empty=False,
    verbose=False,
)

label_times = label_times.threshold(300)
label_times.head()

customer_id	time	total_spent
1	2014-01-01 00:00:00	True
1	2014-01-01 01:00:00	True
2	2014-01-01 00:00:00	False
2	2014-01-01 01:00:00	False
3	2014-01-01 00:00:00	False
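
The label times above suggest how the search parameters and the threshold interact: each customer gets two examples, starting at `minimum_data` and spaced one window apart, and each total becomes a boolean. The sketch below reproduces that pattern in plain Python without Compose. It assumes that consecutive windows do not overlap and that the threshold is a strict "greater than" comparison, consistent with the question "more than 300". The functions `window_starts` and `threshold` are hypothetical stand-ins, and the totals are made up for illustration.

```python
from datetime import datetime, timedelta


def window_starts(minimum_data, window_size, num_examples):
    """Start time of each consecutive, non-overlapping window (assumed spacing)."""
    return [minimum_data + i * window_size for i in range(num_examples)]


def threshold(total, value):
    """Binary label: True only when the total is strictly above the value."""
    return total > value


starts = window_starts(datetime(2014, 1, 1), timedelta(hours=1), 2)

# Made-up totals for illustration only; they are not taken from the demo dataset.
example_totals = [310.25, 300.0]

for start, total in zip(starts, example_totals):
    print(start, threshold(total, 300))
```

Under the strict comparison assumed here, a total of exactly 300 yields `False`.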

We now have labels that are ready to use in Featuretools to generate features.

Support

The Innovation Labs open source community is happy to provide support to users of Compose. Project support can be found in three places depending on the type of question:

For usage questions, use Stack Overflow with the composeml tag.
For bugs, issues, or feature requests, start a GitHub issue.
For discussion regarding development on the core library, use Slack.

Citing Compose

Compose is built upon a newly defined part of the machine learning process: prediction engineering. If you use Compose, please consider citing this paper: James Max Kanter, Owen Gillespie, Kalyan Veeramachaneni. Label, Segment, Featurize: a cross domain framework for prediction engineering. IEEE DSAA 2016.

BibTeX entry:

@inproceedings{kanter2016label,
  title={Label, segment, featurize: a cross domain framework for prediction engineering},
  author={Kanter, James Max and Gillespie, Owen and Veeramachaneni, Kalyan},
  booktitle={2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA)},
  pages={430--439},
  year={2016},
  organization={IEEE}
}

Acknowledgements

The open source development has been supported in part by DARPA's Data-Driven Discovery of Models (D3M) program.

Innovation Labs

Compose has been developed and open sourced by Innovation Labs. We developed Compose to enable flexible definition of the machine learning task. To see the other open source projects we're working on, visit Innovation Labs.
