This is the code repository for Active Machine Learning with Python, published by Packt.
Refine and elevate data quality over quantity with active learning
Building accurate machine learning models requires quality data, and lots of it. However, for most teams, assembling massive datasets is time-consuming, expensive, or downright impossible. Written by Margaux Masson-Forsythe, a seasoned ML engineer and advocate for surgical data science and climate AI advancements, this hands-on guide to active machine learning demonstrates how to train robust models with just a fraction of the data using Python's powerful active learning tools.
This book covers the following exciting features:
- Master the fundamentals of active machine learning
- Understand query strategies for optimal model training with minimal data
- Tackle class imbalance, concept drift, and other data challenges
- Evaluate and analyze active learning model performance
- Integrate active learning libraries into workflows effectively
- Optimize workflows for human labelers
- Explore the finest active learning tools available today
If you feel this book is for you, get your copy today!
All of the code is organized into folders.
The code will look like the following:
y_true = np.array(small_dataset['label'])
x_true = np.array(small_dataset['text'])
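
To give a feel for the kind of query strategy the book discusses, here is a minimal sketch of least-confidence sampling, written in plain Python so it runs without extra dependencies. It is not taken from the book's code: the function name `least_confidence_query` and the probability values are hypothetical, chosen only for illustration. It assumes a model has already produced class probabilities for a pool of unlabeled samples.

```python
def least_confidence_query(probabilities, n_queries):
    """Return indices of the n_queries samples whose most likely class
    has the lowest predicted probability (least-confidence sampling)."""
    # Uncertainty of a sample = 1 - probability of its most likely class.
    uncertainty = [1.0 - max(row) for row in probabilities]
    # Rank sample indices from most uncertain to least uncertain.
    ranked = sorted(
        range(len(uncertainty)),
        key=lambda i: uncertainty[i],
        reverse=True,
    )
    return ranked[:n_queries]


# Hypothetical predicted class probabilities for four unlabeled samples
# (made-up numbers, for illustration only).
pool_probabilities = [
    [0.95, 0.05],
    [0.55, 0.45],
    [0.80, 0.20],
    [0.51, 0.49],
]

# Pick the two samples the model is least sure about.
query_indices = least_confidence_query(pool_probabilities, n_queries=2)
print(query_indices)  # [3, 1]
```

The selected indices point at the samples that would be sent to a human labeler first. In practice, the probabilities would come from a trained classifier rather than a hard-coded list.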
Following is what you need for this book: Ideal for data scientists and ML engineers aiming to maximize model performance while minimizing costly data labeling, this book is your guide to optimizing ML workflows and prioritizing quality over quantity. Whether you’re a technical practitioner or team lead, you’ll benefit from the proven methods presented in this book to slash data requirements and iterate faster. Basic Python proficiency and familiarity with machine learning concepts such as datasets and convolutional neural networks are all you need to get started.
With the following software and hardware list, you can run all code files present in the book (Chapters 1-7).
Chapter | Software required | OS required |
---|---|---|
1-7 | Python 3.10.12+ | Any OS |
1-7 | Jupyter or Google Colab notebook | Any OS |
Margaux Masson-Forsythe is a skilled machine learning engineer and advocate for advancements in surgical data science and climate AI. As the director of machine learning at Surgical Data Science Collective, she builds computer vision models to detect surgical tools in videos and track procedural motions. Masson-Forsythe manages a multidisciplinary team and oversees model implementation, data pipelines, infrastructure, and product delivery. With a background in computer science and expertise in machine learning, computer vision, and geospatial analytics, she has worked on projects related to reforestation, deforestation monitoring, and crop yield prediction.