This repository contains MATLAB scripts, Simulink models, and data files for process monitoring and fault detection using the KTB1 simulator. This repository is designed for developing and implementing machine learning models to classify operational states in biomanufacturing processes, specifically using surrogate modeling and fault detection techniques.
- MATLAB Scripts: Machine learning model implementations, data classification scripts, and dataset preparation.
- Simulink Models: System configurations for the KTB1 simulator, including control and fault-emulation models.
- Data Files: Datasets for normal and faulty operational conditions, as well as the combined dataset used for training and testing.
-
classifyData.m: Main script for running machine learning classification on the dataset. This script evaluates whether the operational data comes from a normal or faulty state using models trained on historical data.
-
CSTR.m: MATLAB script related to a Continuous Stirred Tank Reactor (CSTR) model, likely used for simulating and analyzing dynamic behaviors in biomanufacturing.
-
Dataset.m: Script for managing and pre-processing datasets. This script handles data loading, cleaning, and any transformations needed to make the data suitable for machine learning.
-
DT.m: Script implementing a Decision Tree (DT) classifier. This file contains code for setting up, training, and testing a decision tree model on the given datasets.
-
Faulty.m: Script for generating or managing faulty datasets. This script defines or introduces anomalies and faults in the data, simulating non-ideal operating conditions for model training.
-
KNN.m: Script for implementing a k-Nearest Neighbors (KNN) classifier. This code trains a KNN model and evaluates its performance in identifying faults versus normal operation.
-
KTB1_CL.slx: Simulink model for the closed-loop control configuration of the KTB1 system. This model is used to simulate system behavior under feedback control conditions.
-
KTB1_FE.slx: Simulink model for fault emulation (FE) in the KTB1 system. This model introduces faults or disturbances into the system, allowing for testing and validation of fault detection mechanisms.
-
LR.m: Script implementing a Logistic Regression (LR) classifier. This file trains and tests a logistic regression model on the combined dataset to classify normal and faulty data points.
-
Normal.m: Script for managing or generating normal operation datasets. This script processes data collected under ideal operating conditions, serving as a baseline for fault detection.
-
Tank_A.m and Tank_B.m: MATLAB scripts for simulating individual tanks in the process. These files likely define the parameters and behaviors of the tanks within the KTB1 model and may be referenced within larger simulation workflows.
-
OM_combined_dataset.mat: Combined dataset containing both normal and faulty operational data. This dataset is used to train and test machine learning models for fault classification.
-
OM_faulty.mat: Dataset containing data exclusively from faulty operations, providing a contrast to normal operations for model training.
-
OM_normal.mat: Dataset containing data exclusively from normal operations, representing baseline operational conditions.
-
Setup: Open MATLAB and ensure that all necessary files are in the working directory. Load the KTB1 Simulink models (
KTB1_CL.slx
andKTB1_FE.slx
) as needed for simulations. -
Data Generation:
- Run Normal.m to simulate and save data under normal conditions, stored as
OM_normal.mat
. - Run Faulty.m to simulate and save data under faulty conditions, stored as
OM_faulty.mat
. - Use Dataset.m to combine
OM_normal.mat
andOM_faulty.mat
into a single dataset,OM_combined_dataset.mat
, for machine learning training and testing.
- Run Normal.m to simulate and save data under normal conditions, stored as
-
Model Training and Classification:
- Choose a classification model script, such as LR.m (Logistic Regression), DT.m (Decision Tree), or KNN.m (k-Nearest Neighbors).
- Run the selected script to train the model on the combined dataset (
OM_combined_dataset.mat
). - Use classifyData.m to evaluate the model’s classification accuracy on normal versus faulty data.
-
Simulink Simulation:
- Open KTB1_CL.slx for closed-loop simulations.
- Open KTB1_FE.slx for fault emulation simulations.
- Use these models in tandem with MATLAB scripts for real-time monitoring and classification of process states.
- Each model script can be run independently to test various machine learning algorithms and assess classification accuracy.
- Ensure MATLAB paths are set correctly to access all files.
- Additional tuning may be required in model parameters within the Simulink environment for specific biomanufacturing processes.
Please feel free to submit issues, feature requests, or contributions. We welcome collaboration to improve the model and extend its applications in process monitoring and fault detection.
Contact Person: