Accuracy (TBD): Measures the proportion of correct predictions out of all predictions made.
Precision (TBD) and Recall (TBD): Precision is the fraction of predicted positives that are actually positive, while recall is the fraction of actual positives that the model correctly identifies.
F1 Score (TBD): Harmonic mean of precision and recall, providing a balance between the two.
Mean Squared Error (MSE) (TBD): Measures the average squared difference between predicted and actual values.
Mean Absolute Error (MAE) (TBD): Measures the average absolute difference between predicted and actual values.
Cumulative Reward (TBD): Total reward accumulated over episodes, reflecting the model's ability to achieve long-term goals.
Episode Length (TBD): Average length of episodes, indicating how quickly the model reaches its objectives.
Success Rate (TBD): Proportion of episodes where the model successfully completes the task or achieves the goal.
Learning Rate (TBD): The rate at which the model's performance improves over the course of training, often visualized as a learning curve. This is distinct from the optimizer's step-size hyperparameter of the same name.
Exploration vs. Exploitation Balance (TBD): Measures the balance between the model's exploration of new strategies and exploitation of known strategies.
Human Feedback Utilization Rate (TBD): Proportion of actions adjusted based on human feedback.
Improvement Rate from Human Feedback (TBD): Improvement in model performance attributable to human feedback.
Feedback Accuracy (TBD): Accuracy of human feedback in guiding the model towards better performance.
Training Time (TBD): Total time taken to train the model.
Inference Time (TBD): Time taken to make predictions during gameplay.
Resource Utilization (TBD CPU, TBD GPU): CPU/GPU usage, memory consumption, and other computational resource metrics during training and inference.
Frame Extraction Rate (TBD fps): Number of frames extracted per second from gameplay videos.
Annotation Speed (TBD hours per video): Time taken to annotate frames, both manually and automatically.
Data Augmentation Impact (TBD improvement): Effect of data augmentation techniques on model performance.
Model Scalability (TBD games): Ability of the model to handle increasing amounts of data and complexity in gameplay scenarios.
Benchmark Comparison (Top TBD%): Comparison of the model's performance against existing benchmarks or models.
User Engagement (TBD% feedback incorporation): How much human feedback influenced the model, and how users interacted with it.
Visualization of Results (TBD graphs): Graphs and plots showing training progress, reward accumulation, and other metrics.
Code Quality (Well-documented): Documentation, readability, and maintainability of the codebase.
Reproducibility (TBD% reproducible): Ability of others to reproduce the results using the project's code and documentation.
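
Several of the metrics above (accuracy, precision, recall, F1, MSE, MAE, cumulative reward, episode length, success rate) have simple closed-form definitions. The following is a minimal sketch of how they could be computed, not the project's actual evaluation code. It assumes binary labels with 1 as the positive class, and it assumes each episode is logged as a dict with the hypothetical keys "rewards" (per-step rewards) and "success" (a boolean); all function names are illustrative.

```python
from statistics import mean


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def regression_errors(y_true, y_pred):
    """Mean squared error and mean absolute error."""
    diffs = [p - t for t, p in zip(y_true, y_pred)]
    return {
        "mse": mean([d * d for d in diffs]),
        "mae": mean([abs(d) for d in diffs]),
    }


def episode_summary(episodes):
    """Summarize episodes logged as {"rewards": [...], "success": bool}.

    The dict layout is an assumption made for this sketch.
    """
    returns = [sum(ep["rewards"]) for ep in episodes]
    lengths = [len(ep["rewards"]) for ep in episodes]
    successes = [1 if ep["success"] else 0 for ep in episodes]
    return {
        "mean_cumulative_reward": mean(returns),
        "mean_episode_length": mean(lengths),
        "success_rate": sum(successes) / len(episodes),
    }


if __name__ == "__main__":
    print(classification_metrics([1, 0, 1, 1, 0, 1], [1, 0, 0, 1, 1, 1]))
    print(regression_errors([1.0, 2.0, 3.0], [1.5, 2.0, 2.0]))
    print(episode_summary([
        {"rewards": [1.0, 0.0, 2.0], "success": True},
        {"rewards": [0.5], "success": False},
    ]))
```

Returning 0.0 when a denominator is zero is one common convention for precision, recall, and F1; a real evaluation might prefer to report those cases as undefined instead.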