[FEAT] Implement k-fold validation #102

Open
opened 2025-07-27 22:08:37 +00:00 by nuluh · 0 comments
nuluh commented 2025-07-27 22:08:37 +00:00 (Migrated from github.com)

Problem Statement

Model evaluation currently relies on a single train/test split, which produces a high-variance performance estimate and can mask overfitting. A more robust evaluation method is needed to assess model generalization.

Proposed Solution

Integrate k-fold cross-validation into the model evaluation pipeline so that performance is averaged over multiple splits. This gives a more reliable estimate of model accuracy and reduces the variance caused by data partitioning. Additionally, add a visualization (e.g., boxplot, line plot, or bar chart) of per-fold metrics (such as accuracy or loss) to show their distribution and variance across the cross-validation folds.
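At its simplest, the averaging described above can be done with scikit-learn's `cross_val_score`. A minimal sketch, using a placeholder dataset and model (the real notebook would substitute its own):

```python
# Minimal sketch: average accuracy over k folds with scikit-learn.
# Dataset and model below are placeholders for the notebook's own.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# cv=5 -> 5-fold cross-validation; returns one score per fold
scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
print(f"mean accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Reporting the mean together with the standard deviation is what makes the estimate more informative than a single split.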

Alternatives Considered

Leave-one-out cross-validation and a simple train/test split were considered, but k-fold offers a better balance between computational cost and robustness.

Component

Jupyter Notebook

Priority

Critical (blocks progress)

Implementation Ideas

  • Use scikit-learn's KFold or StratifiedKFold utilities
  • Refactor current evaluation code to loop over k folds
  • Aggregate results and report mean/variance of model metrics
  • Make the number of folds configurable by the user
  • Use matplotlib or seaborn to visualize per-fold performance (e.g., accuracy, F1 score), such as with boxplots or line plots for interpretability

Expected Benefits

This will provide a more reliable and generalizable estimate of model performance, making the thesis results stronger and more credible. Visualization will help to quickly interpret how stable the model is across folds and identify any outlier behavior.

Additional Context

K-fold validation is a standard ML practice and would be beneficial for comparing different models or feature sets. Visualizing the results will add clarity to the thesis and strengthen the analysis.


Reference: nuluh/thesis#102