[FEAT] Include Undamaged Node Classification #53

2025-05-06T05:06:27Z

nuluh commented

2025-05-06 05:06:27 +00:00

(Migrated from github.com)

Problem Statement

The current machine learning model is trained only to classify damaged nodes by damage location. It has no concept of an "undamaged" state, so when it is given data from an undamaged structure it will necessarily assign that data to one of the damage categories. This is a significant limitation.

Proposed Solution

Expand the machine learning model to include undamaged node classification by:

1. Including undamaged (healthy) structure data in the training dataset
2. Adding "undamaged" as a distinct class label in the classification scheme
3. Modifying the training pipeline to incorporate this new class
4. Retraining the model with the expanded dataset and label set
5. Evaluating the model's performance on both damaged and undamaged classifications
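
The first two steps of the proposal (adding undamaged data and a distinct label) could be sketched roughly as follows. This is only an illustration, not the repository's actual pipeline: the function name `add_undamaged_class`, the convention that label `0` means undamaged, and the toy feature vectors are all assumptions made for the example.

```python
from collections import Counter

UNDAMAGED_LABEL = 0  # assumed convention: 0 = undamaged, 1..N = damage locations


def add_undamaged_class(damaged_X, damaged_y, undamaged_X):
    """Return features and labels extended with an explicit undamaged class.

    damaged_X and undamaged_X are lists of feature vectors. damaged_y holds the
    existing damage-location labels, which must not collide with UNDAMAGED_LABEL.
    """
    if UNDAMAGED_LABEL in damaged_y:
        raise ValueError("existing labels already use the undamaged label value")
    X = list(damaged_X) + list(undamaged_X)
    y = list(damaged_y) + [UNDAMAGED_LABEL] * len(undamaged_X)
    return X, y


# Toy data (hypothetical values, for illustration only).
damaged_X = [[0.9, 0.1], [0.8, 0.3], [0.2, 0.7]]
damaged_y = [1, 1, 2]
undamaged_X = [[0.1, 0.1], [0.0, 0.2]]

X, y = add_undamaged_class(damaged_X, damaged_y, undamaged_X)
class_counts = Counter(y)  # inspect class balance before retraining
print(class_counts)
```

Checking the class counts before retraining is worthwhile because the undamaged class may be much smaller or larger than each damage class, which would skew the metrics discussed later in this issue.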

Alternatives Considered

1. Two-stage classifier: First classify damaged vs. undamaged, then identify damage location if damaged. This adds complexity but might improve specificity.
2. Anomaly detection approach: Train on only undamaged data and detect deviations. This would change the fundamental approach from classification to anomaly detection.
3. Confidence threshold: Keep current model but establish a confidence threshold below which predictions are considered "undamaged." This is simpler but less reliable.
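
To make the difference between the two-stage and confidence-threshold alternatives concrete, here is a minimal sketch of both decision rules. Everything in it is hypothetical: the function names, the `"undamaged"` label string, the `0.6` threshold, and the stand-in `toy_detector` and `toy_locator` functions, which take the place of trained models.

```python
UNDAMAGED = "undamaged"  # assumed label name


def two_stage_predict(sample, is_damaged, locate_damage):
    """Two-stage rule: run a binary detector first, then the location classifier."""
    if not is_damaged(sample):
        return UNDAMAGED
    return locate_damage(sample)


def threshold_predict(class_probs, threshold=0.6):
    """Threshold rule: fall back to 'undamaged' when the top probability is low.

    class_probs maps each damage-location label to its predicted probability.
    """
    label, prob = max(class_probs.items(), key=lambda item: item[1])
    return label if prob >= threshold else UNDAMAGED


# Stand-in models (hypothetical); real ones would be trained classifiers.
def toy_detector(sample):
    return max(sample) > 0.5


def toy_locator(sample):
    return "node_%d" % (sample.index(max(sample)) + 1)


quiet = two_stage_predict([0.1, 0.2, 0.1], toy_detector, toy_locator)
loud = two_stage_predict([0.1, 0.9, 0.1], toy_detector, toy_locator)
unsure = threshold_predict({"node_1": 0.4, "node_2": 0.35, "node_3": 0.25})
sure = threshold_predict({"node_1": 0.1, "node_2": 0.85, "node_3": 0.05})
print(quiet, loud, unsure, sure)
```

One likely reason the threshold rule is less reliable: a classifier trained only on damage classes can still output a high probability for undamaged input, in which case no threshold would catch the error.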

Component

Python Source Code

Priority

Critical (blocks progress)

Implementation Ideas

No response

Expected Benefits

1. Comprehensive classification: Model will correctly identify both damaged and undamaged structures
2. Reduced false positives: Fewer healthy structures will be incorrectly classified as damaged
3. Complete SHM system: More practical for real-world structural health monitoring applications
4. More robust evaluation: Performance metrics will reflect real-world usage scenarios
5. Better thesis contribution: Demonstrates a complete solution rather than partial capability

Additional Context

This enhancement addresses a fundamental limitation in the current model. For a structural health monitoring thesis, the ability to distinguish between damaged and undamaged states is essential for practical application. The confusion matrix and performance metrics should be updated to reflect this expanded classification task.
Consider creating a new experiment notebook specifically focused on evaluating the model's performance in distinguishing undamaged structures from various damage locations. This will provide valuable insights into the model's practical utility.

Key metrics to evaluate after implementation:

- Overall accuracy
- False positive rate (undamaged classified as damaged)
- False negative rate (damaged classified as undamaged)
- Confusion matrix including undamaged class
- Class-specific F1 scores
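
The metrics listed above could be computed along these lines. This is a dependency-free sketch under stated assumptions: label `0` is taken to mean undamaged, the helper names are made up for the example, and `y_true` / `y_pred` are hypothetical values rather than results from the model.

```python
UNDAMAGED = 0  # assumed label for the undamaged class


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def damage_detection_rates(y_true, y_pred):
    """FPR: undamaged predicted as any damage class. FNR: damaged predicted as undamaged."""
    pairs = list(zip(y_true, y_pred))
    n_undamaged = sum(1 for t, _ in pairs if t == UNDAMAGED)
    n_damaged = len(pairs) - n_undamaged
    fp = sum(1 for t, p in pairs if t == UNDAMAGED and p != UNDAMAGED)
    fn = sum(1 for t, p in pairs if t != UNDAMAGED and p == UNDAMAGED)
    fpr = fp / n_undamaged if n_undamaged else 0.0
    fnr = fn / n_damaged if n_damaged else 0.0
    return fpr, fnr


def f1_per_class(matrix, labels):
    """One-vs-rest F1 for each class, read off the confusion matrix."""
    scores = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        fp = sum(row[i] for row in matrix) - tp
        fn = sum(matrix[i]) - tp
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


# Hypothetical labels and predictions, for illustration only.
labels = [0, 1, 2]
y_true = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 0, 2, 2, 1]

matrix = confusion_matrix(y_true, y_pred, labels)
accuracy = sum(matrix[i][i] for i in range(len(labels))) / len(y_true)
fpr, fnr = damage_detection_rates(y_true, y_pred)
f1 = f1_per_class(matrix, labels)
print(matrix, accuracy, fpr, fnr, f1)
```

Note that the false positive and false negative rates here are defined at the damaged-versus-undamaged level, so a sample assigned to the wrong damage location lowers accuracy and F1 but counts as neither. If scikit-learn is already a project dependency, `sklearn.metrics.confusion_matrix` and `sklearn.metrics.f1_score` cover the matrix and F1 parts.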


nuluh commented

2025-05-06 05:13:38 +00:00

(Migrated from github.com)

Should I use only one file (zzzAU.TXT) as the undamaged data, or use all the files and decide, based on the sensor columns, which columns are used as damaged data and which are left out and treated as undamaged data?

Reference: nuluh/thesis#53