Enhance Thesis with Normalization References #12


2024-08-24T17:54:24Z

nuluh commented

2024-08-24 17:54:24 +00:00

(Migrated from github.com)

Problem Statement

The current thesis lacks a detailed discussion of normalization techniques, which are important for the performance and accuracy of Support Vector Machines (SVM). When working with vibration data for damage localization prediction, differences in feature scale can significantly affect the model's ability to learn. Adding references to normalization methods will strengthen the theoretical foundation of the thesis and align it with best practices in the field.

Context from Research

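According to Section 2.2 of Hsu et al.'s "A Practical Guide to Support Vector Classification" (https://www.semanticscholar.org/paper/A-Practical-Guide-to-Support-Vector-Classification-Hsu-Chang/8b9cd90af0631bbed04c0718230f0faed1eca209), scaling (often referred to as normalization) plays a critical role in SVM performance. The authors state that the main advantage of scaling is that features with larger numeric ranges do not dominate those with smaller ranges. A further advantage is that it avoids numerical difficulties when kernel values are computed from inner products of feature vectors. Without scaling, large-range features could dominate the learning process, leading to suboptimal decision boundaries.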

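The section recommends linearly scaling each attribute to a fixed range such as [-1, +1] or [0, 1], and using the same scaling for training and test data. Two scaling methods are relevant to this thesis:

1. Min-Max Scaling: Linearly transforms each feature to a fixed range, often [0, 1], preserving the relative ordering and spacing of its values. This is the kind of range-based scaling the guide recommends.
2. Z-Score Normalization: Subtracts each feature's mean and divides by its standard deviation, giving the feature a mean of 0 and a standard deviation of 1. It is a widely used alternative to range-based scaling.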

The choice of scaling method can depend on the nature of the features and the specific problem domain.
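
The two methods above can be sketched in a few lines of plain Python. This is only an illustration, not code from the thesis repository: the function names `min_max_scale` and `z_score_scale` and the sample feature values are hypothetical, and the use of the population standard deviation is an assumption.

```python
from statistics import mean, pstdev


def min_max_scale(values, lo=0.0, hi=1.0):
    """Linearly map one feature's values onto the range [lo, hi]."""
    v_min, v_max = min(values), max(values)
    span = v_max - v_min
    if span == 0:
        # Constant feature: there is no spread to rescale, so map to lo.
        return [lo for _ in values]
    return [lo + (v - v_min) * (hi - lo) / span for v in values]


def z_score_scale(values):
    """Shift one feature to mean 0 and scale it to standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (assumption)
    if sigma == 0:
        # Constant feature: every value equals the mean.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical feature columns with very different numeric ranges.
rms_acceleration = [0.02, 0.05, 0.03, 0.10]
peak_frequency_hz = [120.0, 310.0, 250.0, 480.0]

for name, column in [("rms", rms_acceleration), ("freq", peak_frequency_hz)]:
    print(name, min_max_scale(column), z_score_scale(column))
```

After either transformation the two columns occupy comparable numeric ranges, which is the property the guide's scaling argument relies on.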

Suggested Writings

In the revised thesis, include the following text in the section discussing feature engineering:

### Feature Scaling in Damage Localization Prediction

In the context of damage localization using SVM, it is essential to preprocess the input features through scaling. As highlighted by Hsu et al. (2003), scaling helps in mitigating the dominance of features with larger numeric ranges, thereby allowing the SVM to establish more balanced and effective decision boundaries.

For this thesis, two scaling techniques are considered:

- **Min-Max Scaling**: This method linearly scales each input feature to a fixed range, typically [0, 1]. It is particularly useful when the features have a known, bounded range. Because the mapping is linear, the relative spacing between values is preserved, but the result is sensitive to outliers, since the minimum and maximum define the range.

- **Z-Score Normalization**: This approach subtracts each feature's mean and divides by its standard deviation, so that every feature has a mean of 0 and a standard deviation of 1. It does not change the shape of a feature's distribution. This technique is effective when features have different units or widely differing spreads.

Given the nature of vibration data in structural health monitoring, where the magnitude of sensor readings can vary significantly, Z-Score Normalization is particularly suitable. It places every feature on a comparable scale so that no single feature dominates the decision function, which can improve the accuracy of damage localization predictions.

This addition will demonstrate an understanding of the importance of feature scaling in SVM and reinforce the technical depth of the thesis.
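
When the scaling is implemented, the guide cited above also stresses that training and test data must be scaled in the same way. The following is a minimal sketch of that idea for Z-Score Normalization, assuming features are stored as rows of numbers. The helpers `fit_z_score` and `apply_z_score` and the sample rows are hypothetical and are not taken from the thesis code.

```python
from statistics import mean, pstdev


def fit_z_score(train_rows):
    """Return one (mean, std) pair per feature, computed from training rows only."""
    columns = list(zip(*train_rows))
    return [(mean(col), pstdev(col)) for col in columns]


def apply_z_score(rows, params):
    """Scale rows using previously fitted (mean, std) pairs."""
    scaled = []
    for row in rows:
        scaled.append([
            # Constant training features (std of 0) are mapped to 0.0.
            (x - mu) / sigma if sigma > 0 else 0.0
            for x, (mu, sigma) in zip(row, params)
        ])
    return scaled


# Hypothetical rows: [rms_acceleration, peak_frequency_hz]
train = [[0.02, 120.0], [0.05, 310.0], [0.03, 250.0], [0.10, 480.0]]
test = [[0.04, 290.0]]

params = fit_z_score(train)                 # statistics come from training data
train_scaled = apply_z_score(train, params)
test_scaled = apply_z_score(test, params)   # test data reuses the same statistics
```

In a scikit-learn workflow, the same pattern is usually obtained by placing `StandardScaler` and `SVC` in a single pipeline, so that the scaler is fitted only on the training split.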



Reference: nuluh/thesis#12