thesis/latex/chapters/id/03_methodology/steps/preprocessing/data_augmentation.tex
2025-08-07 22:49:04 +00:00

We now introduce a simple ``data-augmentation'' scheme across repeated tests. For the \(i\)-th replicate of scenario \(j\), define
\[
\mathbf{c}_{j}^{(i)}
\;=\;
\Bigl[S_{0+j}^{(i)},\,S_{5+j}^{(i)},\,S_{10+j}^{(i)},\,S_{15+j}^{(i)},\,S_{20+j}^{(i)},\,S_{25+j}^{(i)}\Bigr]^{T}
\;\in\mathbb{R}^{6}\!,
\]
where \(S_{k}^{(i)}\) is the time-frequency feature vector of the \(k\)-th sensor (after STFT and log-scaling) from the \(i\)-th replicate of scenario \(j\).
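As a concrete illustration of this indexing, consider \(j=2\) (a value chosen here only as an example). The definition above then selects every fifth sensor, starting from sensor 2:
\[
\mathbf{c}_{2}^{(i)}
\;=\;
\Bigl[S_{2}^{(i)},\,S_{7}^{(i)},\,S_{12}^{(i)},\,S_{17}^{(i)},\,S_{22}^{(i)},\,S_{27}^{(i)}\Bigr]^{T},
\]
that is, the sensor indices \(k = 5m + 2\) for \(m = 0,\dots,5\), where the auxiliary index \(m\) is introduced only for this illustration.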
For each fixed scenario \(j\), collect the five replicates into the set
\[
\mathcal{D}^{(j)}
=\bigl\{\mathbf{c}_{j}^{(1)},\,\mathbf{c}_{j}^{(2)},\,\mathbf{c}_{j}^{(3)},\,\mathbf{c}_{j}^{(4)},\,\mathbf{c}_{j}^{(5)}\bigr\},
\]
so \(|\mathcal{D}^{(j)}|=5\). Across all six scenarios, the total augmented dataset is
\[
\mathcal{D}
=\bigcup_{j=0}^{5}\mathcal{D}^{(j)}
=\bigl\{\mathbf{c}_{j}^{(i)}: j=0,\dots,5,\;i=1,\dots,5\bigr\},
\]
with \(\lvert\mathcal{D}\rvert = 6 \times 5 = 30\) samples.
Each \(\mathbf{c}_{j}^{(i)}\) hence represents one ``column-based'' damage sample,
and the collection \(\mathcal{D}\) serves as the input set for subsequent classification.