feat(stft): Implement STFT processing for vibration data with multiprocessing support, so that the training process uses all of the data instead of only `TEST1`
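A minimal sketch of the idea, not the committed implementation: each signal gets a Hann-windowed STFT, and a process pool spreads the signals across workers. The names `stft_magnitude` and `process_all`, the parameters `nperseg`, `hop`, and `workers`, and the one-signal-per-file layout are all assumptions made for illustration. The DFT is written out in pure Python to keep the example dependency-free; a real pipeline would more likely call `scipy.signal.stft` or `numpy.fft.rfft`.

```python
import cmath
import math
from multiprocessing import Pool


def stft_magnitude(signal, nperseg=8, hop=4):
    """Return one list of one-sided DFT magnitudes per windowed frame."""
    # Periodic Hann window of length nperseg.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / nperseg) for n in range(nperseg)]
    frames = []
    # Slide over the signal; trailing samples that do not fill a frame are dropped.
    for start in range(0, len(signal) - nperseg + 1, hop):
        segment = [signal[start + n] * window[n] for n in range(nperseg)]
        spectrum = []
        for k in range(nperseg // 2 + 1):  # bins 0 .. Nyquist
            coeff = sum(
                segment[n] * cmath.exp(-2j * math.pi * k * n / nperseg)
                for n in range(nperseg)
            )
            spectrum.append(abs(coeff))
        frames.append(spectrum)
    return frames


def process_all(signals, workers=2):
    """Hypothetical driver: compute the STFT of every signal in parallel."""
    with Pool(processes=workers) as pool:
        return pool.map(stft_magnitude, signals)
```

On platforms that start workers with `spawn`, `process_all` should be called from under an `if __name__ == "__main__":` guard.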
Introduce a Python script for transforming QUGS 2D grid structure data into a simplified 1D beam format suitable for SVM-based damage detection. The script efficiently slices original CSV files into smaller, manageable sets, correlating specific damage scenarios with their corresponding sensor data. This change addresses the challenge of retaining critical damage localization information during the data conversion process, ensuring high-quality, relevant data for 1D analysis.
Closes #20
Extract labels from directory names and append them as a new `labels` column in the DataFrame during feature extraction. The logic is adapted from the existing `build_features.py` script so that the data can be used directly by supervised learning models in the Jupyter notebook environment.
Closes #10
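The labeling step might look roughly like the sketch below. It assumes a layout of `<root>/<label>/<file>.csv` and uses plain dictionaries from the standard `csv` module in place of a DataFrame; the function name `load_with_labels` and the column name `label` are hypothetical.

```python
import csv
from pathlib import Path


def load_with_labels(root):
    """Read every CSV under root and tag each row with its parent directory name."""
    rows = []
    for csv_path in sorted(Path(root).rglob("*.csv")):
        label = csv_path.parent.name  # assumption: the directory name is the label
        with csv_path.open(newline="") as handle:
            for row in csv.DictReader(handle):
                row["label"] = label  # appended as an extra column
                rows.append(row)
    return rows
```

With pandas, the equivalent step would be assigning the directory name to a new column after reading each file.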
This commit adds a new file, `.vscode/launch.json`, which contains the configuration for launching the Python debugger. The configuration includes the necessary attributes such as the debugger type, request type, program file, console type, and command-line arguments. This configuration allows developers to easily debug Python files in the integrated terminal.
- Modify `build_features` function to support iterative processing across nested directories, enhancing the system's ability to handle larger datasets and varied input structures.
- Replace direct usage of the `FeatureExtractor` class with the `ExtractTimeFeatures` function, which wraps the class and simplifies integration and maintenance of the feature extraction process.
- Implement the `extract_numbers` function, which uses a regex to parse filenames and extract numeric identifiers; these serve as labels when training the SVM.
- Switch output from `.npz` to `.csv` format in `build_features`, offering better compatibility with data analysis tools and readability.
- Update documentation and comments within the code to reflect changes in functionality and usage of the new feature extraction setup.
Closes #4
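For illustration, a regex-based `extract_numbers` could be as small as the sketch below. The signature and the list-of-ints return type are assumptions; the commit only states that numeric identifiers are parsed from filenames.

```python
import re


def extract_numbers(filename):
    """Return every run of digits in filename as a list of ints."""
    return [int(match) for match in re.findall(r"\d+", filename)]


# A hypothetical filename following the Dx_TESTy.csv pattern:
print(extract_numbers("D3_TEST12.csv"))  # prints [3, 12]
```

Which of the extracted numbers becomes the SVM label (damage level, test number, or both) depends on the naming scheme and is not shown here.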
Update the README.md file in the data/processed directory to provide clearer instructions on how to load the data from the desired Dx_TESTy.csv file. This change enhances the usability of the data files for analysis.
- Create a Python script to generate CSV files in a structured folder hierarchy under `data/processed` with specific damage levels and tests.
- Add a `.gitignore` file to exclude CSV files from Git tracking, enhancing data privacy and reducing repository size.
- Include a `README.md` in the `data` directory that documents the directory structure, the file contents, and their intended use.
Closes #7
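One way such a generator could be structured is sketched below. The layout `D<damage>/D<damage>_TEST<test>.csv`, the function name `write_processed`, and the `make_rows` callback are hypothetical; the commit does not spell out the exact hierarchy.

```python
import csv
from pathlib import Path


def write_processed(base_dir, damage_levels, tests, make_rows):
    """Write one CSV per (damage level, test) pair and return the created paths."""
    written = []
    for damage in damage_levels:
        folder = Path(base_dir) / f"D{damage}"
        folder.mkdir(parents=True, exist_ok=True)
        for test in tests:
            target = folder / f"D{damage}_TEST{test}.csv"
            with target.open("w", newline="") as handle:
                # make_rows supplies the header and data rows for this file.
                csv.writer(handle).writerows(make_rows(damage, test))
            written.append(target)
    return written
```

A call such as `write_processed("data/processed", [1, 2], [1, 2, 3], make_rows)` would then produce six files in two folders.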
This commit adds a new script `start.sh` that automates the process of processing raw data, building features, and training a model. The script uses Python scripts from the `src` directory to perform these tasks. The processed data is saved in the `data/processed` directory, the feature matrix is saved in the `data/features` directory, and the trained model is saved in the `models` directory.
The purpose of these changes is to streamline the data processing and model training workflow, making it easier to reproduce and iterate on the results.
This commit adds code to the `03_feature_extraction.ipynb` notebook to print time-domain features. The features include mean, max, peak, peak-to-peak, RMS, variance, standard deviation, power, crest factor, form factor, pulse indicator, margin, kurtosis, and skewness. The features are calculated using the `FeatureExtractor` class and displayed in a pandas DataFrame.
This commit adds the "python.analysis.extraPaths" setting to the VSCode settings.json file. The setting includes the "./code/src/features" directory as an additional path for Python analysis. This change improves the analysis capabilities within the VSCode environment.
Closes #3
Introduce a new testing script that generates mockup data and applies the `FeatureExtractor` class to calculate and display features. This test script assists in verifying the functionality of the feature extraction methods with controlled input data.
- Implement `FeatureExtractor` class in `time_domain_features.py` for calculating statistical features from dataset columns.
- Create `build_features.py` script to automate feature extraction from processed data and save results in a structured format.
- Adjust `build_features.py` to read processed data, utilize `FeatureExtractor`, and save feature matrix.
This update supports enhanced analysis capabilities within the thesis-project structure, allowing for more sophisticated data processing and model training stages.
Closes #1
The code changes add a new file `time_domain_features.py` that contains a `FeatureExtractor` class. This class calculates various time domain features for a given dataset. The features include mean, max, peak, peak-to-peak, RMS, variance, standard deviation, power, crest factor, form factor, pulse indicator, margin, kurtosis, and skewness.
The class takes a file path as input and reads the data from a CSV file. It assumes the data to analyze is in the first column. The calculated features are stored in a dictionary.
The purpose of the changes is to add a new class for time-domain feature extraction.
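The listed features can be sketched as a single function over a list of samples. This is an illustration under assumptions, not the `FeatureExtractor` code itself: the formulas are commonly used definitions (population variance, crest factor as peak over RMS, and so on), the function name `time_domain_features` is hypothetical, and CSV reading is left out.

```python
import math


def time_domain_features(samples):
    """Compute time-domain features for a non-constant list of numbers."""
    n = len(samples)
    mean = sum(samples) / n
    abs_values = [abs(x) for x in samples]
    peak = max(abs_values)
    mean_abs = sum(abs_values) / n
    mean_square = sum(x * x for x in samples) / n
    rms = math.sqrt(mean_square)
    variance = sum((x - mean) ** 2 for x in samples) / n  # population variance
    std = math.sqrt(variance)
    # Squared mean of the square-rooted absolute values, used by the margin factor.
    root_amplitude = (sum(math.sqrt(a) for a in abs_values) / n) ** 2
    # Note: a constant or all-zero signal would divide by zero below.
    return {
        "mean": mean,
        "max": max(samples),
        "peak": peak,
        "peak_to_peak": max(samples) - min(samples),
        "rms": rms,
        "variance": variance,
        "std": std,
        "power": mean_square,
        "crest_factor": peak / rms,
        "form_factor": rms / mean_abs,
        "pulse_indicator": peak / mean_abs,
        "margin": peak / root_amplitude,
        "kurtosis": sum((x - mean) ** 4 for x in samples) / (n * variance ** 2),
        "skewness": sum((x - mean) ** 3 for x in samples) / (n * std ** 3),
    }
```

Returning a dictionary mirrors the description above, where the calculated features are stored in a dictionary keyed by feature name.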