Commit Graph

11 Commits

Author SHA1 Message Date
Rifqi D. Panuluh
d151062115 Add Working Milestone with Initial Results and Model Inference (#82)
* wip: add function to create stratified train-test split from STFT data

* feat(src): implement working function for dataset B to create ready data from STFT files (`stft_files`) and add setup.py for package configuration

* feat(notebook): Update variable names for clarity, remove unused imports, and streamline data processing. Implement data concatenation using pandas concat for efficiency. Add validation steps for Dataset B and improve model training consistency across sensors.

* fix(.gitignore): add rule to ignore egg-info directories and ensure proper formatting

* docs(README): add instructions for running stft.ipynb notebook

* feat(notebook): Add evaluation metrics and confusion matrix visualizations for model predictions on Dataset B. Remove commented-out code and integrate data preparation using create_ready_data function.

---------

Co-authored-by: nuluh <dam.ar@outlook.com>
2025-05-24 01:30:10 +07:00
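The stratified train-test split mentioned in commit d151062115 can be illustrated with a minimal standard-library sketch. The function name `stratified_split`, its parameters, and the toy labels are assumptions made for illustration only; the repository's actual helper and its STFT data layout are not shown in this log.

```python
import random
from collections import defaultdict


def stratified_split(samples, labels, test_fraction=0.2, seed=42):
    """Split samples so each label keeps roughly the same share in both sets.

    Hypothetical helper: not the repository's function.
    """
    # Group sample indices by their class label.
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    rng = random.Random(seed)  # fixed seed for a reproducible split
    train_idx, test_idx = [], []
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        # Take at least one test sample per class; a class with a single
        # sample therefore ends up only in the test set.
        n_test = max(1, round(len(indices) * test_fraction))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])

    train = [(samples[i], labels[i]) for i in train_idx]
    test = [(samples[i], labels[i]) for i in test_idx]
    return train, test


# Toy usage: 10 samples of class 0 and 10 of class 1.
toy_samples = list(range(20))
toy_labels = [0] * 10 + [1] * 10
train_set, test_set = stratified_split(toy_samples, toy_labels)
```

Splitting per class rather than over the whole pool keeps the class proportions of the test set close to those of the full dataset, which matters when some damage cases have fewer recordings than others.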
nuluh
a2e339a0a0 feat: Implement STFT verification for individual test runs against aggregated data 2024-12-13 16:30:06 +07:00
nuluh
2decff0cfb Closes #24
feat(stft): Implement STFT processing for vibration data with multiprocessing support, so that all the data is included in the training process instead of only `TEST1`
2024-12-13 16:29:08 +07:00
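As a rough illustration of the idea in commit 2decff0cfb, the sketch below computes a short-time Fourier transform frame by frame and optionally distributes several signals over worker processes. It is a sketch under stated assumptions: `stft_magnitude`, `stft_all`, the window size, and the hop length are hypothetical, and the plain-Python DFT stands in for whatever STFT routine the repository actually uses.

```python
import cmath
import math
from concurrent.futures import ProcessPoolExecutor


def stft_magnitude(signal, window_size=8, hop=4):
    """Return one list of magnitudes (bins 0..window_size//2) per frame."""
    # Periodic Hann window to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / window_size)
              for n in range(window_size)]
    frames = []
    for start in range(0, len(signal) - window_size + 1, hop):
        segment = [signal[start + n] * window[n] for n in range(window_size)]
        spectrum = []
        for k in range(window_size // 2 + 1):
            # Direct DFT of the windowed segment at frequency bin k.
            coeff = sum(
                segment[n] * cmath.exp(-2j * cmath.pi * k * n / window_size)
                for n in range(window_size)
            )
            spectrum.append(abs(coeff))
        frames.append(spectrum)
    return frames


def stft_all(signals, workers=0):
    """Apply stft_magnitude to every signal, serially or in a process pool."""
    if workers <= 1:
        return [stft_magnitude(s) for s in signals]
    # On platforms that spawn processes, call this from under
    # `if __name__ == "__main__":`.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(stft_magnitude, signals))


# Toy signal: a sine wave centred on frequency bin 2 of an 8-sample window.
toy_signal = [math.sin(2 * math.pi * 2 * n / 8) for n in range(16)]
toy_frames = stft_magnitude(toy_signal)
```

Each signal is processed independently, which is what makes the work easy to spread across processes; the serial path (`workers=0`) returns the same result and is convenient for debugging.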
nuluh
9618714d3c feat: Prepare vibration record data for all damage cases to be merged into two variables, "signal_sensor1" and "signal_sensor2". Closes #23 2024-10-19 15:32:05 +07:00
nuluh
2f54e91197 feat: Add absolute value option to time feature extraction 2024-09-03 15:39:44 +07:00
nuluh
57c0e03a4f docs(script): Update time-domain feature extraction to skip the header row containing separator character info 2024-08-20 12:52:48 +07:00
nuluh
8ab934fe1c feat(features): refactor feature extraction to handle multiple files and directories
- Modify `build_features` function to support iterative processing across nested directories, enhancing the system's ability to handle larger datasets and varied input structures.
- Replace direct usage of `FeatureExtractor` class with `ExtractTimeFeatures` function, which now acts as a wrapper to include this class, facilitating streamlined integration and maintenance of feature extraction processes.
- Implement `extract_numbers` function using regex to parse filenames and extract numeric identifiers, which are used as labels when training with SVM.
- Switch output from `.npz` to `.csv` format in `build_features`, offering better compatibility with data analysis tools and readability.
- Update documentation and comments within the code to reflect changes in functionality and usage of the new feature extraction setup.

Closes #4
2024-08-20 12:52:06 +07:00
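The filename parsing described in commit 8ab934fe1c could look roughly like the sketch below. The signature, the return type, and the example filename are assumptions; the log only states that a function named `extract_numbers` uses a regex to pull numeric identifiers out of filenames.

```python
import re


def extract_numbers(filename):
    """Return every run of digits in a filename as a list of integers.

    Sketch only: the repository's version may return a different shape.
    """
    return [int(match) for match in re.findall(r"\d+", filename)]


# Hypothetical filename; the real naming scheme is not shown in this log.
ids = extract_numbers("D_1_TEST2.csv")  # -> [1, 2]
```

Returning all digit runs, rather than only the first, leaves it to the caller to decide which number serves as the class label.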
nuluh
55db5709a9 refactor(script): Add an `ExtractTimeFeatures` function for time-domain feature extraction. It returns the features as a dictionary and is called from build_features.py once for each individual .csv file; each returned value is then appended in build_features.py.
Using a function rather than the class directly keeps the code flexible and easier to maintain.
2024-08-19 13:20:14 +07:00
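The pattern described in commit 55db5709a9 (one feature dictionary per input file, appended to a collection and later written out) can be sketched as follows. Everything here is hypothetical: `mean_and_peak` stands in for the real per-file extractor, and in-memory lists stand in for the .csv files that build_features.py reads.

```python
import csv
import io


def mean_and_peak(values):
    """Hypothetical stand-in for a per-file feature function returning a dict."""
    return {
        "mean": sum(values) / len(values),
        "peak": max(abs(v) for v in values),
    }


def build_feature_table(named_signals, extractor):
    """Call the extractor once per input and append each result as a row."""
    rows = []
    for name, values in named_signals.items():
        row = {"file": name}
        row.update(extractor(values))
        rows.append(row)
    return rows


def rows_to_csv(rows):
    """Serialize the rows to CSV text, using the first row's keys as header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()),
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Toy inputs keyed by hypothetical filenames.
toy_rows = build_feature_table(
    {"a.csv": [1.0, -3.0], "b.csv": [2.0, 2.0]}, mean_and_peak
)
toy_csv = rows_to_csv(toy_rows)
```

Passing the extractor in as a plain function means the table-building loop does not need to know which features are computed, which is one way to read the flexibility argument in the commit message.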
nuluh
d0db65011d style 2024-08-17 11:39:46 +07:00
nuluh
a401d620eb feat(features): integrate time-domain feature extraction into data pipeline
- Implement FeatureExtractor class in time_domain_features.py for calculating statistical features from dataset columns.
- Create build_features.py script to automate feature extraction from processed data and save results in a structured format.
- Adjust build_features.py to read processed data, utilize FeatureExtractor, and save feature matrix.

This update supports enhanced analysis capabilities within the thesis-project structure, allowing for more sophisticated data processing and model training stages.

Closes #1
2024-08-12 19:45:19 +07:00
nuluh
7d39176e27 feat: Add initial time domain feature extraction class
The code changes add a new file `time_domain_features.py` that contains a `FeatureExtractor` class. This class calculates various time domain features for a given dataset. The features include mean, max, peak, peak-to-peak, RMS, variance, standard deviation, power, crest factor, form factor, pulse indicator, margin, kurtosis, and skewness.

The class takes a file path as input and reads the data from a CSV file. It assumes the data to analyze is in the first column. The calculated features are stored in a dictionary.

The commit message suggests that the purpose of the changes is to add a new class for time domain feature extraction.
2024-08-12 12:37:55 +07:00
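For reference, the features listed in commit 7d39176e27 can be computed with the minimal sketch below. The formulas are common textbook definitions and are assumptions here: the log does not show how `FeatureExtractor` defines peak, margin, or pulse indicator, nor whether it uses population or sample variance. The function name `time_domain_features` is likewise hypothetical.

```python
import math


def time_domain_features(values):
    """Compute common time-domain statistics for one signal column.

    Assumed definitions (not confirmed by the repository):
      peak = max |x|, power = mean of x^2, population variance,
      pulse indicator = peak / mean |x|, margin = peak / (mean sqrt|x|)^2.
    Raises ZeroDivisionError for constant or all-zero signals.
    """
    x = list(values)
    n = len(x)
    mean = sum(x) / n
    abs_mean = sum(abs(v) for v in x) / n
    peak = max(abs(v) for v in x)
    rms = math.sqrt(sum(v * v for v in x) / n)
    variance = sum((v - mean) ** 2 for v in x) / n
    std = math.sqrt(variance)
    sqrt_amplitude = (sum(math.sqrt(abs(v)) for v in x) / n) ** 2
    return {
        "mean": mean,
        "max": max(x),
        "peak": peak,
        "peak_to_peak": max(x) - min(x),
        "rms": rms,
        "variance": variance,
        "std": std,
        "power": rms ** 2,
        "crest_factor": peak / rms,
        "form_factor": rms / abs_mean,
        "pulse_indicator": peak / abs_mean,
        "margin": peak / sqrt_amplitude,
        "kurtosis": sum((v - mean) ** 4 for v in x) / (n * std ** 4),
        "skewness": sum((v - mean) ** 3 for v in x) / (n * std ** 3),
    }


# Toy square-like signal for a quick sanity check.
toy_features = time_domain_features([1.0, -1.0, 1.0, -1.0])
```

Returning a dictionary keyed by feature name matches the commit's description of how the calculated features are stored, and makes it straightforward to turn each file's result into one row of a table.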