Commit Graph

35 Commits

Author SHA1 Message Date
nuluh
e347b63e6e fix(src): update damage base path and adjust STFT processing parameters 2025-10-16 12:32:22 +07:00
nuluh
511014d37d feat(src): implement data processing for dataset_B and export to CSV 2025-10-16 12:32:21 +07:00
nuluh
b1be3a8b6f feat(src): add grid result summary function to export DataFrame to LaTeX 2025-10-16 12:32:20 +07:00
nuluh
df38c00935 feat(src): enhance heatmap and plotting functions for sensor data visualization 2025-10-16 12:32:15 +07:00
nuluh
c2aa68d2e9 feat(src): refactor file dir 2025-10-07 06:05:17 +07:00
nuluh
d8482988ff feat(ml): add classification report generation to model evaluation to show all metrics during training 2025-08-31 13:01:04 +07:00
nuluh
e2a4c80d49 Merge branch 'feat/103-feat-inference-function' into dev 2025-08-19 06:05:32 +07:00
nuluh
4a1c0ed83e feat(src): implement inference function with damage probability calculations and visualization
Closes #103
2025-08-17 22:21:17 +07:00
nuluh
274cd60d27 refactor(src): update generate_df_tuples function signature to include type hints for better clarity 2025-08-11 18:49:41 +07:00
nuluh
9f23d82fab fix(src): correct file writing method in process.stft.process_damage_case function to fix incorrect first column name
Closes #104
2025-08-11 13:17:46 +07:00
nuluh
a8288b1426 refactor(src): enhance compute_stft function with type hints and improved documentation, and move the column renaming step from process_damage_case into compute_stft 2025-08-11 13:15:48 +07:00
nuluh
860542f3f9 refactor(src): restructure compute_stft into a pure function with explicit return values and improve clarity 2025-08-10 20:02:45 +07:00
nuluh
3cbef17b0c feat(model_selection): add timing for model training and validation processes 2025-07-28 05:20:10 +07:00
nuluh
f6c71739df refactor(ml): clean up model_selection.py by removing unused code and improving function structure 2025-07-18 19:27:46 +07:00
nuluh
18824e05c0 refactor(ml): update inference calls to use new model structure and improve clarity 2025-07-17 00:18:01 +07:00
nuluh
2504157b29 feat(src): replace convert.py with src/data_preprocessing.py and fix the prefix parameter of some functions 2025-07-02 03:25:18 +07:00
nuluh
5ba628b678 refactor(src): make compute_stft and process_damage_case pure functions that require STFT arguments to be passed explicitly 2025-07-01 14:32:52 +07:00
nuluh
c2df42cc2b feat(ml): add XGBoost model to inference options and update commented inference calls 2025-06-27 10:35:27 +07:00
nuluh
d6975b4817 feat(src): update damage base path and adjust test-run logic in damage case processing for the new undamaged-case method 2025-06-27 10:33:54 +07:00
nuluh
9921d7663b feat(src): add inference script for model evaluation 2025-06-24 14:08:38 +07:00
nuluh
5041ee3feb feat(src): add confusion matrix plotting and label percentage calculation 2025-06-24 14:06:56 +07:00
nuluh
114ab849b9 feat(src): Add confusion matrix plotting function for model evaluation 2025-06-24 00:27:15 +07:00
nuluh
a7d8f1ef56 fix(data): Fix pool mapping to include undamaged case and add csv header separator line for Excel compatibility 2025-06-18 08:25:01 +07:00
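The "csv header separator line" in the commit above is not shown in this log. A minimal sketch of one common way to do it, assuming it means the `sep=` hint line that Excel reads to pick the delimiter. The function names `write_csv_with_sep_hint` and `read_csv_skipping_sep_hint` are hypothetical and do not come from the repository:

```python
import csv
import io


def write_csv_with_sep_hint(stream, header, rows, delimiter=","):
    """Write rows as CSV, preceded by a 'sep=' line that Excel uses as a delimiter hint."""
    # Assumption: the hint line has the form "sep=<delimiter>".
    stream.write(f"sep={delimiter}\n")
    writer = csv.writer(stream, delimiter=delimiter, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)


def read_csv_skipping_sep_hint(stream):
    """Read CSV rows, dropping a leading 'sep=' line if one is present."""
    lines = stream.read().splitlines()
    delimiter = ","
    if lines and lines[0].startswith("sep="):
        # Reuse the hinted delimiter; fall back to a comma if the hint is empty.
        delimiter = lines[0][len("sep="):] or ","
        lines = lines[1:]
    return list(csv.reader(lines, delimiter=delimiter))


if __name__ == "__main__":
    buffer = io.StringIO()
    write_csv_with_sep_hint(buffer, ["time", "sensor1"], [[0, 1.5], [1, 1.7]])
    buffer.seek(0)
    print(read_csv_skipping_sep_hint(buffer))
```

Any downstream reader that is not Excel has to skip that first line, which is why the sketch pairs the writer with a reader that drops it.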
nuluh
4b0819f94e feat(notebooks): Enhance STFT notebook and model selection functionality
- Updated paths in the STFT notebook to reflect new data files.
- Improved plotting aesthetics for combined plots and added grid lines.
- Introduced a 3D spectrogram visualization for better data representation.
- Refactored model training function to include error handling and model export functionality.
- Adjusted model training calls to include export paths for saved models. Closes #90
- Added additional markdown cells for better documentation and clarity in the notebook.
2025-06-12 03:35:21 +07:00
Rifqi D. Panuluh
d151062115 Add Working Milestone with Initial Results and Model Inference (#82)
* wip: add function to create stratified train-test split from STFT data

* feat(src): implement working function for dataset B to create ready data from STFT files stft_files and add setup.py for package configuration

* feat(notebook): Update variable names for clarity, remove unused imports, and streamline data processing. Implement data concatenation using pandas concat for efficiency. Add validation steps for Dataset B and improve model training consistency across sensors.

* fix(.gitignore): add rule to ignore egg-info directories and ensure proper formatting

* docs(README): add instructions for running stft.ipynb notebook

* feat(notebook): Add evaluation metrics and confusion matrix visualizations for model predictions on Dataset B. Remove commented-out code and integrate data preparation using create_ready_data function.

---------

Co-authored-by: nuluh <dam.ar@outlook.com>
2025-05-24 01:30:10 +07:00
nuluh
a2e339a0a0 feat: Implement STFT verification for individual test runs against aggregated data 2024-12-13 16:30:06 +07:00
nuluh
2decff0cfb Closes #24
feat(stft): Implement STFT processing for vibration data with multiprocessing support, so that all the data is included in the training process instead of only `TEST1`
2024-12-13 16:29:08 +07:00
nuluh
9618714d3c feat: Prepare vibration record data from all damage cases to be merged into two variables, "signal_sensor1" and "signal_sensor2". Closes #23 2024-10-19 15:32:05 +07:00
nuluh
2f54e91197 feat: Add absolute value option to time feature extraction 2024-09-03 15:39:44 +07:00
nuluh
57c0e03a4f docs(script): Update time-domain feature extraction to skip the header row that holds the separator character info 2024-08-20 12:52:48 +07:00
nuluh
8ab934fe1c feat(features): refactor feature extraction to handle multiple files and directories
- Modify `build_features` function to support iterative processing across nested directories, enhancing the system's ability to handle larger datasets and varied input structures.
- Replace direct usage of the `FeatureExtractor` class with the `ExtractTimeFeatures` function, which now wraps this class, streamlining integration and maintenance of the feature extraction process.
- Implement `extract_numbers` function using regex to parse filenames and extract numeric identifiers, used for labels when training with SVM
- Switch output from `.npz` to `.csv` format in `build_features`, offering better compatibility with data analysis tools and readability.
- Update documentation and comments within the code to reflect changes in functionality and usage of the new feature extraction setup.

Closes #4
2024-08-20 12:52:06 +07:00
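The filename parsing and nested-directory traversal described in the commit above could be sketched as follows. This is only an illustration: the actual signature of `extract_numbers` is not shown in this log, the helper `iter_csv_files` is hypothetical, and the file name `D3_TEST12.csv` is a made-up example:

```python
import re
from pathlib import Path


def extract_numbers(filename):
    """Return every run of digits in a filename as a list of ints."""
    return [int(match) for match in re.findall(r"\d+", filename)]


def iter_csv_files(root):
    """Yield (path, numbers) for each .csv file under root, at any depth."""
    # rglob descends into nested directories; sorting keeps the order stable.
    for path in sorted(Path(root).rglob("*.csv")):
        yield path, extract_numbers(path.name)


if __name__ == "__main__":
    # Hypothetical file name; the numbers could serve as class labels.
    print(extract_numbers("D3_TEST12.csv"))
```

Returning all numeric groups leaves it to the caller to decide which one is the label, which avoids hard-coding a naming scheme that this log does not document.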
nuluh
55db5709a9 refactor(script): Add time-domain feature extraction function ExtractTimeFeatures, which returns the features as a dictionary and is called from build_features.py once for each individual .csv file. Each returned value is then appended in build_features.py.
Using a function rather than instantiating the class directly keeps the code flexible and easier to maintain.
2024-08-19 13:20:14 +07:00
nuluh
d0db65011d style 2024-08-17 11:39:46 +07:00
nuluh
a401d620eb feat(features): integrate time-domain feature extraction into data pipeline
- Implement FeatureExtractor class in time_domain_features.py for calculating statistical features from dataset columns.
- Create build_features.py script to automate feature extraction from processed data and save results in a structured format.
- Adjust build_features.py to read processed data, utilize FeatureExtractor, and save feature matrix.

This update supports enhanced analysis capabilities within the thesis-project structure, allowing for more sophisticated data processing and model training stages.

Closes #1
2024-08-12 19:45:19 +07:00
nuluh
7d39176e27 feat: Add initial time domain feature extraction class
The code changes add a new file `time_domain_features.py` that contains a `FeatureExtractor` class. This class calculates various time domain features for a given dataset. The features include mean, max, peak, peak-to-peak, RMS, variance, standard deviation, power, crest factor, form factor, pulse indicator, margin, kurtosis, and skewness.

The class takes a file path as input and reads the data from a CSV file. It assumes the data to analyze is in the first column. The calculated features are stored in a dictionary.

The commit message suggests that the purpose of the changes is to add a new class for time domain feature extraction.
2024-08-12 12:37:55 +07:00
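For reference, the features listed in the commit above can be computed with the standard library alone. This is a minimal sketch and not the repository's `FeatureExtractor`: the function name `time_domain_features` is hypothetical, and the definitions of crest factor, form factor, pulse indicator and margin are common textbook ones that may differ from those used in `time_domain_features.py`:

```python
import math


def time_domain_features(signal):
    """Return a dict of common time-domain features for a sequence of floats."""
    n = len(signal)
    if n == 0:
        raise ValueError("signal must not be empty")

    mean = sum(signal) / n
    abs_values = [abs(x) for x in signal]
    peak = max(abs_values)  # largest absolute amplitude
    abs_mean = sum(abs_values) / n
    rms = math.sqrt(sum(x * x for x in signal) / n)
    variance = sum((x - mean) ** 2 for x in signal) / n  # population variance
    std = math.sqrt(variance)
    # Squared mean of square-root amplitudes, the denominator of the margin factor.
    sqrt_amp = (sum(math.sqrt(a) for a in abs_values) / n) ** 2

    def ratio(numerator, denominator):
        # Return NaN instead of raising on an all-zero or constant signal.
        return numerator / denominator if denominator else float("nan")

    return {
        "mean": mean,
        "max": max(signal),
        "peak": peak,
        "peak_to_peak": max(signal) - min(signal),
        "rms": rms,
        "variance": variance,
        "std": std,
        "power": rms ** 2,  # mean square value
        "crest_factor": ratio(peak, rms),
        "form_factor": ratio(rms, abs_mean),
        "pulse_indicator": ratio(peak, abs_mean),
        "margin": ratio(peak, sqrt_amp),
        "kurtosis": ratio(sum((x - mean) ** 4 for x in signal) / n, variance ** 2),
        "skewness": ratio(sum((x - mean) ** 3 for x in signal) / n, std ** 3),
    }


if __name__ == "__main__":
    for name, value in time_domain_features([0.0, 1.0, 0.0, -1.0]).items():
        print(f"{name}: {value:.4f}")
```

Kurtosis here is the non-excess form (a normal distribution gives 3), and variance uses the population formula; either choice may differ from the original class.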