- Consolidated import statements for pandas and matplotlib.
- Updated STFT plotting for Sensor 1 and Sensor 2 datasets with improved visualization using pcolormesh.
- Enhanced subplot organization for better clarity in visual representation.
- Added titles and adjusted layout for all plots.
- Updated paths in the STFT notebook to reflect new data files.
- Improved plotting aesthetics for combined plots and added grid lines.
- Introduced a 3D spectrogram visualization for better data representation.
- Refactored model training function to include error handling and model export functionality.
- Adjusted model training calls to include export paths for saved models. Closes #90
- Added additional markdown cells for better documentation and clarity in the notebook.
* wip: add function to create stratified train-test split from STFT data
* feat(src): implement a working function that creates ready data for Dataset B from STFT files (`stft_files`), and add setup.py for package configuration
* feat(notebook): Update variable names for clarity, remove unused imports, and streamline data processing. Implement data concatenation using pandas concat for efficiency. Add validation steps for Dataset B and improve model training consistency across sensors.
* fix(.gitignore): add rule to ignore egg-info directories and ensure proper formatting
* docs(README): add instructions for running stft.ipynb notebook
* feat(notebook): Add evaluation metrics and confusion matrix visualizations for model predictions on Dataset B. Remove commented-out code and integrate data preparation using create_ready_data function.
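A stratified train-test split keeps the class proportions of the full dataset in both partitions. The following is a minimal standard-library sketch of that idea, not the function added in this change: the name `stratified_split`, its parameters, and the index-based return value are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_size=0.2, seed=42):
    """Return (train_idx, test_idx) with per-class proportions preserved.

    Hypothetical helper: the real notebook function may differ.
    """
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Classes with a single sample stay entirely in the training set;
        # otherwise at least one sample goes to the test set.
        n_test = max(1, round(len(indices) * test_size)) if len(indices) > 1 else 0
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])

    return sorted(train_idx), sorted(test_idx)


labels = [0] * 10 + [1] * 20
train_idx, test_idx = stratified_split(labels, test_size=0.2)
```

The split is done per class rather than over the whole index range, so a minority class cannot end up missing from the test set by chance.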
---------
Co-authored-by: nuluh <dam.ar@outlook.com>
feat(stft): Implement STFT processing for vibration data with multiprocessing support, so that all data is included in training instead of only `TEST1`
Implement extraction of 'labels' from directory names and append as a new column in the dataframe during feature extraction. Adapted from the existing `build_features.py` script to enhance data usability in supervised learning models within the Jupyter notebook environment.
Closes #10
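Deriving a label from a directory name can be sketched as below. This is an illustration only: the helper names `label_from_path` and `attach_labels`, the example paths, and the assumption that the label is the immediate parent directory are hypothetical, and the notebook appends the result to a dataframe rather than to a list of dicts.

```python
from pathlib import Path


def label_from_path(file_path):
    """Use the name of the file's parent directory as its label (assumed layout)."""
    return Path(file_path).parent.name


def attach_labels(file_paths):
    """Pair each file name with the label taken from its directory."""
    return [
        {"file": Path(p).name, "label": label_from_path(p)}
        for p in file_paths
    ]


# Hypothetical paths; the real directory layout may differ.
rows = attach_labels([
    "data/raw/DAMAGE_1/TEST1.csv",
    "data/raw/DAMAGE_2/TEST1.csv",
])
```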
This commit adds a new file, `.vscode/launch.json`, which contains the configuration for launching the Python debugger. The configuration includes the necessary attributes such as the debugger type, request type, program file, console type, and command-line arguments. This configuration allows developers to easily debug Python files in the integrated terminal.
- Modify `build_features` function to support iterative processing across nested directories, enhancing the system's ability to handle larger datasets and varied input structures.
- Replace direct usage of `FeatureExtractor` class with `ExtractTimeFeatures` function, which now acts as a wrapper to include this class, facilitating streamlined integration and maintenance of feature extraction processes.
- Implement `extract_numbers` function using regex to parse filenames and extract numeric identifiers, which serve as labels when training the SVM.
- Switch output from `.npz` to `.csv` format in `build_features`, offering better compatibility with data analysis tools and readability.
- Update documentation and comments within the code to reflect changes in functionality and usage of the new feature extraction setup.
Closes #4
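For reference, regex-based extraction of numeric identifiers from a filename might look like the sketch below. Only the function name `extract_numbers` comes from this change; the signature, the pattern `\d+`, the list-of-ints return type, and the example filename are assumptions.

```python
import re


def extract_numbers(filename):
    """Return every run of digits in the filename as an integer (assumed behavior)."""
    return [int(match) for match in re.findall(r"\d+", filename)]


# Hypothetical filename; real files may follow a different naming scheme.
ids = extract_numbers("D3_TEST12.csv")
```

Returning all digit runs leaves it to the caller to decide which position holds the class label.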
This commit adds a new script, `start.sh`, that automates processing raw data, building features, and training a model. The script runs Python scripts from the `src` directory to perform these tasks. The processed data is saved in the `data/processed` directory, the feature matrix in the `data/features` directory, and the trained model in the `models` directory.
The purpose of these changes is to streamline the data processing and model training workflow, making it easier to reproduce and iterate on the results.