Compare commits

...

9 Commits

Author SHA1 Message Date
nuluh
41086e95ad chore: Ignore the .venv/ directory in .gitignore. The venv was created to work around the NumPy error `ValueError: numpy.ndarray size changed, may indicate binary incompatibility`. 2024-09-01 14:50:24 +07:00
nuluh
adde35ed7e feat(notebook): Normalize the data by calculating the relative value between two sensors. MinMaxScaler and StandardScaler are then applied, and the results are visualized with Seaborn's pair plot.
Closes #15
2024-09-01 14:50:04 +07:00
nuluh
b2684c23f6 feat(script): Add zero-padding to CSV filenames and include the sensor number 2024-08-27 10:11:39 +07:00
Panuluh
8a499a04fb Merge pull request #17 from nuluh/feature/csv-padding-naming
Feature/csv padding naming
2024-08-27 09:23:44 +07:00
Panuluh
118c56c12d Merge pull request #13 from nuluh/feature/10-add-labels-column-to-time-domain-feature-extraction-dataframe
feat(notebook): add 'labels' column to feature extraction dataframe
2024-08-26 09:55:46 +07:00
nuluh
79a0f82372 feat(notebook): add 'labels' column to feature extraction dataframe
Implement extraction of 'labels' from directory names and append as a new column in the dataframe during feature extraction. Adapted from the existing `build_features.py` script to enhance data usability in supervised learning models within the Jupyter notebook environment.

Closes #10
2024-08-20 15:28:19 +07:00
Panuluh
c9415c21fa Merge pull request #9 from nuluh/feature/automate-csv-file
Closes #4
2024-08-20 13:01:42 +07:00
nuluh
3860f2cc5b fix(docs): The readme.md should belong to raw data, since the script is intended to simulate raw data coming from accelerometer sensors rather than processed data, which should be generated by simulating frequency-domain data instead. 2024-08-18 10:34:22 +07:00
nuluh
553140fe3c feat(script): add zero-padding to CSV filenames and write the generated CSV output as raw data in the raw folder 2024-08-17 19:51:42 +07:00
4 changed files with 659 additions and 72 deletions
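
The zero-padding described in commits 553140fe3c and b2684c23f6 can be sketched in isolation as follows. This is a minimal illustration, not the repository's script: the helper name `padded_name` is hypothetical, while the `num_*` and `*_pad` names mirror the ones that appear in the script diff further down. Each pad width is simply the number of digits in the largest value of that field.

```python
def padded_name(damage: int, test: int, sensor: int,
                num_damages: int = 5, num_tests: int = 10,
                num_sensors: int = 2) -> str:
    """Build a CSV filename whose numeric fields are zero-padded.

    Hypothetical helper for illustration; the defaults follow the
    values shown in the script diff (5 damages, 10 tests, 2 sensors).
    """
    # Pad width = number of digits in the largest value of each field.
    damage_pad = len(str(num_damages))   # 5  -> width 1
    test_pad = len(str(num_tests))       # 10 -> width 2
    sensor_pad = len(str(num_sensors))   # 2  -> width 1
    return (
        f"D{damage:0{damage_pad}}"
        f"_TEST{test:0{test_pad}}"
        f"_{sensor:0{sensor_pad}}.csv"
    )


names = [padded_name(1, t, 1) for t in (1, 2, 10)]
print(names)  # ['D1_TEST01_1.csv', 'D1_TEST02_1.csv', 'D1_TEST10_1.csv']

# With padding, a plain lexicographic sort keeps the numeric test order.
print(sorted(names) == names)  # True
```

Stable lexicographic ordering is one common reason to pad filenames this way; the commits themselves do not state the motivation.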

2
.gitignore vendored

@@ -1,4 +1,4 @@
# Ignore CSV files in the data directory and all its subdirectories
data/**/*.csv
.venv/
*.pyc

File diff suppressed because one or more lines are too long
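
Because the notebook diff is suppressed, the normalization described in commit adde35ed7e is not visible here. The following is only a rough sketch of the idea under stated assumptions: "relative value between two sensors" is assumed to mean an element-wise ratio, the function names and sample values are hypothetical, and the two scaling functions use the standard-library formulas that scikit-learn's `MinMaxScaler` and `StandardScaler` apply with default settings rather than the scikit-learn classes themselves. The Seaborn pair plot is omitted.

```python
from statistics import fmean, pstdev


def relative_value(sensor_a, sensor_b):
    """Element-wise ratio of two sensor series.

    Assumption: the commit's 'relative value' is a ratio; the notebook
    may compute something else (for example a difference).
    """
    return [a / b for a, b in zip(sensor_a, sensor_b)]


def min_max_scale(values):
    """Rescale to [0, 1]: (x - min) / (max - min). No guard for constant input."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standard_scale(values):
    """Rescale to zero mean and unit variance using the population std."""
    mean = fmean(values)
    std = pstdev(values)  # population standard deviation (ddof = 0)
    return [(v - mean) / std for v in values]


# Made-up readings, purely for illustration.
sensor_1 = [2.0, 4.0, 6.0, 8.0]
sensor_2 = [1.0, 1.0, 2.0, 2.0]

relative = relative_value(sensor_1, sensor_2)
print(relative)                  # [2.0, 4.0, 3.0, 4.0]
print(min_max_scale(relative))   # [0.0, 1.0, 0.5, 1.0]
print(standard_scale(relative))
```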


@@ -1,8 +1,8 @@
-# Processed Data Directory
+# Raw Data Directory
 ## Overview
-This `data/processed` directory contains structured data that has been processed and formatted for analysis. Each subdirectory within `processed` represents a different level of simulated damage, and each contains multiple test files from experiments conducted under that specific damage scenario.
+This `data/raw` directory contains structured data that has been processed and formatted for analysis. Each subdirectory within `raw` represents a different level of simulated damage, and each contains multiple test files from experiments conducted under that specific damage scenario.
 ## Directory Structure


@@ -13,14 +13,23 @@ processed_path = os.path.join(base_path, "processed")
 os.makedirs(raw_path, exist_ok=True)
 os.makedirs(processed_path, exist_ok=True)
-for damage in range(1, 6): # 5 Damage levels
-    damage_folder = f"DAMAGE_{damage}"
-    damage_path = os.path.join(processed_path, damage_folder)
+# Define the number of zeros to pad
+num_damages = 5
+num_tests = 10
+num_sensors = 2
+damage_pad = len(str(num_damages))
+test_pad = len(str(num_tests))
+sensor_pad = len(str(num_sensors))
+for damage in range(1, num_damages + 1): # 5 Damage levels starts from 1
+    damage_folder = f"DAMAGE_{damage:0{damage_pad}}"
+    damage_path = os.path.join(raw_path, damage_folder)
     os.makedirs(damage_path, exist_ok=True)
     for test in range(1, 11): # 10 Tests per damage level
         for sensor in range(1, 3): # 2 Sensors per test
             # Filename for the CSV
-            csv_filename = f"D{damage}_TEST{test}.csv"
+            csv_filename = f"D{damage:0{damage_pad}}_TEST{test:0{test_pad}}_{sensor:0{sensor_pad}}.csv"
             csv_path = os.path.join(damage_path, csv_filename)
             # Generate dummy data