Add 'labels' Column to Time-domain Feature Extraction DataFrame #10
Description
We need to include a 'labels' column in our feature extraction DataFrame to facilitate downstream tasks such as training machine learning models. Currently, the DataFrame generated by the build_features function only contains extracted features, and lacks any form of labeling for these features.
Expected Behavior
The DataFrame should include a 'labels' column in which each row holds the label of the dataset from which that row's features were extracted.
Current Behavior
The current implementation generates a DataFrame without a 'labels' column. This absence prevents us from using the DataFrame directly in supervised learning scenarios. Here's what the head of the features DataFrame looks like:
Possible Solution
Modify the build_features function to append a 'labels' column to the DataFrame. This column could be derived from the directory names or a specific pattern in the filenames, depending on how our data is structured.
Steps to Reproduce
1. Run the build_features script with the current setup.
2. Observe that the generated combined_features.csv does not include a 'labels' column.
Context (Environment)
The feature extraction is crucial for our model training, and having labeled data is necessary for any supervised learning approach. The absence of labels impacts our ability to directly train models using the extracted features.
Possible Implementation
Here's a potential snippet for how we might modify the build_features function:
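A minimal sketch of the idea, not the project's actual code. It assumes that each row of the features DataFrame comes from exactly one source file, that the file paths are available in the same order as the rows, and that the label is the name of the file's parent directory. The helper names label_from_path and add_labels are hypothetical and would need to be adapted to how build_features really collects its inputs.

```python
from __future__ import annotations

from pathlib import Path
from typing import TYPE_CHECKING, Sequence

if TYPE_CHECKING:  # pandas is only needed here for the type hints
    import pandas as pd


def label_from_path(path: str | Path) -> str:
    """Assumed rule: the label is the name of the file's parent directory."""
    return Path(path).parent.name


def add_labels(
    features: pd.DataFrame, source_paths: Sequence[str | Path]
) -> pd.DataFrame:
    """Return a copy of `features` with a 'labels' column, one label per row.

    Assumes row i of `features` was extracted from source_paths[i].
    """
    if len(features) != len(source_paths):
        raise ValueError("expected one source path per feature row")
    labels = [label_from_path(p) for p in source_paths]
    # DataFrame.assign returns a new frame and leaves the input untouched.
    return features.assign(labels=labels)
```

If the labels are encoded in the filenames rather than the directory names, only label_from_path would need to change (for example, to a regular expression over Path(path).stem). Calling add_labels at the end of build_features, just before combined_features.csv is written, would keep the feature extraction logic itself unchanged.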