This commit adds a new script, `start.sh`, that runs the full workflow in one step: processing raw data, building features, and training a model. Each stage calls a Python script from the `src` directory. The processed data is saved in the `data/processed` directory, the feature matrix in the `data/features` directory, and the trained model in the `models` directory. The goal is to streamline the data processing and model training workflow so that results are easier to reproduce and iterate on.
#!/bin/bash

# Process raw data
python src/data/process_dataset.py data/raw/ data/processed/

# Build features
python src/features/build_features.py data/processed/processed_data.csv data/features/feature_matrix.npz

# Train model
python src/models/train_model.py data/features/feature_matrix.npz models/svm_model.pkl
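
As committed, the script runs each command regardless of whether the previous one succeeded, so a failure while processing data would not stop feature building or training. One possible way to make the stages fail fast is sketched below. This is only an illustration, not part of the commit: the `run_step` and `run_pipeline` functions are hypothetical names introduced here, and the paths are copied unchanged from `start.sh`.

```bash
#!/bin/bash
# Sketch only: a fail-fast variant of the pipeline script.
# Exit on the first failing command, on unset variables, and on pipe failures.
set -euo pipefail

# Hypothetical helper (not in the commit): print a stage label, then run
# the command given as the remaining arguments and return its exit status.
run_step() {
    local label="$1"
    shift
    echo "==> ${label}"
    "$@"
}

# Hypothetical wrapper around the three commands from start.sh.
run_pipeline() {
    run_step "Process raw data" \
        python src/data/process_dataset.py data/raw/ data/processed/
    run_step "Build features" \
        python src/features/build_features.py data/processed/processed_data.csv data/features/feature_matrix.npz
    run_step "Train model" \
        python src/models/train_model.py data/features/feature_matrix.npz models/svm_model.pkl
}
```

Adding a final `run_pipeline` line would execute the three stages in order. With `set -e` in effect, a non-zero exit from an earlier stage stops the script before a later stage runs on missing or stale inputs.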