Need Script to Generate CSV Files and Manage .gitignore #7

Closed
opened 2024-08-14 11:09:31 +00:00 by nuluh · 0 comments
nuluh commented 2024-08-14 11:09:31 +00:00 (Migrated from github.com)

Description

We need a Python script that automatically generates dummy CSV files in a specific folder structure for testing purposes. Each CSV should contain a timestamp column and a value column, and the files should be stored in a hierarchical folder structure under the data directory.

Requirements

  1. CSV Generation:

    • Each CSV file should have two columns: Time and Value.
    • Time should be a timestamp with millisecond precision.
    • Value should be a random float.
    • Generate ~10 rows per CSV.
    • The folder structure should include a main directory (data), with subdirectories for raw and processed. The processed directory should further include directories for different damage levels (DAMAGE_1 to DAMAGE_5), each containing 10 test CSV files (TEST1 to TEST10).
  2. Folder and File Naming:

    • Processed files should be saved as Dx_TESTy.csv where x is the damage number and y is the test number.
  3. .gitignore Configuration:

    • Ensure that all CSV files are ignored by Git to prevent them from being pushed to the repository.
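
A minimal sketch of requirements 1 and 2 above, using only the standard library. The function name `generate_dummy_data`, its parameters, the 10 ms spacing between rows, and the `YYYY-MM-DD HH:MM:SS.mmm` timestamp format are assumptions made for illustration; the issue only asks for millisecond precision and a random float.

```python
import csv
import random
from datetime import datetime, timedelta
from pathlib import Path


def generate_dummy_data(base_dir="data", n_damage=5, n_tests=10, n_rows=10, seed=None):
    """Create <base_dir>/raw and <base_dir>/processed/DAMAGE_x/Dx_TESTy.csv.

    Returns the list of CSV paths that were written.
    """
    rng = random.Random(seed)  # seed only affects the Value column
    base = Path(base_dir)
    (base / "raw").mkdir(parents=True, exist_ok=True)

    written = []
    for damage in range(1, n_damage + 1):
        damage_dir = base / "processed" / f"DAMAGE_{damage}"
        damage_dir.mkdir(parents=True, exist_ok=True)
        for test in range(1, n_tests + 1):
            path = damage_dir / f"D{damage}_TEST{test}.csv"
            start = datetime.now()
            with path.open("w", newline="") as f:
                writer = csv.writer(f)
                writer.writerow(["Time", "Value"])
                for i in range(n_rows):
                    # Assumed spacing: one row every 10 ms
                    stamp = start + timedelta(milliseconds=10 * i)
                    time_str = stamp.isoformat(sep=" ", timespec="milliseconds")
                    writer.writerow([time_str, rng.random()])
            written.append(path)
    return written
```

Calling `generate_dummy_data()` from the repository root would write 50 files under `data/processed` with the default arguments; passing a different `base_dir` keeps experiments out of the real data directory.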

Example Folder Structure

├───data
│   ├───processed
│   │   └───DAMAGE_1
│   │           D1_TEST1.csv
│   │
│   └───raw

Expected Outcome

  • A Python script that sets up the described folder structure and populates it with the specified CSV files.
  • A .gitignore file configured to ignore all CSV files.
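
The `.gitignore` part could be handled from the same script. The sketch below appends a `*.csv` pattern only if it is not already listed, so it is safe to run repeatedly; the helper name `ensure_gitignore_pattern` and its parameters are hypothetical.

```python
from pathlib import Path


def ensure_gitignore_pattern(repo_root=".", pattern="*.csv"):
    """Append `pattern` to <repo_root>/.gitignore unless it is already listed.

    Returns True if the file was changed, False otherwise.
    """
    gitignore = Path(repo_root) / ".gitignore"
    text = gitignore.read_text() if gitignore.exists() else ""
    if pattern in (line.strip() for line in text.splitlines()):
        return False
    # Start on a fresh line if the existing file lacks a trailing newline
    prefix = "" if not text or text.endswith("\n") else "\n"
    with gitignore.open("a") as f:
        f.write(prefix + pattern + "\n")
    return True
```

A bare `*.csv` pattern matches CSV files at any depth in the repository. It does not affect files Git already tracks; those would need to be untracked first (for example with `git rm --cached`).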

This setup will facilitate the generation and management of test data without cluttering our repository with large data files.



Reference: nuluh/thesis#7