feat(notebooks): Enhance STFT notebook and model selection functionality

- Updated paths in the STFT notebook to reflect new data files.
- Improved plotting aesthetics for combined plots and added grid lines.
- Introduced a 3D spectrogram visualization for better data representation.
- Refactored model training function to include error handling and model export functionality.
- Adjusted model training calls to include export paths for saved models.

Closes #90

- Added additional markdown cells for better documentation and clarity in the notebook.
fix(latex): fix image path for flowchart in methodology section
2025-06-12 03:35:21 +07:00 · 2025-06-04 15:59:13 +07:00 · 2025-06-04 15:53:57 +07:00 · 2025-06-04 15:53:35 +07:00 · 2025-06-04 15:31:00 +07:00 · 2025-06-04 11:27:56 +07:00
13 changed files with 893 additions and 362 deletions
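
The commit message mentions a training helper with error handling and model export, and the notebook calls it as `train_and_evaluate_model(model, name, "sensor1", x_train1, y_train, x_test1, y_test, export=...)`, then reads `res['accuracy']`. The body of that helper is not visible in this part of the diff, so the following is only a minimal sketch of what a function with the same call shape could look like. The returned keys other than `accuracy` and `model`, the `.pkl` file name, and the use of the standard-library `pickle` module (instead of the `joblib` files the notebook later loads) are assumptions made to keep the example dependency-free.

```python
import pickle
from pathlib import Path


def train_and_evaluate_model(model, model_name, sensor_label,
                             x_train, y_train, x_test, y_test, export=None):
    """Fit `model`, score it on the test split, and optionally save it.

    Hypothetical sketch: a result dict is always returned, so one failing
    model does not abort a loop over several models.
    """
    result = {
        "model": model_name,
        "sensor": sensor_label,
        "accuracy": 0.0,      # percentage, stays 0.0 if training fails
        "success": False,
    }
    try:
        model.fit(x_train, y_train)
        predictions = model.predict(x_test)

        # Accuracy as a percentage, computed without extra dependencies.
        correct = sum(1 for p, t in zip(predictions, y_test) if p == t)
        result["accuracy"] = 100.0 * correct / len(y_test)

        if export is not None:
            # Assumed layout: <export>/<model_name>.pkl
            export_dir = Path(export)
            export_dir.mkdir(parents=True, exist_ok=True)
            model_path = export_dir / f"{model_name}.pkl"
            with open(model_path, "wb") as fh:
                pickle.dump(model, fh)
            result["model_path"] = str(model_path)

        result["success"] = True
    except Exception as exc:
        # Record the failure instead of raising, so callers can continue.
        result["error"] = str(exc)
    return result
```

Any estimator exposing `fit` and `predict` (for example scikit-learn's `SVC`) can be passed in. Returning a dict rather than a bare number is what lets the notebook collect `results_sensor1` and `results_sensor2` and plot them by model name later.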
--- a/.gitignore
+++ b/.gitignore
@@ -1,4 +1,5 @@
 # Ignore CSV files in the data directory and all its subdirectories
 data/**/*.csv
 .venv/
-*.pyc
+*.pyc
+*.egg-info/
--- a/.vscode/settings.json
+++ b/.vscode/settings.json
@@ -1,3 +1,7 @@
 {
-  "python.analysis.extraPaths": ["./code/src/features"]
+  "python.analysis.extraPaths": [
+    "./code/src/features",
+    "${workspaceFolder}/code/src"
+  ],
+  "jupyter.notebookFileRoot": "${workspaceFolder}/code"
 }
--- a/README.md
+++ b/README.md
@@ -16,3 +16,8 @@ The repository is private and access is restricted only to those who have been g
 All contents of this repository, including the thesis idea, code, and associated data, are copyrighted © 2024 by Rifqi Panuluh. Unauthorized use or duplication is prohibited.

 [LICENSE](https://github.com/nuluh/thesis?tab=License-1-ov-file#readme)
+
+## How to Run `stft.ipynb`
+
+1. Run `pip install -e .` from the project root.
+2. Open and run the notebook.
--- a/code/notebooks/stft.ipynb
+++ b/code/notebooks/stft.ipynb
@@ -17,8 +17,8 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "sensor1 = pd.read_csv('D:/thesis/data/converted/raw/DAMAGE_1/DAMAGE_1_TEST1_01.csv',sep=',')\n",
-    "sensor2 = pd.read_csv('D:/thesis/data/converted/raw/DAMAGE_1/DAMAGE_1_TEST1_02.csv',sep=',')"
+    "sensor1 = pd.read_csv('D:/thesis/data/converted/raw/DAMAGE_1/DAMAGE_0_TEST1_01.csv',sep=',')\n",
+    "sensor2 = pd.read_csv('D:/thesis/data/converted/raw/DAMAGE_1/DAMAGE_0_TEST1_02.csv',sep=',')"
   ]
  },
  {
@@ -101,13 +101,16 @@
   "source": [
    "# Combined Plot for sensor 1 and sensor 2 from data1 file in which motor is operated at 800 rpm\n",
    "\n",
-    "plt.plot(df1['s2'], label='sensor 2')\n",
-    "plt.plot(df1['s1'], label='sensor 1', alpha=0.5)\n",
+    "plt.plot(df1['s2'], label='Sensor 2', color='C1', alpha=0.6)\n",
+    "plt.plot(df1['s1'], label='Sensor 1', color='C0', alpha=0.6)\n",
    "plt.xlabel(\"Number of samples\")\n",
    "plt.ylabel(\"Amplitude\")\n",
    "plt.title(\"Raw vibration signal\")\n",
    "plt.ylim(-7.5, 5)\n",
    "plt.legend()\n",
+    "plt.locator_params(axis='x', nbins=8)\n",
+    "plt.ylim(-1, 1)  # Adjust range as needed\n",
+    "plt.grid(True, linestyle='--', alpha=0.5)\n",
    "plt.show()"
   ]
  },
@@ -155,7 +158,7 @@
    "import pandas as pd\n",
    "import numpy as np\n",
    "from scipy.signal import stft, hann\n",
-    "from multiprocessing import Pool\n",
+    "# from multiprocessing import Pool\n",
    "\n",
    "# Function to compute and append STFT data\n",
    "def process_stft(args):\n",
@@ -321,9 +324,9 @@
   "source": [
    "import pandas as pd\n",
    "import matplotlib.pyplot as plt\n",
-    "ready_data1 = []\n",
+    "ready_data1a = []\n",
    "for file in os.listdir('D:/thesis/data/converted/raw/sensor1'):\n",
-    "    ready_data1.append(pd.read_csv(os.path.join('D:/thesis/data/converted/raw/sensor1', file)))\n",
+    "    ready_data1a.append(pd.read_csv(os.path.join('D:/thesis/data/converted/raw/sensor1', file)))\n",
    "# colormesh give title x is frequency and y is time and rotate/transpose the data\n",
    "# Plotting the STFT Data"
   ]
@@ -334,8 +337,44 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "ready_data1[0]\n",
-    "plt.pcolormesh(ready_data1[0])"
+    "import numpy as np\n",
+    "import matplotlib.pyplot as plt\n",
+    "from mpl_toolkits.mplot3d import Axes3D\n",
+    "\n",
+    "# Assuming ready_data1a[0] is a DataFrame or 2D array\n",
+    "spectrogram_data = ready_data1a[0].values  # Convert to NumPy array if it's a DataFrame\n",
+    "\n",
+    "# Get the dimensions of the spectrogram\n",
+    "num_frequencies, num_time_frames = spectrogram_data.shape\n",
+    "\n",
+    "# Create frequency and time arrays\n",
+    "frequencies = np.arange(num_frequencies)  # Replace with actual frequency values if available\n",
+    "time_frames = np.arange(num_time_frames)  # Replace with actual time values if available\n",
+    "\n",
+    "# Create a meshgrid for plotting\n",
+    "T, F = np.meshgrid(time_frames, frequencies)\n",
+    "\n",
+    "# Create a 3D plot\n",
+    "fig = plt.figure(figsize=(12, 8))\n",
+    "ax = fig.add_subplot(111, projection='3d')\n",
+    "\n",
+    "# Plot the surface\n",
+    "surf = ax.plot_surface(T, F, spectrogram_data, cmap='bwr', edgecolor='none')\n",
+    "\n",
+    "# Add labels and a color bar\n",
+    "ax.set_xlabel('Time Frames')\n",
+    "ax.set_ylabel('Frequency [Hz]')\n",
+    "ax.set_zlabel('Magnitude')\n",
+    "ax.set_title('3D Spectrogram')\n",
+    "# Resize the z-axis (shrink it)\n",
+    "z_min, z_max = 0, 0.1  # Replace with your desired range\n",
+    "ax.set_zlim(z_min, z_max)\n",
+    "ax.get_proj = lambda: np.dot(Axes3D.get_proj(ax), np.diag([1, 1, 0.5, 1]))  # Shrink z-axis by 50%\n",
+    "ax.set_facecolor('white')\n",
+    "fig.colorbar(surf, ax=ax, shrink=0.5, aspect=10)\n",
+    "\n",
+    "# Show the plot\n",
+    "plt.show()"
   ]
  },
  {
@@ -344,12 +383,32 @@
   "metadata": {},
   "outputs": [],
   "source": [
+    "from cmcrameri import cm\n",
+    "# Create a figure and subplots\n",
+    "fig, axes = plt.subplots(2, 3, figsize=(15, 8), sharex=True, sharey=True)\n",
+    "\n",
+    "# Flatten the axes array for easier iteration\n",
+    "axes = axes.flatten()\n",
+    "\n",
+    "# Loop through each subplot and plot the data\n",
    "for i in range(6):\n",
-    "    plt.pcolormesh(ready_data1[i])\n",
-    "    plt.title(f'STFT Magnitude for case {i} sensor 1')\n",
-    "    plt.xlabel(f'Frequency [Hz]')\n",
-    "    plt.ylabel(f'Time [sec]')\n",
-    "    plt.show()"
+    "    pcm = axes[i].pcolormesh(ready_data1a[i].transpose(), cmap='bwr', vmax=0.03, vmin=0.0)\n",
+    "    axes[i].set_title(f'Case {i} Sensor A', fontsize=12)\n",
+    "\n",
+    "# Add a single color bar for all subplots\n",
+    "# Use the first `pcolormesh` object (or any valid one) for the color bar\n",
+    "cbar = fig.colorbar(pcm, ax=axes, orientation='vertical')\n",
+    "# cbar.set_label('Magnitude')\n",
+    "\n",
+    "# Set shared labels\n",
+    "fig.text(0.5, 0.04, 'Time Frames', ha='center', fontsize=12)\n",
+    "fig.text(0.04, 0.5, 'Frequency [Hz]', va='center', rotation='vertical', fontsize=12)\n",
+    "\n",
+    "# Adjust layout\n",
+    "# plt.tight_layout(rect=[0.05, 0.05, 1, 1])  # Leave space for shared labels\n",
+    "plt.subplots_adjust(left=0.1, right=0.75, top=0.9, bottom=0.1, wspace=0.2, hspace=0.2)\n",
+    "\n",
+    "plt.show()"
   ]
  },
  {
@@ -358,9 +417,9 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "ready_data2 = []\n",
+    "ready_data2a = []\n",
    "for file in os.listdir('D:/thesis/data/converted/raw/sensor2'):\n",
-    "    ready_data2.append(pd.read_csv(os.path.join('D:/thesis/data/converted/raw/sensor2', file)))"
+    "    ready_data2a.append(pd.read_csv(os.path.join('D:/thesis/data/converted/raw/sensor2', file)))"
   ]
  },
  {
@@ -369,8 +428,8 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "print(len(ready_data1))\n",
-    "print(len(ready_data2))"
+    "print(len(ready_data1a))\n",
+    "print(len(ready_data2a))"
   ]
  },
  {
@@ -379,10 +438,16 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "x1 = 0\n",
-    "print(type(ready_data1[0]))\n",
-    "ready_data1[0].iloc[:,0]\n",
-    "# x1 = x1 + ready_data1[0].shape[0]"
+    "x1a = 0\n",
+    "print(type(ready_data1a[0]))\n",
+    "ready_data1a[0].iloc[:,0]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Checking length of the total array"
   ]
  },
  {
@@ -391,16 +456,14 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "x1 = 0\n",
-    "print(type(x1))\n",
-    "for i in range(len(ready_data1)):\n",
-    "    # print(ready_data1[i].shape)\n",
-    "    # print(ready_data1[i].)\n",
-    "    print(type(ready_data1[i].shape[0]))\n",
-    "    x1 = x1 + ready_data1[i].shape[0]\n",
-    "    print(type(x1))\n",
+    "x1a = 0\n",
+    "print(type(x1a))\n",
+    "for i in range(len(ready_data1a)):\n",
+    "    print(type(ready_data1a[i].shape[0]))\n",
+    "    x1a = x1a + ready_data1a[i].shape[0]\n",
+    "    print(type(x1a))\n",
    "\n",
-    "print(x1)"
+    "print(x1a)"
   ]
  },
  {
@@ -409,13 +472,20 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "x2 = 0\n",
+    "x2a = 0\n",
    "\n",
-    "for i in range(len(ready_data2)):\n",
-    "    print(ready_data2[i].shape)\n",
-    "    x2 = x2 + ready_data2[i].shape[0]\n",
+    "for i in range(len(ready_data2a)):\n",
+    "    print(ready_data2a[i].shape)\n",
+    "    x2a = x2a + ready_data2a[i].shape[0]\n",
    "\n",
-    "print(x2)"
+    "print(x2a)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Flatten the 6 arrays into one array"
   ]
  },
  {
@@ -424,28 +494,22 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "x1 = ready_data1[0]\n",
-    "# print(x1)\n",
-    "print(type(x1))\n",
-    "for i in range(len(ready_data1) - 1):\n",
-    "    #print(i)\n",
-    "    x1 = np.concatenate((x1, ready_data1[i + 1]), axis=0)\n",
-    "# print(x1)\n",
-    "pd.DataFrame(x1)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "x2 = ready_data2[0]\n",
+    "# Combine all dataframes in ready_data1a into a single dataframe\n",
+    "if ready_data1a:  # Check if the list is not empty\n",
+    "    # Use pandas concat function instead of iterative concatenation\n",
+    "    combined_data = pd.concat(ready_data1a, axis=0, ignore_index=True)\n",
+    "    \n",
+    "    print(f\"Type of combined data: {type(combined_data)}\")\n",
+    "    print(f\"Shape of combined data: {combined_data.shape}\")\n",
+    "    \n",
+    "    # Display the combined dataframe\n",
+    "    combined_data\n",
+    "else:\n",
+    "    print(\"No data available in ready_data1a list\")\n",
+    "    combined_data = pd.DataFrame()\n",
    "\n",
-    "for i in range(len(ready_data2) - 1):\n",
-    "    #print(i)\n",
-    "    x2 = np.concatenate((x2, ready_data2[i + 1]), axis=0)\n",
-    "pd.DataFrame(x2)"
+    "# Store the result in x1a for compatibility with subsequent code\n",
+    "x1a = combined_data"
   ]
  },
  {
@@ -454,20 +518,29 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "print(x1.shape)\n",
-    "print(x2.shape)"
+    "# Combine all dataframes in ready_data2a into a single dataframe\n",
+    "if ready_data2a:  # Check if the list is not empty\n",
+    "    # Use pandas concat function instead of iterative concatenation\n",
+    "    combined_data = pd.concat(ready_data2a, axis=0, ignore_index=True)\n",
+    "    \n",
+    "    print(f\"Type of combined data: {type(combined_data)}\")\n",
+    "    print(f\"Shape of combined data: {combined_data.shape}\")\n",
+    "    \n",
+    "    # Display the combined dataframe\n",
+    "    combined_data\n",
+    "else:\n",
+    "    print(\"No data available in ready_data2a list\")\n",
+    "    combined_data = pd.DataFrame()\n",
+    "\n",
+    "# Store the result in x2a for compatibility with subsequent code\n",
+    "x2a = combined_data"
   ]
  },
  {
-   "cell_type": "code",
-   "execution_count": null,
+   "cell_type": "markdown",
   "metadata": {},
-   "outputs": [],
   "source": [
-    "y_1 = [1,1,1,1]\n",
-    "y_2 = [0,1,1,1]\n",
-    "y_3 = [1,0,1,1]\n",
-    "y_4 = [1,1,0,0]"
+    "### Creating the label"
   ]
  },
  {
@@ -490,39 +563,41 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "y_data = [y_1, y_2, y_3, y_4, y_5, y_6]"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "for i in range(len(y_data)):\n",
-    "    print(ready_data1[i].shape[0])"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "for i in range(len(y_data)):\n",
-    "    y_data[i] = [y_data[i]]*ready_data1[i].shape[0]\n",
-    "    y_data[i] = np.array(y_data[i])"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
+    "y_data = [y_1, y_2, y_3, y_4, y_5, y_6]\n",
    "y_data"
   ]
  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "for i in range(len(y_data)):\n",
+    "    print(ready_data1a[i].shape[0])"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import numpy as np\n",
+    "for i in range(len(y_data)):\n",
+    "    y_data[i] = [y_data[i]]*ready_data1a[i].shape[0]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "len(y_data[0])\n",
+    "# y_data"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": null,
@@ -552,10 +627,20 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "from sklearn.model_selection import train_test_split\n",
+    "from src.ml.model_selection import create_ready_data\n",
    "\n",
-    "x_train1, x_test1, y_train, y_test = train_test_split(x1, y, test_size=0.2, random_state=2)\n",
-    "x_train2, x_test2, y_train, y_test = train_test_split(x2, y, test_size=0.2, random_state=2)"
+    "X1a, y = create_ready_data('D:/thesis/data/converted/raw/sensor1')\n",
+    "X2a, y = create_ready_data('D:/thesis/data/converted/raw/sensor2')"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "X1a.iloc[-1,:]\n",
+    "# y[2565]"
   ]
  },
  {
@@ -565,6 +650,17 @@
   "outputs": [],
   "source": [
    "from sklearn.model_selection import train_test_split\n",
+    "\n",
+    "x_train1, x_test1, y_train, y_test = train_test_split(X1a, y, test_size=0.2, random_state=2)\n",
+    "x_train2, x_test2, y_train, y_test = train_test_split(X2a, y, test_size=0.2, random_state=2)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
    "from sklearn.metrics import accuracy_score\n",
    "from sklearn.ensemble import RandomForestClassifier, BaggingClassifier\n",
    "from sklearn.tree import DecisionTreeClassifier\n",
@@ -592,130 +688,24 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "accuracies1 = []\n",
-    "accuracies2 = []\n",
+    "from src.ml.model_selection import train_and_evaluate_model\n",
+    "from sklearn.svm import SVC\n",
+    "# Define models for sensor1\n",
+    "models_sensor1 = {\n",
+    "    # \"Random Forest\": RandomForestClassifier(),\n",
+    "    # \"Bagged Trees\": BaggingClassifier(estimator=DecisionTreeClassifier(), n_estimators=10),\n",
+    "    # \"Decision Tree\": DecisionTreeClassifier(),\n",
+    "    # \"KNN\": KNeighborsClassifier(),\n",
+    "    # \"LDA\": LinearDiscriminantAnalysis(),\n",
+    "    \"SVM\": SVC(),\n",
+    "    # \"XGBoost\": XGBClassifier()\n",
+    "}\n",
    "\n",
-    "\n",
-    "# 1. Random Forest\n",
-    "rf_model = RandomForestClassifier()\n",
-    "rf_model.fit(x_train1, y_train)\n",
-    "rf_pred1 = rf_model.predict(x_test1)\n",
-    "acc1 = accuracy_score(y_test, rf_pred1) * 100\n",
-    "accuracies1.append(acc1)\n",
-    "# format with color coded if acc1 > 90\n",
-    "acc1 = f\"\\033[92m{acc1:.2f}\\033[00m\" if acc1 > 90 else f\"{acc1:.2f}\"\n",
-    "print(\"Random Forest Accuracy for sensor 1:\", acc1)\n",
-    "rf_model.fit(x_train2, y_train)\n",
-    "rf_pred2 = rf_model.predict(x_test2)\n",
-    "acc2 = accuracy_score(y_test, rf_pred2) * 100\n",
-    "accuracies2.append(acc2)\n",
-    "# format with color coded if acc2 > 90\n",
-    "acc2 = f\"\\033[92m{acc2:.2f}\\033[00m\" if acc2 > 90 else f\"{acc2:.2f}\"\n",
-    "print(\"Random Forest Accuracy for sensor 2:\", acc2)\n",
-    "# print(rf_pred)\n",
-    "# print(y_test)\n",
-    "\n",
-    "# 2. Bagged Trees\n",
-    "bagged_model = BaggingClassifier(estimator=DecisionTreeClassifier(), n_estimators=10)\n",
-    "bagged_model.fit(x_train1, y_train)\n",
-    "bagged_pred1 = bagged_model.predict(x_test1)\n",
-    "acc1 = accuracy_score(y_test, bagged_pred1) * 100\n",
-    "accuracies1.append(acc1)\n",
-    "# format with color coded if acc1 > 90\n",
-    "acc1 = f\"\\033[92m{acc1:.2f}\\033[00m\" if acc1 > 90 else f\"{acc1:.2f}\"\n",
-    "print(\"Bagged Trees Accuracy for sensor 1:\", acc1)\n",
-    "bagged_model.fit(x_train2, y_train)\n",
-    "bagged_pred2 = bagged_model.predict(x_test2)\n",
-    "acc2 = accuracy_score(y_test, bagged_pred2) * 100\n",
-    "accuracies2.append(acc2)\n",
-    "# format with color coded if acc2 > 90\n",
-    "acc2 = f\"\\033[92m{acc2:.2f}\\033[00m\" if acc2 > 90 else f\"{acc2:.2f}\"\n",
-    "print(\"Bagged Trees Accuracy for sensor 2:\", acc2)\n",
-    "\n",
-    "# 3. Decision Tree\n",
-    "dt_model = DecisionTreeClassifier()\n",
-    "dt_model.fit(x_train1, y_train)\n",
-    "dt_pred1 = dt_model.predict(x_test1)\n",
-    "acc1 = accuracy_score(y_test, dt_pred1) * 100\n",
-    "accuracies1.append(acc1)\n",
-    "# format with color coded if acc1 > 90\n",
-    "acc1 = f\"\\033[92m{acc1:.2f}\\033[00m\" if acc1 > 90 else f\"{acc1:.2f}\"\n",
-    "print(\"Decision Tree Accuracy for sensor 1:\", acc1)\n",
-    "dt_model.fit(x_train2, y_train)\n",
-    "dt_pred2 = dt_model.predict(x_test2)\n",
-    "acc2 = accuracy_score(y_test, dt_pred2) * 100\n",
-    "accuracies2.append(acc2)\n",
-    "# format with color coded if acc2 > 90\n",
-    "acc2 = f\"\\033[92m{acc2:.2f}\\033[00m\" if acc2 > 90 else f\"{acc2:.2f}\"\n",
-    "print(\"Decision Tree Accuracy for sensor 2:\", acc2)\n",
-    "\n",
-    "# 4. KNeighbors\n",
-    "knn_model = KNeighborsClassifier()\n",
-    "knn_model.fit(x_train1, y_train)\n",
-    "knn_pred1 = knn_model.predict(x_test1)\n",
-    "acc1 = accuracy_score(y_test, knn_pred1) * 100\n",
-    "accuracies1.append(acc1)\n",
-    "# format with color coded if acc1 > 90\n",
-    "acc1 = f\"\\033[92m{acc1:.2f}\\033[00m\" if acc1 > 90 else f\"{acc1:.2f}\"\n",
-    "print(\"KNeighbors Accuracy for sensor 1:\", acc1)\n",
-    "knn_model.fit(x_train2, y_train)\n",
-    "knn_pred2 = knn_model.predict(x_test2)\n",
-    "acc2 = accuracy_score(y_test, knn_pred2) * 100\n",
-    "accuracies2.append(acc2)\n",
-    "# format with color coded if acc2 > 90\n",
-    "acc2 = f\"\\033[92m{acc2:.2f}\\033[00m\" if acc2 > 90 else f\"{acc2:.2f}\"\n",
-    "print(\"KNeighbors Accuracy for sensor 2:\", acc2)\n",
-    "\n",
-    "# 5. Linear Discriminant Analysis\n",
-    "lda_model = LinearDiscriminantAnalysis()\n",
-    "lda_model.fit(x_train1, y_train)\n",
-    "lda_pred1 = lda_model.predict(x_test1)\n",
-    "acc1 = accuracy_score(y_test, lda_pred1) * 100\n",
-    "accuracies1.append(acc1)\n",
-    "# format with color coded if acc1 > 90\n",
-    "acc1 = f\"\\033[92m{acc1:.2f}\\033[00m\" if acc1 > 90 else f\"{acc1:.2f}\"\n",
-    "print(\"Linear Discriminant Analysis Accuracy for sensor 1:\", acc1)\n",
-    "lda_model.fit(x_train2, y_train)\n",
-    "lda_pred2 = lda_model.predict(x_test2)\n",
-    "acc2 = accuracy_score(y_test, lda_pred2) * 100\n",
-    "accuracies2.append(acc2)\n",
-    "# format with color coded if acc2 > 90\n",
-    "acc2 = f\"\\033[92m{acc2:.2f}\\033[00m\" if acc2 > 90 else f\"{acc2:.2f}\"\n",
-    "print(\"Linear Discriminant Analysis Accuracy for sensor 2:\", acc2)\n",
-    "\n",
-    "# 6. Support Vector Machine\n",
-    "svm_model = SVC()\n",
-    "svm_model.fit(x_train1, y_train)\n",
-    "svm_pred1 = svm_model.predict(x_test1)\n",
-    "acc1 = accuracy_score(y_test, svm_pred1) * 100\n",
-    "accuracies1.append(acc1)\n",
-    "# format with color coded if acc1 > 90\n",
-    "acc1 = f\"\\033[92m{acc1:.2f}\\033[00m\" if acc1 > 90 else f\"{acc1:.2f}\"\n",
-    "print(\"Support Vector Machine Accuracy for sensor 1:\", acc1)\n",
-    "svm_model.fit(x_train2, y_train)\n",
-    "svm_pred2 = svm_model.predict(x_test2)\n",
-    "acc2 = accuracy_score(y_test, svm_pred2) * 100\n",
-    "accuracies2.append(acc2)\n",
-    "# format with color coded if acc2 > 90\n",
-    "acc2 = f\"\\033[92m{acc2:.2f}\\033[00m\" if acc2 > 90 else f\"{acc2:.2f}\"\n",
-    "print(\"Support Vector Machine Accuracy for sensor 2:\", acc2)\n",
-    "\n",
-    "# 7. XGBoost\n",
-    "xgboost_model = XGBClassifier()\n",
-    "xgboost_model.fit(x_train1, y_train)\n",
-    "xgboost_pred1 = xgboost_model.predict(x_test1)\n",
-    "acc1 = accuracy_score(y_test, xgboost_pred1) * 100\n",
-    "accuracies1.append(acc1)\n",
-    "# format with color coded if acc1 > 90\n",
-    "acc1 = f\"\\033[92m{acc1:.2f}\\033[00m\" if acc1 > 90 else f\"{acc1:.2f}\"\n",
-    "print(\"XGBoost Accuracy:\", acc1)\n",
-    "xgboost_model.fit(x_train2, y_train)\n",
-    "xgboost_pred2 = xgboost_model.predict(x_test2)\n",
-    "acc2 = accuracy_score(y_test, xgboost_pred2) * 100\n",
-    "accuracies2.append(acc2)\n",
-    "# format with color coded if acc2 > 90\n",
-    "acc2 = f\"\\033[92m{acc2:.2f}\\033[00m\" if acc2 > 90 else f\"{acc2:.2f}\"\n",
-    "print(\"XGBoost Accuracy:\", acc2)"
+    "results_sensor1 = []\n",
+    "for name, model in models_sensor1.items():\n",
+    "    res = train_and_evaluate_model(model, name, \"sensor1\", x_train1, y_train, x_test1, y_test, export='D:/thesis/models/sensor1')\n",
+    "    results_sensor1.append(res)\n",
+    "    print(f\"{name} on sensor1: Accuracy = {res['accuracy']:.2f}%\")\n"
   ]
  },
  {
@@ -724,8 +714,35 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "print(accuracies1)\n",
-    "print(accuracies2)"
+    "models_sensor2 = {\n",
+    "    # \"Random Forest\": RandomForestClassifier(),\n",
+    "    # \"Bagged Trees\": BaggingClassifier(estimator=DecisionTreeClassifier(), n_estimators=10),\n",
+    "    # \"Decision Tree\": DecisionTreeClassifier(),\n",
+    "    # \"KNN\": KNeighborsClassifier(),\n",
+    "    # \"LDA\": LinearDiscriminantAnalysis(),\n",
+    "    \"SVM\": SVC(),\n",
+    "    # \"XGBoost\": XGBClassifier()\n",
+    "}\n",
+    "\n",
+    "results_sensor2 = []\n",
+    "for name, model in models_sensor2.items():\n",
+    "    res = train_and_evaluate_model(model, name, \"sensor2\", x_train2, y_train, x_test2, y_test, export='D:/thesis/models/sensor2')\n",
+    "    results_sensor2.append(res)\n",
+    "    print(f\"{name} on sensor2: Accuracy = {res['accuracy']:.2f}%\")\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "all_results = {\n",
+    "    \"sensor1\": results_sensor1,\n",
+    "    \"sensor2\": results_sensor2\n",
+    "}\n",
+    "\n",
+    "print(all_results)"
   ]
  },
  {
@@ -737,36 +754,48 @@
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
-    "models = [rf_model, bagged_model, dt_model, knn_model, lda_model, svm_model, xgboost_model]\n",
-    "model_names = [\"Random Forest\", \"Bagged Trees\", \"Decision Tree\", \"KNN\", \"LDA\", \"SVM\", \"XGBoost\"]\n",
+    "def prepare_plot_data(results_dict):\n",
+    "    # Gather unique model names\n",
+    "    models_set = {entry['model'] for sensor in results_dict.values() for entry in sensor}\n",
+    "    models = sorted(list(models_set))\n",
+    "    \n",
+    "    # Create dictionaries mapping sensor -> accuracy list ordered by model name\n",
+    "    sensor_accuracies = {}\n",
+    "    for sensor, entries in results_dict.items():\n",
+    "        # Build a mapping: model -> accuracy for the given sensor\n",
+    "        mapping = {entry['model']: entry['accuracy'] for entry in entries}\n",
+    "        # Order the accuracies consistent with the sorted model names\n",
+    "        sensor_accuracies[sensor] = [mapping.get(model, 0) for model in models]\n",
+    "    \n",
+    "    return models, sensor_accuracies\n",
    "\n",
-    "bar_width = 0.35  # Width of each bar\n",
-    "index = np.arange(len(model_names))  # Index for the bars\n",
+    "def plot_accuracies(models, sensor_accuracies):\n",
+    "    bar_width = 0.35\n",
+    "    x = np.arange(len(models))\n",
+    "    sensors = list(sensor_accuracies.keys())\n",
+    "    \n",
+    "    plt.figure(figsize=(10, 6))\n",
+    "    # Assume two sensors for plotting grouped bars\n",
+    "    plt.bar(x - bar_width/2, sensor_accuracies[sensors[0]], width=bar_width, color='blue', label=sensors[0])\n",
+    "    plt.bar(x + bar_width/2, sensor_accuracies[sensors[1]], width=bar_width, color='orange', label=sensors[1])\n",
+    "    \n",
+    "    # Add text labels on top of bars\n",
+    "    for i, (a1, a2) in enumerate(zip(sensor_accuracies[sensors[0]], sensor_accuracies[sensors[1]])):\n",
+    "        plt.text(x[i] - bar_width/2, a1 + 0.1, f\"{a1:.2f}%\", ha='center', va='bottom', color='black')\n",
+    "        plt.text(x[i] + bar_width/2, a2 + 0.1, f\"{a2:.2f}%\", ha='center', va='bottom', color='black')\n",
+    "    \n",
+    "    plt.xlabel('Model Name')\n",
+    "    plt.ylabel('Accuracy (%)')\n",
+    "    plt.title('Accuracy of Classifiers for Each Sensor')\n",
+    "    plt.xticks(x, models)\n",
+    "    plt.legend()\n",
+    "    plt.ylim(0, 105)\n",
+    "    plt.tight_layout()\n",
+    "    plt.show()\n",
    "\n",
-    "# Plotting the bar graph\n",
-    "plt.figure(figsize=(14, 8))\n",
-    "\n",
-    "# Bar plot for Sensor 1\n",
-    "plt.bar(index, accuracies1, width=bar_width, color='blue', label='Sensor 1')\n",
-    "\n",
-    "# Bar plot for Sensor 2\n",
-    "plt.bar(index + bar_width, accuracies2, width=bar_width, color='orange', label='Sensor 2')\n",
-    "\n",
-    "# Add values on top of each bar\n",
-    "for i, acc1, acc2 in zip(index, accuracies1, accuracies2):\n",
-    "    plt.text(i, acc1 + .1, f'{acc1:.2f}%', ha='center', va='bottom', color='black')\n",
-    "    plt.text(i + bar_width, acc2 + 1, f'{acc2:.2f}%', ha='center', va='bottom', color='black')\n",
-    "\n",
-    "# Customize the plot\n",
-    "plt.xlabel('Model Name →')\n",
-    "plt.ylabel('Accuracy →')\n",
-    "plt.title('Accuracy of classifiers for Sensors 1 and 2 with 513 features')\n",
-    "plt.xticks(index + bar_width / 2, model_names)  # Set x-tick positions\n",
-    "plt.legend()\n",
-    "plt.ylim(0, 100)\n",
-    "\n",
-    "# Show the plot\n",
-    "plt.show()\n"
+    "# Use the functions\n",
+    "models, sensor_accuracies = prepare_plot_data(all_results)\n",
+    "plot_accuracies(models, sensor_accuracies)\n"
   ]
  },
  {
@@ -787,51 +816,10 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "def spectograph(data_dir: str):\n",
-    "    # print(os.listdir(data_dir))\n",
-    "    for damage in os.listdir(data_dir):\n",
-    "        # print(damage)\n",
-    "        d = os.path.join(data_dir, damage)\n",
-    "        # print(d)\n",
-    "        for file in os.listdir(d):\n",
-    "            # print(file)\n",
-    "            f = os.path.join(d, file)\n",
-    "            print(f)\n",
-    "            # sensor1 = pd.read_csv(f, skiprows=1, sep=';')\n",
-    "            # sensor2 = pd.read_csv(f, skiprows=1, sep=';')\n",
+    "from src.ml.model_selection import create_ready_data\n",
    "\n",
-    "            # df1 = pd.DataFrame()\n",
-    "\n",
-    "            # df1['s1'] = sensor1[sensor1.columns[-1]]\n",
-    "            # df1['s2'] = sensor2[sensor2.columns[-1]]\n",
-    "ed\n",
-    "            # # Combined Plot for sensor 1 and sensor 2 from data1 file in which motor is operated at 800 rpm\n",
-    "\n",
-    "            # plt.plot(df1['s2'], label='sensor 2')\n",
-    "            # plt.plot(df1['s1'], label='sensor 1')\n",
-    "            # plt.xlabel(\"Number of samples\")\n",
-    "            # plt.ylabel(\"Amplitude\")\n",
-    "            # plt.title(\"Raw vibration signal\")\n",
-    "            # plt.legend()\n",
-    "            # plt.show()\n",
-    "\n",
-    "            # from scipy import signal\n",
-    "            # from scipy.signal.windows import hann\n",
-    "\n",
-    "            # vibration_data = df1['s1']\n",
-    "\n",
-    "            # # Applying STFT\n",
-    "            # window_size = 1024\n",
-    "            # hop_size = 512\n",
-    "            # window = hann(window_size)  # Creating a Hanning window\n",
-    "            # frequencies, times, Zxx = signal.stft(vibration_data, window=window, nperseg=window_size, noverlap=window_size - hop_size)\n",
-    "\n",
-    "            # # Plotting the STFT Data\n",
-    "            # plt.pcolormesh(times, frequencies, np.abs(Zxx), shading='gouraud')\n",
-    "            # plt.title(f'STFT Magnitude for case 1 signal sensor 1 ')\n",
-    "            # plt.ylabel('Frequency [Hz]')\n",
-    "            # plt.xlabel('Time [sec]')\n",
-    "            # plt.show()"
+    "X1b, y = create_ready_data('D:/thesis/data/converted/raw_B/sensor1')\n",
+    "X2b, y = create_ready_data('D:/thesis/data/converted/raw_B/sensor2')"
   ]
  },
  {
@@ -840,7 +828,141 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "spectograph('D:/thesis/data/converted/raw')"
+    "y.shape"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from sklearn.metrics import accuracy_score, classification_report\n",
+    "# 4. Validate on Dataset B\n",
+    "from joblib import load\n",
+    "svm_model = load('D:/thesis/models/sensor1/SVM.joblib')\n",
+    "y_pred_svm = svm_model.predict(X1b)\n",
+    "\n",
+    "# 5. Evaluate\n",
+    "print(\"Accuracy on Dataset B:\", accuracy_score(y, y_pred_svm))\n",
+    "print(classification_report(y, y_pred_svm))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Model sensor 1 to predict sensor 2 data"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from sklearn.metrics import accuracy_score, classification_report\n",
+    "# 4. Validate on Dataset B\n",
+    "from joblib import load\n",
+    "svm_model = load('D:/thesis/models/sensor1/SVM.joblib')\n",
+    "y_pred_svm = svm_model.predict(X2b)\n",
+    "\n",
+    "# 5. Evaluate\n",
+    "print(\"Accuracy on Dataset B:\", accuracy_score(y, y_pred_svm))\n",
+    "print(classification_report(y, y_pred_svm))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from sklearn.metrics import accuracy_score, classification_report\n",
+    "# 4. Validate on Dataset B\n",
+    "y_pred = rf_model2.predict(X2b)\n",
+    "\n",
+    "# 5. Evaluate\n",
+    "print(\"Accuracy on Dataset B:\", accuracy_score(y, y_pred))\n",
+    "print(classification_report(y, y_pred))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "y_predict = svm_model2.predict(X2b.iloc[[5312],:])\n",
+    "print(y_predict)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "y[5312]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Confusion Matrix"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import matplotlib.pyplot as plt\n",
+    "from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay\n",
+    "\n",
+    "\n",
+    "cm = confusion_matrix(y, y_pred_svm) # -> ndarray\n",
+    "\n",
+    "# get the class labels\n",
+    "labels = svm_model.classes_\n",
+    "\n",
+    "# Plot\n",
+    "disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=labels)\n",
+    "disp.plot(cmap=plt.cm.Blues)  # You can change colormap\n",
+    "plt.title(\"SVM Sensor1 CM Train w/ Dataset A Val w/ Dataset B from Sensor2 readings\")\n",
+    "plt.show()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Self-test CM"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# 1. Predict sensor 1 on Dataset A\n",
+    "y_test_pred = svm_model.predict(x_test1)\n",
+    "\n",
+    "# 2. Import confusion matrix tools\n",
+    "from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay\n",
+    "import matplotlib.pyplot as plt\n",
+    "\n",
+    "# 3. Create and plot confusion matrix\n",
+    "cm_train = confusion_matrix(y_test, y_test_pred)\n",
+    "labels = svm_model.classes_\n",
+    "\n",
+    "disp = ConfusionMatrixDisplay(confusion_matrix=cm_train, display_labels=labels)\n",
+    "disp.plot(cmap=plt.cm.Blues)\n",
+    "plt.title(\"Confusion Matrix: Train & Test on Dataset A\")\n",
+    "plt.show()\n"
   ]
  }
 ],
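For readers without scikit-learn at hand, the confusion-matrix cells above can be mirrored with a small standard-library sketch. This is an illustration only, not the notebook's code: `confusion_counts` and `accuracy` are hypothetical helpers, and the layout (rows are true labels, columns are predicted labels, labels in sorted order) is assumed to follow the default convention of `sklearn.metrics.confusion_matrix`.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Return (labels, matrix) where matrix[i][j] counts the samples whose
    true label is labels[i] and whose predicted label is labels[j]."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Use every label seen in either sequence, in sorted order.
    labels = sorted(set(y_true) | set(y_pred))
    pair_counts = Counter(zip(y_true, y_pred))
    matrix = [[pair_counts[(t, p)] for p in labels] for t in labels]
    return labels, matrix


def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


if __name__ == "__main__":
    # Made-up labels for illustration; these are not results from the thesis data.
    y_true = [0, 0, 1, 1, 2, 2]
    y_pred = [0, 1, 1, 1, 2, 0]
    labels, cm = confusion_counts(y_true, y_pred)
    for label, row in zip(labels, cm):
        print(label, row)
    print("accuracy:", accuracy(y_true, y_pred))
```

The diagonal of the matrix holds the correctly classified samples, so the accuracy printed by the notebook cells equals the diagonal sum divided by the total number of samples.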
--- a/code/src/ml/__init__.py
+++ b/code/src/ml/__init__.py
--- a/code/src/ml/model_selection.py
+++ b/code/src/ml/model_selection.py
@@ -0,0 +1,155 @@
+import numpy as np
+import pandas as pd
+import os
+from sklearn.model_selection import train_test_split as sklearn_split
+
+
+def create_ready_data(
+    stft_data_path: str,
+    stratify: np.ndarray = None,
+) -> tuple:
+    """
+    Load all STFT data files in a directory and build features with per-file labels.
+
+    Parameters:
+    -----------
+    stft_data_path : str
+        Path to the directory containing STFT data files (e.g. 'data/converted/raw/sensor1')
+    stratify : np.ndarray, optional
+        Currently unused; no train-test split is performed inside this function
+
+    Returns:
+    --------
+    tuple
+        (X, y) - Combined feature DataFrame and integer labels (one label value per file)
+    """
+    ready_data = []
+    for file in sorted(os.listdir(stft_data_path)):  # sort so file-to-label mapping is deterministic
+        ready_data.append(pd.read_csv(os.path.join(stft_data_path, file)))
+
+    y_data = [i for i in range(len(ready_data))]
+
+    # Combine all dataframes in ready_data into a single dataframe
+    if ready_data:  # Check if the list is not empty
+        # Use pandas concat function instead of iterative concatenation
+        combined_data = pd.concat(ready_data, axis=0, ignore_index=True)
+
+        print(f"Type of combined data: {type(combined_data)}")
+        print(f"Shape of combined data: {combined_data.shape}")
+    else:
+        print("No data available in ready_data list")
+        combined_data = pd.DataFrame()
+
+    # Store the result in X for compatibility with subsequent code
+    X = combined_data
+
+    for i in range(len(y_data)):
+        y_data[i] = [y_data[i]] * ready_data[i].shape[0]
+        y_data[i] = np.array(y_data[i])
+
+    if y_data:
+        # Use numpy concatenate function instead of iterative concatenation
+        y = np.concatenate(y_data, axis=0)
+    else:
+        print("No labels available in y_data list")
+        y = np.array([])
+
+    return X, y
+
+
+def train_and_evaluate_model(
+    model, model_name, sensor_label, x_train, y_train, x_test, y_test, export=None
+):
+    """
+    Train a machine learning model, evaluate its performance, and optionally export it.
+
+    This function trains the provided model on the training data, evaluates its
+    performance on test data using accuracy score, and can save the trained model
+    to disk if an export path is provided.
+
+    Parameters
+    ----------
+    model : estimator object
+        The machine learning model to train.
+    model_name : str
+        Name of the model, used for the export filename and in the returned results.
+    sensor_label : str
+        Label identifying which sensor's data the model is being trained on.
+    x_train : array-like or pandas.DataFrame
+        The training input samples.
+    y_train : array-like
+        The target values for training.
+    x_test : array-like or pandas.DataFrame
+        The test input samples.
+    y_test : array-like
+        The target values for testing.
+    export : str, optional
+        Directory path where the trained model should be saved. If None, model won't be saved.
+
+    Returns
+    -------
+    dict
+        Dictionary containing:
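+        - 'model': model_name (str) and 'sensor': sensor_label (str)
+        - 'success': True only if training, prediction, and scoring all completed (bool)
+        - 'accuracy': accuracy percentage (float); 'error' (str) if a step failed; 'export_error' (str) if saving failed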
+
+    Example
+    -------
+    >>> from sklearn.svm import SVC
+    >>> from sklearn.model_selection import train_test_split
+    >>> X_train, X_test, y_train, y_test = train_test_split(features, labels, test_size=0.2)
+    >>> result = train_and_evaluate_model(
+    ...     SVC(),
+    ...     "SVM",
+    ...     "sensor1",
+    ...     X_train,
+    ...     y_train,
+    ...     X_test,
+    ...     y_test,
+    ...     export="models/sensor1"
+    ... )
+    >>> print(f"Model accuracy: {result['accuracy']:.2f}%")
+    """
+    from sklearn.metrics import accuracy_score
+
+    result = {"model": model_name, "sensor": sensor_label, "success": False}
+
+    try:
+        # Train the model
+        model.fit(x_train, y_train)
+
+        try:
+            y_pred = model.predict(x_test)
+        except Exception as e:
+            result["error"] = f"Prediction error: {str(e)}"
+            return result
+
+        # Calculate accuracy
+        try:
+            accuracy = accuracy_score(y_test, y_pred) * 100
+            result["accuracy"] = accuracy
+        except Exception as e:
+            result["error"] = f"Accuracy calculation error: {str(e)}"
+            return result
+
+        # Export model if requested
+        if export:
+            try:
+                import joblib
+
+                full_path = os.path.join(export, f"{model_name}.joblib")
+                os.makedirs(os.path.dirname(full_path), exist_ok=True)
+                joblib.dump(model, full_path)
+                print(f"Model saved to {full_path}")
+            except Exception as e:
+                print(f"Warning: Failed to export model to {export}: {str(e)}")
+                result["export_error"] = str(e)
+                # Continue despite export error
+
+        result["success"] = True
+        return result
+
+    except Exception as e:
+        result["error"] = f"Training error: {str(e)}"
+        return result
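Because `train_and_evaluate_model` reports failures through the returned dictionary instead of raising, callers need to check `success` before reading `accuracy`. A minimal sketch of that check, using only the keys set in the function above (`model`, `sensor`, `success`, `accuracy`, `error`, `export_error`). The helper name `summarize_results` and the sample values are hypothetical and are not part of the commit.

```python
def summarize_results(results):
    """Turn result dicts shaped like those from train_and_evaluate_model
    into one human-readable report line per model."""
    lines = []
    for result in results:
        tag = f"{result['model']} [{result['sensor']}]"
        # 'accuracy' is only guaranteed to exist when 'success' is True.
        if not result.get("success", False):
            lines.append(f"{tag}: FAILED ({result.get('error', 'unknown error')})")
            continue
        line = f"{tag}: {result['accuracy']:.2f}%"
        # An export failure does not flip 'success'; it is reported separately.
        if "export_error" in result:
            line += f" (not saved: {result['export_error']})"
        lines.append(line)
    return lines


if __name__ == "__main__":
    # Made-up result dicts for illustration; not measured values.
    sample = [
        {"model": "SVM", "sensor": "sensor1", "success": True, "accuracy": 91.5},
        {"model": "RF", "sensor": "sensor2", "success": False,
         "error": "Training error: boom"},
    ]
    for line in summarize_results(sample):
        print(line)
```

Keeping the check in one place avoids a `KeyError` on `accuracy` when a model fails to train or predict.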
--- a/latex/chapters/id/03_methodology/steps/index.tex
+++ b/latex/chapters/id/03_methodology/steps/index.tex
@@ -3,7 +3,7 @@ Alur keseluruhan penelitian ini dilakukan melalui tahapan-tahapan sebagai beriku

 \begin{figure}[H]
    \centering
-    \includegraphics[width=0.3\linewidth]{chapters/id/flow.png}
+    \includegraphics[width=0.3\linewidth]{chapters/img/flow.png}
    \caption{Diagram alir tahapan penelitian}
    \label{fig:flowchart}
 \end{figure}
--- a/latex/frontmatter/acknowledgement.tex
+++ b/latex/frontmatter/acknowledgement.tex
--- a/latex/frontmatter/glossaries.tex
+++ b/latex/frontmatter/glossaries.tex
@@ -0,0 +1,78 @@
+% % A new command that enables us to enter bi-lingual (Slovene and English) terms
+% % syntax: \addterm[options]{label}{Slovene}{Slovene first use}{English}{Slovene
+% % description}
+% \newcommand{\addterm}[6][]{
+%   \newglossaryentry{#2}{
+%     name={#3 (angl.\ #5)},
+%     first={#4 (\emph{#5})},
+%     text={#3},
+%     sort={#3},
+%     description={#6},
+%     #1 % pass additional options to \newglossaryentry
+%   }
+% }
+
+% % A new command that enables us to enter (English) acronyms with bi-lingual
+% % (Slovene and English) long versions
+% % syntax: \addacronym[options]{label}{abbreviation}{Slovene long}{Slovene first
+% % use long}{English long}{Slovene description}
+% \newcommand{\addacronym}[7][]{
+%   % Create the main glossary entry with \newacronym
+%   % \newacronym[key-val list]{label}{abbrv}{long}
+%   \newacronym[
+%     name={#4 (angl.\ #6,\ #3)},
+%     first={\emph{#5} (angl.\ \emph{#6},\ \emph{#3})},
+%     sort={#4},
+%     description={#7},
+%     #1 % pass additional options to \newglossaryentry
+%     ]
+%     {#2}{#3}{#4}
+%   % Create a cross-reference from the abbreviation to the main glossary entry by
+%   % creating an auxiliary glossary entry (note: we set the label of this entry
+%   % to '<original label>_auxiliary' to avoid clashes)
+%   \newglossaryentry{#2_auxiliary}{
+%     name={#3},
+%     sort={#3},
+%     description={\makefirstuc{#6}},
+%     see=[See:]{#2}
+%   }
+% }
+
+% % Change the text of the cross-reference links to the Slovene long version.
+% \renewcommand*{\glsseeitemformat}[1]{\emph{\acrlong{#1}}.}
+
+% Define the Indonesian term and link it to the English term
+\newglossaryentry{jaringansaraf}{
+  name=Jaringan Saraf,
+  description={The Indonesian term for \gls{nn}}
+}
+% \newglossaryentry{pemelajaranmesin}{
+%   name=Pemelajaran Mesin,
+%   description={Lihat \gls{machinelearning}}
+% }
+
+% Define the English term and link it to its acronym
+\newglossaryentry{neuralnetwork}{
+  name=Neural Network,
+  description={A computational model inspired by the human brain, see \gls{nn}}
+}
+
+% \newglossaryentry{machinelearning}{
+%   name=Machine Learning,
+%   description={A program or system that trains a model from input data. The trained model can make useful predictions from new (never-before-seen) data drawn from the same distribution as the one used to train the model.}}
+% \newglossaryentry{pemelajaranmesin}{
+%     name={pemelajaran mesin (angl.\ #5)},
+%     first={pemelajaran mesin (\emph{machine learning})},
+%     text={pemelajaran mesin},
+%     sort={ },
+%     description={#6},
+%     #1 % pass additional options to \newglossaryentry
+% }
+\longnewglossaryentry{machinelearning}{name={machine learning}}
+{A program or system that trains a model from input data. The trained model can make useful predictions from new (never-before-seen) data drawn from the same distribution as the one used to train the model.}
+\newterm[see={machinelearning}]{pemelajaranmesin}
+% \newglossaryentry{pemelajaran mesin}{}
+% \addterm{machinelearning}{pemelajaran mesin}{pemelajaran mesin}{machine learning}{A program or system that trains a model from input data. The trained model can make useful predictions from new (never-before-seen) data drawn from the same distribution as the one used to train the model.}
+\newacronym
+ [description={statistical pattern recognition technique}]
+ {svm}{SVM}{support vector machine}
--- a/latex/main.tex
+++ b/latex/main.tex
@@ -1,14 +1,18 @@
 \documentclass[draftmark]{thesis}

-% Title Information
-\setthesisinfo
-  {Prediksi Lokasi Kerusakan dengan Machine Learning}
-  {Rifqi Damar Panuluh}
-  {20210110224}
-  {PROGRAM STUDI TEKNIK SIPIL}
-  {FAKULTAS TEKNIK}
-  {UNIVERSITAS MUHAMMADIYAH YOGYAKARTA}
-  {2025}
+% Metadata
+\title{Prediksi Lokasi Kerusakan dengan Machine Learning}
+\author{Rifqi Damar Panuluh}
+\date{\today}
+\authorid{20210110224}
+\firstadvisor{Ir. Muhammad Ibnu Syamsi, Ph.D.}
+\secondadvisor{}
+\headdepartement{Puji Harsanto, S.T., M.T., Ph.D.}
+\headdepartementid{19740607201404123064}
+\faculty{Fakultas Teknik}
+\program{Program Studi Teknik Sipil}
+\university{Universitas Muhammadiyah Yogyakarta}
+\yearofsubmission{2025}

 % Input preamble
 \input{preamble/packages}
@@ -16,22 +20,19 @@
 \input{preamble/macros}

 \begin{document}
-
-\maketitle
+% \input{frontmatter/maketitle}
+% \input{frontmatter/maketitle_secondary}
 \frontmatter
-\input{frontmatter/approval}\clearpage
-\input{frontmatter/originality}\clearpage
-\input{frontmatter/acknowledgement}\clearpage
-\tableofcontents
+% \input{frontmatter/approval}\clearpage
+% \input{frontmatter/originality}\clearpage
+% \input{frontmatter/acknowledgement}\clearpage
+% \tableofcontents
 \clearpage
 \mainmatter
 \pagestyle{fancyplain}
-% Include content
-\include{content/abstract}
-\include{content/introduction}
 \include{chapters/01_introduction}
-\include{content/chapter2}
-\include{content/conclusion}
+\include{chapters/id/02_literature_review/index}
+\include{chapters/id/03_methodology/index}

 % Bibliography
 % \bibliographystyle{IEEEtran}
--- a/latex/metadata.tex
+++ b/latex/metadata.tex
@@ -1,11 +0,0 @@
-\newcommand{\studentname}{Rifqi Damar Panuluh}
-\newcommand{\studentid}{20210110224}
-\newcommand{\thesistitle}{Prediksi Lokasi Kerusakan dengan Machine Learning}
-\newcommand{\firstadvisor}{Ir. Muhammad Ibnu Syamsi, Ph.D.}
-\newcommand{\secondadvisor}{}
-\newcommand{\headdepartement}{Puji Harsanto, S.T. M.T., Ph.D.}
-\newcommand{\headdepartementid}{19740607201404123064}
-\newcommand{\faculty}{Fakultas Teknik}
-\newcommand{\program}{Teknik Sipil}
-\newcommand{\university}{Universitas Muhammadiyah Yogyakarta}
-\newcommand{\yearofsubmission}{2025}
--- a/latex/thesis.cls
+++ b/latex/thesis.cls
@@ -1,7 +1,7 @@
 \NeedsTeXFormat{LaTeX2e}
 \ProvidesClass{thesis}[2025/05/10 Bachelor Thesis Class]

-\newif\if@draftmark
+\newif\if@draftmark \@draftmarkfalse
 \@draftmarkfalse

 \DeclareOption{draftmark}{\@draftmarktrue}
@@ -12,6 +12,7 @@
 \RequirePackage{polyglossia}
 \RequirePackage{fontspec}
 \RequirePackage{titlesec}
+\RequirePackage{titling}
 \RequirePackage{fancyhdr}
 \RequirePackage{geometry}
 \RequirePackage{setspace}
@@ -24,30 +25,31 @@
 \RequirePackage{svg}           % Allows including SVG images directly
 \RequirePackage{indentfirst}   % Makes first paragraph after headings indented
 \RequirePackage{float}         % Provides [H] option to force figure/table placement
-
+\RequirePackage[style=apa, backend=biber]{biblatex}
+\RequirePackage[acronym, nogroupskip, toc]{glossaries}
 % Polyglossia set language
-+ \setdefaultlanguage[variant=indonesian]{malay}  % Proper Indonesian language setup
-+ \setotherlanguage{english}             % Enables English as secondary language
-
-+ \DefineBibliographyStrings{english}{%  % Customizes bibliography text
-+   andothers={dkk\adddot},              % Changes "et al." to "dkk."
-+   pages={hlm\adddot},                  % Changes "pp." to "hlm."
-+ }
+\setdefaultlanguage[variant=indonesian]{malay}  % Proper Indonesian language setup
+\setotherlanguage{english}             % Enables English as secondary language
+\DefineBibliographyStrings{english}{%  % Customizes bibliography text
+  andothers={dkk\adddot},              % Changes "et al." to "dkk."
+  pages={hlm\adddot},                  % Changes "pp." to "hlm."
+}

 % Conditionally load the watermark package and settings
 \if@draftmark
  \RequirePackage{draftwatermark}
-  \SetWatermarkText{nuluh/thesis (wip) draft: \today}
+  \SetWatermarkText{nuluh/thesis (wip) [draft: \today]}
  \SetWatermarkColor[gray]{0.8}                    % Gray level: 0 = black, 1 = white (0.8 = light gray)
  \SetWatermarkFontSize{1.5cm}
  \SetWatermarkAngle{90}
  \SetWatermarkHorCenter{1.5cm}
+  \RequirePackage[left]{lineno}
+  \linenumbers
 \fi

 % Page layout
-\geometry{left=3cm, top=3cm, right=3cm, bottom=3cm}
+\geometry{left=4cm, top=3cm, right=3cm, bottom=3cm}
 \setlength{\parskip}{0.5em}
-\setlength{\parindent}{0pt}
 \onehalfspacing

 % Fonts
@@ -56,19 +58,45 @@
 \setsansfont{Arial}
 \setmonofont{Courier New}

-% Metadata commands
-\input{metadata}
-
-\newcommand{\setthesisinfo}[7]{%
-  \renewcommand{\thesistitle}{#1}%
-  \renewcommand{\studentname}{#2}%
-  \renewcommand{\studentid}{#3}%
-  \renewcommand{\program}{#4}%
-  \renewcommand{\faculty}{#5}%
-  \renewcommand{\university}{#6}%
-  \renewcommand{\yearofsubmission}{#7}%
+\makeatletter
+% Current year (expands to the value of \year)
+\newcommand{\theyear}{%
+  \the\year
 }

+% Declare internal macros as initially empty
+\newcommand{\@authorid}{}
+\newcommand{\@firstadvisor}{}
+\newcommand{\@secondadvisor}{}
+\newcommand{\@headdepartement}{}
+\newcommand{\@headdepartementid}{}
+\newcommand{\@faculty}{}
+\newcommand{\@program}{}
+\newcommand{\@university}{}
+\newcommand{\@yearofsubmission}{}
+
+% Define user commands to set these values.
+\newcommand{\authorid}[1]{\gdef\@authorid{#1}}
+\newcommand{\firstadvisor}[1]{\gdef\@firstadvisor{#1}}
+\newcommand{\secondadvisor}[1]{\gdef\@secondadvisor{#1}}
+\newcommand{\headdepartement}[1]{\gdef\@headdepartement{#1}}
+\newcommand{\headdepartementid}[1]{\gdef\@headdepartementid{#1}}
+\newcommand{\faculty}[1]{\gdef\@faculty{#1}}
+\newcommand{\program}[1]{\gdef\@program{#1}}
+\newcommand{\university}[1]{\gdef\@university{#1}}
+\newcommand{\yearofsubmission}[1]{\gdef\@yearofsubmission{#1}}
+
+% Now expose robust “the‑” getters to access the values
+\newcommand{\theauthorid}{\@authorid}
+\newcommand{\thefirstadvisor}{\@firstadvisor}
+\newcommand{\thesecondadvisor}{\@secondadvisor}
+\newcommand{\theheaddepartement}{\@headdepartement}
+\newcommand{\theheaddepartementid}{\@headdepartementid}
+\newcommand{\thefaculty}{\@faculty}
+\newcommand{\theprogram}{\@program}
+\newcommand{\theuniversity}{\@university}
+\newcommand{\theyearofsubmission}{\@yearofsubmission}
+\makeatother
 % % Header and footer
 \fancypagestyle{fancy}{%
    \fancyhf{}
@@ -110,11 +138,6 @@
 \renewcommand{\cftchappresnum}{BAB~}
 \renewcommand{\cftchapaftersnum}{\quad}

-% \titlespacing*{\chapter}{0pt}{-10pt}{20pt}
-
-% Redefine \maketitle
-\renewcommand{\maketitle}{\input{frontmatter/maketitle}}
-
 % Chapter & Section format
 \renewcommand{\cftchapfont}{\normalsize\MakeUppercase}
 % \renewcommand{\cftsecfont}{}
@@ -136,11 +159,15 @@
 \setlength{\cftsubsecnumwidth}{2.5em}
 \setlength{\cftfignumwidth}{5em}
 \setlength{\cfttabnumwidth}{4em}
-\renewcommand \cftchapdotsep{1}           % Denser dots (closer together) https://tex.stackexchange.com/a/273764
-\renewcommand \cftsecdotsep{1}            % Apply to sections too
-\renewcommand \cftsubsecdotsep{1}         % Apply to subsections too
+\renewcommand \cftchapdotsep{1} % https://tex.stackexchange.com/a/273764
+\renewcommand \cftsecdotsep{1} % https://tex.stackexchange.com/a/273764
+\renewcommand \cftsubsecdotsep{1} % https://tex.stackexchange.com/a/273764
+\renewcommand \cftfigdotsep{1.5} % https://tex.stackexchange.com/a/273764
+\renewcommand \cfttabdotsep{1.5} % https://tex.stackexchange.com/a/273764
 \renewcommand{\cftchapleader}{\normalfont\cftdotfill{\cftsecdotsep}}
 \renewcommand{\cftchappagefont}{\normalfont}
+
+% Add prefix in the LoF and LoT entries
 \renewcommand{\cftfigpresnum}{\figurename~}
 \renewcommand{\cfttabpresnum}{\tablename~}

@@ -165,6 +192,147 @@
 % \renewcommand{\cfttoctitlefont}{\bfseries\MakeUppercase}
 % \renewcommand{\cftaftertoctitle}{\vskip 2em}

+% Defines a new glossary called “notation”
+\newglossary[nlg]{notation}{not}{ntn}{Notation}
+
+% Define the header for the location column
+\providecommand*{\locationname}{Location}
+
+% Define the new glossary style called 'mylistalt' for main glossaries
+\makeatletter
+\newglossarystyle{mylistalt}{%
+  % start the list, initializing glossaries internals
+  \renewenvironment{theglossary}%
+    {\glslistinit\begin{enumerate}}%
+    {\end{enumerate}}%
+  % suppress all headers/groupskips
+  \renewcommand*{\glossaryheader}{}%
+  \renewcommand*{\glsgroupheading}[1]{}%
+  \renewcommand*{\glsgroupskip}{}%
+  % main entries: let \item produce "1." etc., then break
+  \renewcommand*{\glossentry}[2]{%
+    \item \glstarget{##1}{\glossentryname{##1}}%
+    \mbox{}\\
+    \glossentrydesc{##1}\space 
+    [##2] % appears on page x
+  }%
+  % sub-entries as separate paragraphs, still aligned
+  \renewcommand*{\subglossentry}[3]{%
+    \par
+    \glssubentryitem{##2}%
+    \glstarget{##2}{\strut}\space
+    \glossentrydesc{##2}\space ##3%
+  }%
+}
+
+
+% Define the new glossary style 'altlong3customheader' for notation
+\newglossarystyle{altlong3customheader}{%
+  % The glossary will be a longtable environment with three columns:
+  % 1. Symbol (left-aligned)
+  % 2. Description (paragraph, width \glsdescwidth)
+  % 3. Location (paragraph, width \glspagelistwidth)
+  \renewenvironment{theglossary}%
+    {\begin{longtable}{lp{\glsdescwidth}p{\glspagelistwidth}}}%
+    {\end{longtable}}%
+  % Define the table header row
+  \renewcommand*{\symbolname}{Simbol}
+  \renewcommand*{\descriptionname}{Keterangan}
+  \renewcommand*{\locationname}{Halaman}
+  \renewcommand*{\glossaryheader}{%
+    \bfseries\symbolname & \bfseries\descriptionname & \bfseries\locationname \tabularnewline\endhead}%
+  % Suppress group headings (e.g., A, B, C...)
+  \renewcommand*{\glsgroupheading}[1]{}%
+  % Define how a main glossary entry is displayed
+  % ##1 is the entry label
+  % ##2 is the location list (page numbers)
+  \renewcommand{\glossentry}[2]{%
+    \glsentryitem{##1}% Inserts entry number if entrycounter option is used
+    \glstarget{##1}{\glossentryname{##1}} & % Column 1: Symbol (with hyperlink target)
+    \glossentrydesc{##1}\glspostdescription & % Column 2: Description (with post-description punctuation)
+    ##2\tabularnewline % Column 3: Location list
+  }%
+  % Define how a sub-entry is displayed
+  % ##1 is the sub-entry level (e.g., 1 for first sub-level)
+  % ##2 is the entry label
+  % ##3 is the location list
+  \renewcommand{\subglossentry}[3]{%
+    & % Column 1 (Symbol) is left blank for sub-entries to create an indented look
+    \glssubentryitem{##2}% Inserts sub-entry number if subentrycounter is used
+    \glstarget{##2}{\strut}\glossentrydesc{##2}\glspostdescription & % Column 2: Description (target on strut for hyperlink)
+    ##3\tabularnewline % Column 3: Location list
+  }%
+  % Define the skip between letter groups (if group headings were enabled)
+  % For 3 columns, we need 2 ampersands for a full blank row if not using \multicolumn
+  \ifglsnogroupskip
+    \renewcommand*{\glsgroupskip}{}%
+  \else
+    \renewcommand*{\glsgroupskip}{& & \tabularnewline}%
+  \fi
+}
+
+% Define a new style 'supercol' based on 'super' for acronyms glossaries
+\newglossarystyle{supercol}{%
+  \setglossarystyle{super}% inherit everything from the original
+  % override just the main-entry format:
+  \renewcommand*{\glossentry}[2]{%
+    \glsentryitem{##1}%
+    \glstarget{##1}{\glossentryname{##1}}\space  % <-- added colon here
+    &: \glossentrydesc{##1}\glspostdescription\space ##2\tabularnewline
+  }%
+  % likewise for sub‐entries, if you want a colon there too:
+  \renewcommand*{\subglossentry}[3]{%
+    &: 
+    \glssubentryitem{##2}%
+    \glstarget{##2}{\strut}\glossentryname{##2}\space % <-- and here
+    \glossentrydesc{##2}\glspostdescription\space ##3\tabularnewline
+  }%
+}
+\makeatother
+
+% A new command that enables us to enter bi-lingual (Bahasa Indonesia and English) terms
+% syntax: \addterm[options]{label}{Bahasa Indonesia}{Bahasa Indonesia first use}{English}{Bahasa Indonesia
+% description}
+\newcommand{\addterm}[6][]{
+  \newglossaryentry{#2}{
+    name={#3 (angl.\ #5)},
+    first={#4 (\emph{#5})},
+    text={#3},
+    sort={#3},
+    description={#6},
+    #1 % pass additional options to \newglossaryentry
+  }
+}
+
+% A new command that enables us to enter (English) acronyms with bi-lingual
+% (Bahasa Indonesia and English) long versions
+% syntax: \addacronym[options]{label}{abbreviation}{Bahasa Indonesia long}{Bahasa Indonesia first
+% use long}{English long}{Bahasa Indonesia description}
+\newcommand{\addacronym}[7][]{
+  % Create the main glossary entry with \newacronym
+  % \newacronym[key-val list]{label}{abbrv}{long}
+  \newacronym[
+    name={#4 (angl.\ #6,\ #3)},
+    first={\emph{#5} (angl.\ \emph{#6},\ \emph{#3})},
+    sort={#4},
+    description={#7},
+    #1 % pass additional options to \newglossaryentry
+    ]
+    {#2}{#3}{#4}
+  % Create a cross-reference from the abbreviation to the main glossary entry by
+  % creating an auxiliary glossary entry (note: we set the label of this entry
+  % to '<original label>_auxiliary' to avoid clashes)
+  \newglossaryentry{#2_auxiliary}{
+    name={#3},
+    sort={#3},
+    description={\makefirstuc{#6}},
+    see=[See:]{#2}
+  }
+}
+
+% Change the text of the cross-reference links to the Bahasa Indonesia long version.
+\renewcommand*{\glsseeitemformat}[1]{\emph{\acrlong{#1}}.}
+
 % % Apply a custom fancyhdr layout only on the first page of each \chapter, and use no header/footer elsewhere
 % % \let\oldchapter\chapter
 % % \renewcommand{\chapter}{%
--- a/setup.py
+++ b/setup.py
@@ -0,0 +1,8 @@
+from setuptools import setup, find_packages
+
+setup(
+    name="thesisrepo",
+    version="0.1",
+    packages=find_packages(where="code"),
+    package_dir={"": "code"},
+)
Author	SHA1	Message	Date
nuluh	4b0819f94e	feat(notebooks): Enhance STFT notebook and model selection functionality - Updated paths in the STFT notebook to reflect new data files. - Improved plotting aesthetics for combined plots and added grid lines. - Introduced a 3D spectrogram visualization for better data representation. - Refactored model training function to include error handling and model export functionality. - Adjusted model training calls to include export paths for saved models. Closes #90 - Added additional markdown cells for better documentation and clarity in the notebook.	2025-06-12 03:35:21 +07:00
nuluh	f5dada1b9c	fix(latex): fix image path for flowchart in methodology section	2025-06-04 15:59:13 +07:00
nuluh	37c9a0765a	fix(documentclass): remove language option from biblatex package	2025-06-04 15:53:57 +07:00
nuluh	8656289a1c	chore(documentclass): comment out table of contents for temporary removal	2025-06-04 15:53:35 +07:00
nuluh	15fe8339ec	feat(documentclass): add new glossary for notation	2025-06-04 15:31:00 +07:00
nuluh	44210ef372	chore(latex): comment out maketitle inputs for temporary	2025-06-04 11:27:56 +07:00
nuluh	9192d4c81c	chore(documentclass): remove commented-out code for chapter formatting and header layout	2025-06-03 21:37:32 +07:00
nuluh	0373743ca7	fix(documentclass): enhance dot separation in ToC and add prefixes for figures and tables	2025-06-03 21:34:05 +07:00
nuluh	49d6395e6f	fix(documentclass): add missing \RequirePackage{titling} for maketitle formatting	2025-06-03 21:16:34 +07:00
nuluh	bf9cca2d90	feat(documentclass): redefine metadata information in main.tex by consolidating internal commands inside thesis.cls and remove metadata.tex Closes #96	2025-06-03 21:13:28 +07:00
nuluh	08420296e6	fix(documentclass): add missing \makeatother command to properly close the @ symbol	2025-06-03 20:59:11 +07:00
nuluh	1540213eec	feat(documentclass): add commands for bilingual terms and acronyms with custom glossary entries	2025-06-03 20:58:18 +07:00
nuluh	6fd4b7465e	feat(documentclass): add new glossary style 'supercol' for enhanced acronym formatting Closes #85	2025-06-03 20:55:26 +07:00
nuluh	85a0aebf36	feat(documentclass): add custom glossary style 'altlong3customheader' for notation with three-column layout Closes #95	2025-06-03 20:54:45 +07:00
nuluh	8d1edfdbf7	feat(glossaries): add glossary support with custom style for main glossaries entry and location header Closes 84	2025-06-03 20:52:54 +07:00
nuluh	ff862d9467	fix(documentclass): adjust page layout by increasing left margin to 4cm	2025-06-03 20:39:03 +07:00
nuluh	dfb64db1d8	feat(documentclass): add draft watermark and optional line numbering with 'draftmark' option	2025-06-03 20:37:29 +07:00
Rifqi D. Panuluh	3e3de577ba	Merge pull request #94 from nuluh/latex/91-bug-expose-maketitle Maketitle Replaced with \input for Flexibility when integrated with latexdiff-latexpand Workflow	2025-06-03 20:16:30 +07:00
nuluh	76a09c0219	refactor(documentclass): update title handling by using input files for maketitle Closes #91	2025-06-03 19:17:08 +07:00
nuluh	1a994fd59c	fix(documentclass): restore and customize English bibliography strings	2025-06-03 19:10:01 +07:00
nuluh	cdb3010b78	fix(documentclass): fix redefined bibliography strings error	2025-06-03 19:05:43 +07:00
nuluh	8a3c1ae585	refactor(main): comment out unused input sections and update chapter includes	2025-06-03 16:37:15 +07:00
nuluh	7b934d3fba	fix(acknowledgement): fix file naming	2025-06-03 15:02:12 +07:00
nuluh	aaccad7ae8	feat(glossaries): wip	2025-06-01 16:47:32 +07:00
Rifqi D. Panuluh	2c453ec403	Merge pull request #89 from nuluh/feature/88-refactor-training-cell Closes #88	2025-05-29 23:04:24 +07:00
nuluh	7da3179d08	refactor(nb): Create and implement helper function `train_and_evaluate_model`	2025-05-29 22:57:28 +07:00
nuluh	254b24cb21	feat(viz): Update plotting for STFT data visualization with color map 'jet' and added color bar	2025-05-29 20:35:35 +07:00
Rifqi D. Panuluh	d151062115	Add Working Milestone with Initial Results and Model Inference (#82 ) * wip: add function to create stratified train-test split from STFT data * feat(src): implement working function for dataset B to create ready data from STFT files stft_files and add setup.py for package configuration * feat(notebook): Update variable names for clarity, remove unused imports, and streamline data processing. Implement data concatenation using pandas concat for efficiency. Add validation steps for Dataset B and improve model training consistency across sensors. * fix(.gitignore): add rule to ignore egg-info directories and ensure proper formatting * docs(README): add instructions for running stft.ipynb notebook * feat(notebook): Add evaluation metrics and confusion matrix visualizations for model predictions on Dataset B. Remove commented-out code and integrate data preparation using create_ready_data function. --------- Co-authored-by: nuluh <dam.ar@outlook.com>	2025-05-24 01:30:10 +07:00