This repository provides a set of scripts to train anomaly detection models on door sensor data and then run inference (predict anomalies) on new data. The workflow relies on Python 3.11 and several common data science libraries. All dependencies and environment setup are managed by Poetry via a `pyproject.toml` file.
```
.
├── Doors_example/                      # Dataset
│   ├── csv_labeled/                    # Folder for labeled CSV files
│   ├── csv_unlabeled/                  # Folder for unlabeled CSV files
│   ├── label_json_mini/                # Dataset in JSON format
│   └── doors_anomaly_detector_config   # Configuration file for the system
├── my_utils.py                         # Common utility functions: CSV loading, resampling, wagon filtering, metrics plotting, etc.
├── my_anomaly_train.py                 # Main script to train anomaly detection models
├── my_anomaly_inference.py             # Main script to run anomaly detection inference using trained models
├── trained_models/                     # Directory where trained model files (.pkl) are saved
├── training_stats/                     # Directory for saving training-related statistics (if needed)
├── result/                             # Directory for saving inference results (.csv)
├── pyproject.toml                      # Poetry configuration for environment dependencies
└── README.md                           # This file
```

Follow the steps below to create and activate the environment using Poetry:
```bash
# 1. Install Poetry if not already installed:
pip install poetry

# 2. Install dependencies from pyproject.toml:
poetry install

# 3. Activate the environment shell:
poetry shell
```

From within this Poetry environment, you can run all scripts without conflicts.
`my_anomaly_train.py` handles reading multiple CSV files, optionally resampling them, filtering for a specific wagon, and training one or more anomaly detection models. The script:

- Reads multiple CSV files (paths are currently hardcoded as examples).
- Filters columns for a specific wagon if needed.
- Resamples the data (e.g., every `10s`).
- Splits the data into train/validation sets.
- Trains models (e.g., `IsolationForest`, `OneClassSVM`), either one per sensor or a single multivariate model.
- Saves trained model files (`.pkl`) in `trained_models/` (a minimal sketch of the pipeline follows this list).
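Under the hood, the pipeline is roughly like the following minimal sketch, assuming `pandas`, `scikit-learn`, and `joblib`. The glob path, the `timestamp` column name, and the 10-second resample rule are illustrative placeholders, not the script's actual values:

```python
import glob

import joblib
import pandas as pd
from sklearn.ensemble import IsolationForest

# Gather training CSVs (hypothetical path; the real script hardcodes its own).
paths = glob.glob("Doors_example/csv_labeled/*.csv")
frames = [pd.read_csv(p, parse_dates=["timestamp"], index_col="timestamp")
          for p in paths]
data = pd.concat(frames).sort_index()

# Optional resampling, e.g. onto a 10-second grid (mean per bin).
data = data.resample("10s").mean().dropna()

# One model per sensor column, each persisted as a .pkl file.
for sensor in data.columns:
    model = IsolationForest(random_state=42)
    model.fit(data[[sensor]])
    joblib.dump(model, f"trained_models/model_isolation_forest_{sensor}.pkl")
```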
To run:

```bash
python my_anomaly_train.py
```

By default, you'll see multiple `.pkl` files in `trained_models/` (e.g., `model_one_class_svm_ALL.pkl` or `model_isolation_forest_01_1_0040.pkl`).
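To sanity-check a trained model, you can load a saved `.pkl` back and inspect it. This assumes the models were serialized with `joblib`; plain `pickle` works analogously if that is what the script uses:

```python
import joblib

# Load one of the persisted models (filename from the examples above)
# and inspect which estimator and hyperparameters it contains.
model = joblib.load("trained_models/model_one_class_svm_ALL.pkl")
print(type(model).__name__)   # e.g., OneClassSVM
print(model.get_params())
```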
`my_anomaly_inference.py` loads those `.pkl` model files from `trained_models/`, reads new CSV test data, does the same optional resampling/filtering, and applies each model to predict anomalies. The script:

- Reads CSV files for test data (hardcoded in the script).
- Resamples/filters similarly to training.
- Loads each `.pkl` from `trained_models/`.
- Produces a DataFrame with, for each sensor `<sensor_name>`, the columns:
  - `anomaly_score` (the negative of `decision_function`; higher means more anomalous)
  - `anomaly_probability` (a basic sigmoid transform of the score)
  - `prediction` (`"anomaly"` or `"normal"`)
- If the test data includes an `anomaly` column, it computes and optionally plots Precision/Recall/F1/etc.
- Saves results to `result/` as CSV (a scoring sketch follows this list).
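Conceptually, the per-sensor scoring works like this minimal sketch. The exact sigmoid scaling and column naming are assumptions for illustration, not the script's verbatim implementation:

```python
import numpy as np
import pandas as pd

def score_sensor(model, values: pd.DataFrame, sensor: str) -> pd.DataFrame:
    """Score one sensor's test data with one trained model."""
    out = pd.DataFrame(index=values.index)
    # Negate decision_function so that higher scores mean "more anomalous".
    score = -model.decision_function(values)
    out[f"{sensor}_anomaly_score"] = score
    # Basic sigmoid transform squashing the score into (0, 1).
    out[f"{sensor}_anomaly_probability"] = 1.0 / (1.0 + np.exp(-score))
    # scikit-learn outlier detectors return -1 for outliers, +1 for inliers.
    out[f"{sensor}_prediction"] = np.where(
        model.predict(values) == -1, "anomaly", "normal")
    return out
```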
To run:

```bash
python my_anomaly_inference.py
```

Outputs:

- Trained model files (`.pkl`) are saved in `trained_models/`.
- Training stats (if any) go to `training_stats/`.
- Inference results (`.csv`) appear in `result/`.
- Dates/paths: Hardcoded in `my_anomaly_train.py` and `my_anomaly_inference.py`. Change them to suit your data structure.
- Rolling features: If `use_features=True`, a range of rolling stats and correlations is computed in `_add_features()`. Tweak the rolling windows or remove that logic to reduce complexity (see the sketch after this list).
- Individual vs. multivariate: If `individual_model=True`, you'll get multiple `.pkl` files (one per sensor). If `False`, you get just one file per model type (`ALL` in the filename).
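For reference, rolling features of the kind `_add_features()` computes might look like the following. This is a hedged sketch: the window size, feature names, and helper name are assumptions, since the actual implementation lives in the scripts:

```python
import pandas as pd

def add_rolling_features(df: pd.DataFrame, window: str = "60s") -> pd.DataFrame:
    """Illustrative rolling stats per sensor (not the repo's exact _add_features())."""
    out = df.copy()
    for col in df.columns:
        roll = df[col].rolling(window)
        out[f"{col}_roll_mean"] = roll.mean()
        out[f"{col}_roll_std"] = roll.std()
    # Pairwise rolling correlation between the first two sensors, as an example.
    if len(df.columns) >= 2:
        a, b = df.columns[:2]
        out[f"{a}_{b}_roll_corr"] = df[a].rolling(window).corr(df[b])
    return out.dropna()
```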