Article

Addressing class imbalance in VeReMi dataset: A CGAN-driven approach

Author

Abstract

In vehicular ad-hoc networks (VANETs), ensuring the reliability of vehicle-to-vehicle communications requires robust intrusion detection systems (IDS) capable of identifying malicious behaviours. One of the main challenges in developing reliable solutions for vehicular networks is identifying a representative dataset that comprehensively captures all types of misbehaviours. Many existing datasets lack diversity, are outdated, or do not accurately reflect real-world conditions. This makes it difficult to ensure that all potential attack scenarios are properly represented and evaluated, ultimately impacting the reliability and generalisation of the developed approaches.

The VeReMi dataset is a widely recognised benchmark that attempts to address these issues, featuring various simulated misbehaviours such as Constant Position, Constant Offset, Random Position, Random Offset, and Eventual Stop attacks. While attacks like Constant Position and Random Position are relatively easy to detect due to their erratic nature, more subtle attacks, including Constant Offset and Eventual Stop, closely mimic legitimate behaviours, making detection significantly more challenging. Additionally, the dataset’s naturally imbalanced distribution, where attack instances are vastly outnumbered by benign traffic, further complicates detection.

To address this imbalance, the project focuses exclusively on oversampling techniques to enhance minority class representation, deliberately avoiding undersampling to preserve valuable benign data. Conventional methods, such as the Synthetic Minority Over-sampling Technique (SMOTE) and Adaptive Synthetic Sampling (ADASYN), improve class balance but often fail to capture the underlying complexity of subtle attacks. In contrast, more recent approaches based on Generative Adversarial Networks (GANs), including Conditional GANs (CGANs) and Wasserstein GANs (WGANs), demonstrate superior capabilities in synthesising realistic and diverse minority instances. Leveraging these advances, the project adopts CGAN-based oversampling to better model the nuanced behaviours present in vehicular networks, thereby enhancing the IDS’s ability to detect both obvious and subtle misbehaviours with greater generalisation and robustness. In doing so, it aims to support researchers in developing more scalable and effective solutions for real-world VANET environments.
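
As a rough illustration of the conventional baseline mentioned above, the following minimal sketch shows SMOTE-style oversampling: each synthetic minority point is placed on the line segment between an existing minority sample and one of its nearest minority neighbours. It is not the CGAN approach adopted in this work, and it is not the reference SMOTE implementation. The function name `smote_like_oversample`, the parameter `k`, and the toy two-feature data are all hypothetical and are not drawn from VeReMi.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbour.
        t = rng.random()
        synthetic.append(tuple(b + t * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Hypothetical toy data: (position offset, speed) pairs, not taken from VeReMi.
benign = [(0.1 * i, 10.0 + 0.05 * i) for i in range(20)]
attack = [(5.0, 0.0), (5.2, 0.1), (4.9, 0.3), (5.1, 0.2)]

# Oversample only the minority class; the benign data is left untouched.
new_attack = smote_like_oversample(attack, n_new=len(benign) - len(attack))
balanced_attack = attack + new_attack
```

Because every synthetic point lies between two existing minority samples, this kind of interpolation cannot produce behaviour outside the region the minority class already covers, which is one plausible reading of why such methods may struggle with subtle attacks.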

Keywords: vehicular ad-hoc networks, vehicle-to-vehicle communications, VeReMi dataset, Conditional GANs

How to Cite:

Sbai, F., (2026) “Addressing class imbalance in VeReMi dataset: A CGAN-driven approach”, New Vistas 12(1). doi: https://doi.org/10.36828/newvistas.399


Published on
2026-05-21
