The proliferation of Internet of Things (IoT) devices has significantly increased data traffic, necessitating robust security measures to protect against latent threats. Traditional anomaly detection methods often struggle to keep pace with the dynamic and diverse nature of IoT environments, particularly in use cases where little training data is available, which calls for adaptive and efficient solutions. In this work in progress, we propose fine-tuned Transformer models for anomaly detection in an IoT traffic use case. Fine-tuning allows a pre-trained model to be adapted to a new domain in which only scarce training data is available. Specifically, we fine-tune Transformer architectures on the CIC IoMT 2024 dataset and evaluate them on the Aposemat IoT-23 dataset. Compared to traditional machine learning techniques, our approach demonstrates promising performance improvements, bridging the gap toward novel Transformer-based architectures capable of providing supervised anomaly detection, even with highly limited training datasets.
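To make the fine-tuning pattern concrete, the following is a minimal sketch, not the paper's actual architecture, which the abstract does not specify. It assumes pre-extracted per-flow feature vectors (as produced for datasets such as CIC IoMT 2024) and illustrates the standard low-data adaptation strategy: freeze an (assumed pre-trained) Transformer encoder and train only a lightweight classification head on the scarce target-domain data. The class name `FlowAnomalyDetector` and all hyperparameters are hypothetical.

```python
# Illustrative sketch only; the concrete model in the paper may differ.
import torch
import torch.nn as nn

class FlowAnomalyDetector(nn.Module):
    def __init__(self, num_features: int, d_model: int = 64, num_classes: int = 2):
        super().__init__()
        # Embed each scalar flow feature, treating the feature vector
        # as a short "token" sequence for the Transformer encoder.
        self.embed = nn.Linear(1, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, num_classes)

    def forward(self, x):  # x: (batch, num_features)
        tokens = self.embed(x.unsqueeze(-1))        # (batch, num_features, d_model)
        pooled = self.encoder(tokens).mean(dim=1)   # mean-pool over feature tokens
        return self.head(pooled)                    # class logits

model = FlowAnomalyDetector(num_features=32)

# Fine-tuning step: freeze the encoder (assumed pre-trained on a source
# domain) and adapt only the head, the usual choice when labelled
# target-domain samples are scarce.
for p in model.encoder.parameters():
    p.requires_grad = False

optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3
)
criterion = nn.CrossEntropyLoss()

# Synthetic stand-in batch; in practice these would be flow features
# and benign/malicious labels from the target dataset.
x = torch.randn(8, 32)
y = torch.randint(0, 2, (8,))
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
```

Freezing the encoder keeps the number of trainable parameters small relative to the few available labels; a common variant is to unfreeze the top encoder layers once the head has converged.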