Toward more generalized Malicious URL Detection Models

Bibliographic Details
Title:	Toward more generalized Malicious URL Detection Models
Authors:	Tsai, YunDa; Liow, Cayon; Siang, Yin Sheng; Lin, Shou-De
Publication Year:	2022
Collection:	ArXiv.org (Cornell University Library)
Subject Terms:	Computer Science - Machine Learning, Computer Science - Cryptography and Security
Description:	This paper reveals a data bias issue that can severely degrade performance when building a machine learning model for malicious URL detection. We describe how such bias can be identified using interpretable machine learning techniques, and further argue that such biases naturally exist in the real-world security data used to train classification models. We then propose a debiased training strategy that can be applied to most deep-learning-based models to alleviate the negative effects of the biased features. The solution uses self-supervised adversarial training to train deep neural networks to learn invariant embeddings from biased data. We conduct a wide range of experiments to demonstrate that the proposed strategy leads to significantly better generalization for both CNN-based and RNN-based detection models.
Document Type:	text
Language:	unknown
Relation:	http://arxiv.org/abs/2202.10027
Availability:	http://arxiv.org/abs/2202.10027
Accession Number:	edsbas.FF1FEAC4
Database:	BASE
