Phishing websites continue to evolve as communication technology improves, and they are a cornerstone of criminal activity on the internet.
The spread of malicious Uniform Resource Locators (URLs) is one of the main methods used to conduct phishing attacks.
In addition to traditional defenses, such as blacklist filtering, educating users about the characteristics of potentially malicious URLs has proved to be one of the most effective ways to prevent damage caused by phishing attacks. Inspired by the Generative Adversarial Network (GAN) method, this research aims to learn the patterns of phishing URLs and generate synthetic URLs that could be used for phishing education or training. Our system has two parts: a phishing URL generator and a detector.
During training, the generator learns to produce seemingly benign URLs from the feedback provided by the detector. These URLs are synthetic, yet they resemble the ones a phisher could use in an attack. The detector learns to distinguish benign URLs from both real phishing URLs and synthetic ones. Within the GAN framework, the two parts co-evolve, each improving through feedback from the other.
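
The generator and detector loop described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the system evaluated in this work: the URL strings are made up, the detector is a logistic regression over character frequencies, and the generator is a per-position categorical distribution updated with a REINFORCE-style policy gradient (a common stand-in when the generated output is discrete text). All names, such as `Detector`, `Generator`, and `train`, are hypothetical.

```python
import math
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789.-/"
URL_LEN = 16  # assumption: fixed-length strings keep the toy generator simple

# Made-up toy strings for illustration only; not real data.
BENIGN = ["example.com/home", "example.org/docs", "example.net/blog"]
PHISHING = ["examp1e-login.co", "secure-0example.", "example.verify-1"]


def features(url):
    """Character-frequency vector plus a constant bias term."""
    return [url.count(ch) / len(url) for ch in ALPHABET] + [1.0]


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


class Detector:
    """Logistic regression that outputs P(url is benign)."""

    def __init__(self):
        self.w = [0.0] * (len(ALPHABET) + 1)

    def p_benign(self, url):
        return sigmoid(sum(w * x for w, x in zip(self.w, features(url))))

    def update(self, url, label, lr=0.5):
        # One gradient-ascent step on the log-likelihood.
        err = label - self.p_benign(url)
        self.w = [w + lr * err * x for w, x in zip(self.w, features(url))]


class Generator:
    """Independent categorical distribution over characters per position."""

    def __init__(self):
        self.logits = [[0.0] * len(ALPHABET) for _ in range(URL_LEN)]

    @staticmethod
    def _probs(row):
        m = max(row)
        exps = [math.exp(v - m) for v in row]
        total = sum(exps)
        return [e / total for e in exps]

    def sample(self, rng):
        """Return one character index per position."""
        return [rng.choices(range(len(ALPHABET)), weights=self._probs(row))[0]
                for row in self.logits]

    def update(self, idx, advantage, lr=0.5):
        # REINFORCE: raise the log-probability of the sampled characters
        # in proportion to the advantage (reward minus baseline).
        for row, chosen in zip(self.logits, idx):
            probs = self._probs(row)
            for k in range(len(row)):
                indicator = 1.0 if k == chosen else 0.0
                row[k] += lr * advantage * (indicator - probs[k])


def to_url(idx):
    return "".join(ALPHABET[i] for i in idx)


def train(rounds=200, seed=0):
    rng = random.Random(seed)
    gen, det = Generator(), Detector()
    baseline = 0.5  # running average of rewards
    for _ in range(rounds):
        idx = gen.sample(rng)
        fake = to_url(idx)
        # Detector step: benign is labeled 1; real phishing and synthetic are 0.
        det.update(rng.choice(BENIGN), 1.0)
        det.update(rng.choice(PHISHING), 0.0)
        det.update(fake, 0.0)
        # Generator step: the reward is how benign the detector finds the fake.
        reward = det.p_benign(fake)
        gen.update(idx, reward - baseline)
        baseline = 0.9 * baseline + 0.1 * reward
    return gen, det
```

Calling `train()` returns the two trained parts; `to_url(gen.sample(random.Random(0)))` then draws one synthetic string, and `det.p_benign(...)` scores it. With such a small alphabet model and toy data the samples will not look like real URLs; the sketch is only meant to show the alternating feedback between the two parts.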