Multimodal Large Language Models for Phishing Webpage Detection and Identification

Published

Aug 12, 2024

Updated

Aug 12, 2024

Can AI Stop Phishing? Multimodal LLMs Show Promise

Multimodal Large Language Models for Phishing Webpage Detection and Identification

Jehyun Lee|Peiyuan Lim|Bryan Hooi|Dinil Mon Divakaran

https://arxiv.org/abs/2408.05941v1

Summary

Phishing attacks remain a pervasive threat. With almost a million unique phishing web pages appearing in the first quarter of 2024 alone, traditional defenses are struggling to keep up. Blacklists quickly become outdated, and even advanced machine learning models can be evaded by phishing techniques that closely mimic legitimate sites.

Researchers are now turning to multimodal Large Language Models (LLMs) to tackle this challenge. Trained on massive amounts of data, these models can analyze not just the text of a webpage but also visual elements such as logos, themes, and favicons to identify the brand a page is impersonating. In a new study, researchers built a two-phase system using LLMs. The first phase uses the LLM to identify the brand being represented; the second verifies whether the website's domain matches that brand. This helps distinguish between, say, a real PayPal page and a fake one hosted on a suspicious domain.

The research team tested GPT-4, Gemini Pro 1.0, and Claude 3 and found that all three showed promising results, particularly when given both screenshots and HTML data. GPT-4 and Claude 3 performed especially well, achieving high accuracy in detecting phishing attempts while also explaining their decisions. These LLM-based systems also outperformed existing state-of-the-art brand-based phishing detectors and showed greater resilience against adversarial attacks designed to fool AI.

Using LLMs for security opens new opportunities but also presents new challenges. The cost of running these models for real-time analysis, the potential for economic denial-of-service attacks, and the risk of indirect prompt injection are all important considerations. Because these LLMs are widely accessible, attackers can also study them to develop new evasion techniques. Even so, the research highlights the potential of multimodal LLMs as a significant step forward in the fight against online phishing.

Question & Answers

How does the two-phase LLM system work to detect phishing websites?

The system runs two phases in sequence. Phase 1: the LLM analyzes visual elements (screenshots, logos, themes) together with HTML data to identify the brand the webpage represents. Phase 2: the system checks whether the website's domain name legitimately belongs to the identified brand. For example, if the LLM identifies PayPal branding in Phase 1 but the domain is 'paypa1-secure.com' instead of 'paypal.com', the system flags the page as a potential phishing attempt. The approach draws on both the visual and the textual analysis capabilities of multimodal LLMs such as GPT-4 and Claude 3.
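
The two phases can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's implementation: the brand-to-domain table, the function names, and the keyword-matching stand-in for the multimodal LLM call are all hypothetical, and the domain extraction is deliberately naive (a production system would consult the Public Suffix List).

```python
from typing import Callable, Optional
from urllib.parse import urlparse

# Hypothetical table of brands and their legitimate registered domains.
KNOWN_BRAND_DOMAINS = {
    "paypal": {"paypal.com"},
}


def registered_domain(url: str) -> str:
    """Return the last two labels of the hostname (naive; ignores suffixes like co.uk)."""
    host = (urlparse(url).hostname or "").lower()
    labels = host.split(".")
    return ".".join(labels[-2:]) if len(labels) >= 2 else host


def identify_brand_stub(screenshot: bytes, html: str) -> Optional[str]:
    """Stand-in for phase 1. A real system would send the screenshot and HTML
    to a multimodal LLM; here we only look for a known brand name in the HTML."""
    text = html.lower()
    for brand in KNOWN_BRAND_DOMAINS:
        if brand in text:
            return brand
    return None


def classify(
    url: str,
    screenshot: bytes,
    html: str,
    identify_brand: Callable[[bytes, str], Optional[str]] = identify_brand_stub,
) -> str:
    """Phase 1 identifies the brand; phase 2 checks the domain against it."""
    brand = identify_brand(screenshot, html)
    if brand is None:
        return "unknown"  # no recognizable brand, so nothing to verify
    legitimate = KNOWN_BRAND_DOMAINS.get(brand.lower(), set())
    return "benign" if registered_domain(url) in legitimate else "phishing"


if __name__ == "__main__":
    page = "<title>Log in to your PayPal account</title>"
    print(classify("https://paypa1-secure.com/login", b"", page))
    print(classify("https://www.paypal.com/signin", b"", page))
```

Passing the brand identifier in as a callable keeps the domain check independent of whichever model performs phase 1.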

What are the main benefits of AI-powered phishing detection for everyday internet users?

AI-powered phishing detection offers enhanced protection during daily online activities. It acts like a smart guardian that can instantly analyze websites you visit, checking not just the text but also visual elements like logos and layout to determine if they're genuine. The main benefits include real-time protection while shopping or banking online, reduced risk of falling for sophisticated scams, and peace of mind when entering sensitive information. For instance, it can warn you immediately if a website pretending to be your bank uses suspicious elements or domains, helping prevent financial fraud and identity theft before it happens.

How can businesses protect themselves from phishing attacks in 2024?

Businesses can implement a multi-layered approach to protect against phishing attacks. This includes deploying advanced AI-powered detection tools that can analyze both visual and textual elements of suspicious websites, providing regular security awareness training to employees, and maintaining updated email security protocols. It's also crucial to use strong authentication methods like 2FA, keep software and systems updated, and establish clear protocols for handling sensitive information. Regular security audits and penetration testing can help identify vulnerabilities before they're exploited by attackers.

PromptLayer Features

Testing & Evaluation
The paper's systematic evaluation of multiple LLMs' phishing detection capabilities aligns with PromptLayer's testing infrastructure

Implementation Details

Set up batch tests comparing LLM responses across different phishing scenarios, implement scoring metrics for accuracy, create regression tests for known phishing patterns
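
As a rough illustration of the scoring step, the sketch below computes accuracy, precision, and recall for one model's batch of predictions. It is plain Python and does not use any PromptLayer API; the `Scores` class, the `score` function, and the sample data are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Scores:
    accuracy: float
    precision: float
    recall: float


def score(predictions: Sequence[bool], labels: Sequence[bool]) -> Scores:
    """Compare predictions with ground truth, where True means 'phishing'."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    pairs = list(zip(predictions, labels))
    tp = sum(1 for p, y in pairs if p and y)          # phishing caught
    fp = sum(1 for p, y in pairs if p and not y)      # benign page flagged
    fn = sum(1 for p, y in pairs if not p and y)      # phishing missed
    tn = sum(1 for p, y in pairs if not p and not y)  # benign page passed
    total = len(pairs)
    return Scores(
        accuracy=(tp + tn) / total if total else 0.0,
        precision=tp / (tp + fp) if (tp + fp) else 0.0,
        recall=tp / (tp + fn) if (tp + fn) else 0.0,
    )


if __name__ == "__main__":
    # Made-up labels and predictions, purely for illustration.
    labels = [True, True, False, False]
    model_a = [True, False, False, False]
    print(score(model_a, labels))
```

Running the same labeled batch through each model and comparing the resulting `Scores` gives a simple regression check: a drop in recall on known phishing pages after a prompt change is a signal to investigate.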

Key Benefits

• Standardized evaluation across multiple LLM models
• Reproducible testing framework for phishing detection
• Automated performance monitoring and comparison

Potential Improvements

• Add specialized phishing-specific scoring metrics
• Implement adversarial test cases
• Develop automated prompt optimization

Business Value

Efficiency Gains

Reduces manual testing effort by 70% through automated evaluation pipelines

Cost Savings

Optimizes LLM usage by identifying most cost-effective model for phishing detection

Quality Improvement

Ensures consistent detection accuracy through systematic testing

Workflow Management
The two-phase detection system maps directly to PromptLayer's multi-step orchestration capabilities

Implementation Details

Create separate prompt templates for brand identification and domain verification, chain them in sequence, track versions of each component
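
One way to picture this is two versioned templates chained through a generic model call. The sketch uses only the Python standard library rather than any PromptLayer API; the `PromptTemplate` class, the prompt wording, and the `call_model` callable are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from string import Template
from typing import Callable, Dict


@dataclass(frozen=True)
class PromptTemplate:
    name: str
    version: int
    text: Template

    def render(self, **fields: str) -> str:
        """Fill the template's $placeholders with the given fields."""
        return self.text.substitute(**fields)


# Illustrative prompts; the wording is an assumption, not taken from the paper.
BRAND_ID = PromptTemplate(
    "brand_identification", 1,
    Template("Identify the brand shown on this page. HTML:\n$html"),
)
DOMAIN_CHECK = PromptTemplate(
    "domain_verification", 1,
    Template("Does the domain $domain belong to the brand $brand? Answer yes or no."),
)


def run_chain(html: str, domain: str, call_model: Callable[[str], str]) -> Dict[str, object]:
    """Run brand identification, then domain verification, recording template versions."""
    brand = call_model(BRAND_ID.render(html=html)).strip()
    answer = call_model(DOMAIN_CHECK.render(brand=brand, domain=domain)).strip().lower()
    return {
        "brand": brand,
        "domain_matches": answer.startswith("yes"),
        "versions": {
            BRAND_ID.name: BRAND_ID.version,
            DOMAIN_CHECK.name: DOMAIN_CHECK.version,
        },
    }
```

Recording the template versions alongside each result makes it possible to trace a change in detection behavior back to the specific prompt revision that caused it.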

Key Benefits

• Modular system design for easier maintenance
• Version control for both phases
• Reusable components for different security applications

Potential Improvements

• Add parallel processing capabilities
• Implement feedback loops between phases
• Create automated prompt updating system

Business Value

Efficiency Gains

Streamlines deployment and updates of multi-stage security systems

Cost Savings

Reduces development time by 40% through reusable components

Quality Improvement

Enables precise tracking and optimization of each detection phase
