Vishing
Vishing (short for voice phishing) is a social-engineering attack conducted over a phone call or voice channel. The attacker poses as a trusted party (a bank, the IRS, IT support, a family member, the target's own CEO) and pressures the target into transferring money, handing over credentials, or authorizing sensitive actions. The technique is as old as the telephone. What changed in 2024–2026 is that AI voice cloning made the "trusted party" voice convincing without any prior relationship with the caller, and made attacks far cheaper to run at scale.
How a vishing attack works
The anatomy of a modern vishing call:
- Reconnaissance. The attacker collects public audio of the target (if impersonating a named person) or picks a commonly-trusted role (bank fraud dept, IT help desk).
- Spoofing. The caller ID is spoofed to appear legitimate — the bank's real number, the boss's real mobile, a police department.
- Urgency framing. The script establishes time pressure: "Your account is being drained right now," "The CEO needs this transfer before close of business," "You're being investigated and need to act today."
- Extraction. Either a one-time transfer (wire, gift cards, crypto) or credential harvest (OTP codes, passwords, security questions).
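The stages above map directly onto red flags a screening system can score. The sketch below turns them into a simple triage heuristic; every field name, weight, and the 0.6 threshold are illustrative assumptions, not a real detection API.

```python
# Illustrative vishing-call triage heuristic. All signal names, weights,
# and the 0.6 escalation threshold are made-up examples for this sketch.
from dataclasses import dataclass

@dataclass
class CallSignals:
    caller_id_mismatch: bool   # caller ID fails carrier verification (spoofing stage)
    urgency_language: bool     # "right now", "before close of business" (urgency framing)
    payment_request: bool      # wire, gift cards, crypto (extraction stage)
    credential_request: bool   # OTP codes, passwords, security questions (extraction stage)
    unexpected_call: bool      # no prior ticket, appointment, or callback

def vishing_risk(s: CallSignals) -> float:
    """Weighted sum of red flags, normalized to the 0..1 range."""
    weights = {
        "caller_id_mismatch": 0.25,
        "urgency_language": 0.20,
        "payment_request": 0.25,
        "credential_request": 0.25,
        "unexpected_call": 0.05,
    }
    return sum(w for name, w in weights.items() if getattr(s, name))

call = CallSignals(caller_id_mismatch=True, urgency_language=True,
                   payment_request=True, credential_request=False,
                   unexpected_call=True)
score = vishing_risk(call)
print(f"risk={score:.2f}", "ESCALATE" if score >= 0.6 else "monitor")
```

Real deployments would derive such weights from labeled call data rather than hand-tuning; the point here is only that each attack stage leaves an observable signal.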
AI voice cloning removes the biggest bottleneck — getting the victim to believe the voice. A 30-second reference clip pulled from LinkedIn, a podcast, or a YouTube interview is enough to build a convincing voice clone.
How widespread is it?
- The FBI's IC3 attributed $12.5 billion in reported losses to internet crime in 2023, with vishing-adjacent categories among the fastest-growing.
- Sumsub measured a 245% YoY increase in deepfake-enabled fraud attempts in 2024.
- The Arup $25.6M deepfake video call is the most visible case, but security teams at many large organizations report smaller attempts following the same playbook.
- Mastercard found 46% of businesses have been targeted by a deepfake attack at least once.
Classic vishing vs. AI-vishing
| | Classic vishing | AI-vishing (2024+) |
|---|---|---|
| Voice | Human actor with accent control | AI-cloned target-specific voice |
| Cost per attack | Hours of human labor | Under $10 per attempt |
| Scale | One victim per caller at a time | Thousands of concurrent calls |
| Credibility | Relies on urgency + spoofed number | Victim recognizes the voice itself |
| Defense | Callback verification | Callback + voice-clone detection |
How to detect vishing
Three layers of defense work together:
- Behavioral. Any unexpected call requesting urgent money movement, credentials, or out-of-band actions is suspicious — regardless of who it sounds like.
- Procedural. Call back on a known-good number. This defeats caller-ID spoofing and any voice clone the attacker produces. See the Ferrari case for how well this works in practice — the executive asked a question only the real CEO would know the answer to, and the attack collapsed.
- Technical. For organizations at scale, real-time audio deepfake detection integrated into the contact-center call flow catches AI-cloned callers before they reach a human agent. The banking playbook covers the integration pattern.
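The three layers combine into decision logic simple enough to sketch. In the example below, `KNOWN_GOOD_NUMBERS`, the function names, and the 0.8 detection threshold are all assumptions for illustration; the deepfake score stands in for whatever real-time detection service an organization integrates.

```python
# Sketch of layered call screening. The directory, function names, and
# the 0.8 threshold are illustrative assumptions, not a real product API.

KNOWN_GOOD_NUMBERS = {"bank": "+1-800-555-0100", "it_helpdesk": "+1-800-555-0199"}

def screen_request(claimed_role: str, requests_money_or_creds: bool,
                   deepfake_score: float) -> str:
    """Return the action to take for an inbound call."""
    # Behavioral layer: an unexpected request for money or credentials is
    # never fulfilled on the inbound call itself, whoever it sounds like.
    if requests_money_or_creds:
        number = KNOWN_GOOD_NUMBERS.get(claimed_role)
        if number is None:
            return "refuse: unknown role, no verified callback number"
        # Procedural layer: hang up and call back on the known-good number.
        # This defeats caller-ID spoofing and any voice clone.
        return f"hang up, call back on {number}"
    # Technical layer: flag calls whose audio scores as likely AI-generated.
    if deepfake_score >= 0.8:
        return "route to fraud team: likely synthetic voice"
    return "proceed normally"

print(screen_request("bank", True, 0.1))
print(screen_request("vendor", False, 0.93))
```

Note the ordering: the procedural callback rule fires regardless of the detection score, because detection can miss while a callback on a known-good number cannot be spoofed.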
Related terms
- Voice phishing — longer-form synonym used in some regulatory contexts
- Smishing vs vishing — SMS-based version and how they compare
- Voice cloning — the technology that powers modern vishing
- Deepfake — umbrella term for AI-manipulated media
- Presentation attack — the broader biometric-attack class
Practical resources
- Free audio deepfake detector — upload a suspicious voicemail or recorded call
- How to survive a deepfake scam call
- Banking deepfake detection playbook