
Vishing

Also: voice phishing · vishing attack · voice scam

Vishing (voice phishing) is a social-engineering attack conducted over a phone call or voice channel. Attackers impersonate trusted parties — banks, IT support, family members, executives — to pressure the target into transferring money, handing over credentials, or authorizing actions. In 2026, AI voice cloning makes vishing attacks far more convincing and far cheaper to run at scale.

Vishing — short for voice phishing — is the phone-call version of phishing. An attacker calls the target, poses as a trusted party (bank, IRS, Microsoft support, the target's own CEO), and uses social engineering to extract money, credentials, or authorization for sensitive actions. It's been around since phones existed. What changed in 2024–2026 is that AI voice cloning made the "trusted party" voice genuinely convincing: the caller can sound like someone the target knows, even though the attacker has never spoken with them before.

How a vishing attack works

The anatomy of a modern vishing call:

  1. Reconnaissance. The attacker collects public audio of the target (if impersonating a named person) or picks a commonly trusted role (bank fraud department, IT help desk).
  2. Spoofing. The caller ID is spoofed to appear legitimate — the bank's real number, the boss's real mobile, a police department.
  3. Urgency framing. The script establishes time pressure: "Your account is being drained right now," "The CEO needs this transfer before close of business," "You're being investigated and need to act today."
  4. Extraction. Either a one-time transfer (wire, gift cards, crypto) or credential harvest (OTP codes, passwords, security questions).

AI voice cloning removes the biggest bottleneck — getting the victim to believe the voice. A 30-second reference clip pulled from LinkedIn, a podcast, or a YouTube interview is enough to build a convincing voice clone.

How widespread is it?

  • The FBI's IC3 attributed $12.5 billion in reported 2024 losses to internet crime, with vishing-adjacent categories growing faster than any other.
  • Sumsub measured a 245% YoY increase in deepfake-enabled fraud attempts in 2024.
  • The Arup $25.6M deepfake video call is the most visible case, but it is an outlier only in size: organizations targeted at this scale routinely see smaller, similar attempts.
  • Mastercard found 46% of businesses have been targeted by a deepfake attack at least once.

Classic vishing vs. AI-vishing

|  | Classic vishing | AI-vishing (2024+) |
|---|---|---|
| Voice | Human actor with accent control | AI-cloned, target-specific voice |
| Cost per attack | Hours of human labor | Under $10 per attempt |
| Scale | One victim per caller at a time | Thousands of concurrent calls |
| Convincingness | Relies on urgency + spoofed number | Victim recognizes the voice |
| Defense | Callback verification | Callback + voice-clone detection |

How to detect vishing

Three layers of defense work together:

  1. Behavioral. Any unexpected call requesting urgent money movement, credentials, or out-of-band actions is suspicious — regardless of who it sounds like.
  2. Procedural. Call back on a known-good number. This defeats caller-ID spoofing and any voice clone the attacker produces. See the Ferrari case for how well this works in practice — the executive asked a question only the real CEO would know the answer to, and the attack collapsed.
  3. Technical. For organizations at scale, real-time audio deepfake detection integrated into the contact-center call flow catches AI-cloned callers before they reach a human agent; a minimal sketch of that check follows this list. The banking playbook covers the integration pattern.
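
To make the technical layer concrete, here is a minimal sketch of how a contact-center call flow might screen inbound audio with a real-time detection service before routing the call to an agent. The endpoint URL, the response field name, and the threshold below are hypothetical placeholders, not any particular vendor's API.

```python
# Minimal sketch: screen inbound call audio with a real-time voice-clone
# detector before the call reaches a human agent. The endpoint URL, the
# "synthetic_probability" response field, and the 0.85 threshold are
# hypothetical placeholders, not any specific vendor's API.
from typing import Iterable

import requests

DETECTOR_URL = "https://detector.example.com/v1/score"  # hypothetical endpoint
CLONE_THRESHOLD = 0.85  # tune against your false-positive budget


def is_suspected_clone(audio_chunk: bytes, call_id: str) -> bool:
    """Score one short audio chunk; True means the detector flags it as likely synthetic."""
    resp = requests.post(
        DETECTOR_URL,
        files={"audio": ("chunk.wav", audio_chunk, "audio/wav")},
        data={"call_id": call_id},
        timeout=2,  # keep the check fast enough to run mid-call
    )
    resp.raise_for_status()
    return resp.json()["synthetic_probability"] >= CLONE_THRESHOLD


def route_call(call_id: str, audio_chunks: Iterable[bytes]) -> str:
    """Stream short (3-5 second) chunks through the detector; escalate on the first hit."""
    for chunk in audio_chunks:
        if is_suspected_clone(chunk, call_id):
            return "escalate_to_fraud_review"  # flag rather than auto-block to limit false positives
    return "route_to_agent"
```

Note that a clean detector score does not make an urgent, unexpected payment request safe; the behavioral and procedural layers still apply on every call.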

Practical resources