detect·deepfakes by Resemble AI
Deepfake case study · Multi-modal

The $25.6M Arup Deepfake Fraud (Hong Kong, 2024)

A finance employee at engineering firm Arup wired $25.6M after a multi-person video call where every participant except them was a deepfake. The case that made deepfake video-call fraud mainstream.

Incident date: Feb 2024
Target: Arup (global engineering firm)
Outcome: $25.6M lost across 15 transactions
Updated Apr 16, 2026 · 2 min read

In late January 2024, a finance employee at Arup's Hong Kong office received a message from someone they believed to be the CFO. The message asked them to join a video call to discuss a confidential transaction. On the call were several executives the employee recognized — by face and by voice. Over the following days, the employee executed 15 separate transfers totaling HK$200 million (~US$25.6M) to bank accounts the call participants provided.

Every participant on the video call except the employee was a deepfake.

What happened

The attack combined at least three synthesis techniques:

  • Face reenactment driving the CFO's likeness on the video call — likely pre-rendered rather than real-time, given the interaction style participants described.
  • Voice cloning for each executive speaking. The attackers had collected enough public audio of each target (earnings calls, interviews, conferences) to train convincing clones.
  • Coordinated script and social engineering. The "confidential" framing discouraged the employee from verifying through other channels.

The attackers had done their reconnaissance — they knew whom to impersonate, which bank accounts to direct transfers to, and the firm's internal communication norms well enough to pass.

Where detection would have caught it

A video deepfake detector running on the meeting recording would have flagged both the video and audio tracks. Reenactment-driven face video carries per-frame identity drift that detection models pick up, and cloned voices have vocoder signatures distinct from the targets' real voices.

But detection at the call itself is harder. Real-time deepfake detection for live video calls is an emerging product category; most tools today operate on recorded files. Firms running deepfake-detection in 2026 mostly use it on downstream artifacts — recorded calls, forwarded evidence, submitted claim media — rather than at the meeting surface itself.
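The per-frame identity drift idea above can be sketched in a few lines: given a face embedding per frame and a known-good reference embedding of the claimed identity, score how far the track wanders from that identity. This is a minimal illustration with synthetic embeddings, not any particular detector's method; the function name, embedding dimension, and noise levels are all assumptions for the example.

```python
import numpy as np

def identity_drift_score(frame_embeddings: np.ndarray, reference: np.ndarray,
                         eps: float = 1e-9) -> float:
    """Mean cosine distance between each frame's face embedding and a
    known-good reference embedding of the claimed identity.
    Higher scores mean the track drifts further from that identity."""
    ref = reference / (np.linalg.norm(reference) + eps)
    frames = frame_embeddings / (
        np.linalg.norm(frame_embeddings, axis=1, keepdims=True) + eps)
    cos_sim = frames @ ref
    return float(np.mean(1.0 - cos_sim))

# Synthetic illustration only: a genuine track stays close to the
# reference identity; a reenacted track wanders per frame.
rng = np.random.default_rng(0)
reference = rng.normal(size=128)
genuine = reference + 0.05 * rng.normal(size=(300, 128))
reenacted = reference + 0.6 * rng.normal(size=(300, 128))

assert identity_drift_score(genuine, reference) < identity_drift_score(reenacted, reference)
```

In practice the embeddings would come from a face-recognition model run per frame, and the threshold separating genuine from synthetic tracks would be calibrated on labeled data; the comparison above only shows the shape of the signal.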

Organizational lessons

The Arup case reshaped enterprise fraud controls across sectors:

  • Out-of-band verification required for transfers above a firm-specific ceiling — regardless of who authorizes them on camera.
  • "Callback on a known-good number" policies became standard at most large banks and corporates.
  • Training programs reframed from "spot the deepfake" (which doesn't work — humans can't reliably do this) to "require verification regardless of conviction."
  • Video-call deepfake detection moved from an exotic purchase to a standard procurement item in financial-services CISO roadmaps.
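The first two controls above can be expressed as a simple policy gate: on-camera authorization alone never releases a transfer. This is a hypothetical sketch, assuming a firm-specific ceiling and a two-approver rule; the names, threshold, and field layout are illustrative, not any real firm's policy engine.

```python
from dataclasses import dataclass

APPROVAL_THRESHOLD = 100_000  # hypothetical firm-specific ceiling, USD
MIN_APPROVALS = 2             # independent approvers above the ceiling

@dataclass
class TransferRequest:
    amount: float
    requested_on_video_call: bool  # how the request arrived (never sufficient)
    callback_verified: bool        # confirmed via callback on a known-good number
    approvals: int                 # count of independent approvers

def transfer_allowed(req: TransferRequest) -> bool:
    """Procedural gate: callback verification is always required, and
    transfers above the ceiling also need multi-party approval,
    regardless of who authorized them on camera."""
    if not req.callback_verified:
        return False
    if req.amount >= APPROVAL_THRESHOLD:
        return req.approvals >= MIN_APPROVALS
    return True

# An Arup-style request: large, authorized on camera only.
arup_style = TransferRequest(amount=3_400_000, requested_on_video_call=True,
                             callback_verified=False, approvals=0)
assert not transfer_allowed(arup_style)
```

Note that `requested_on_video_call` never appears in the decision logic; that is the point of the control, since the request channel carries no trust.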

The underlying lesson

Detection is a layer. Procedural controls that don't depend on the employee being able to spot a deepfake — callback verification, multi-party approval, transfer holds — are what actually protect against this attack class. The deepfake is the vector; the control failure is the policy of trusting a voice over a video call.
