Face Reenactment
A deepfake technique where a driver video (usually the attacker's own face) controls the expressions, head pose, and gaze of a target identity in a generated video — enabling live impersonation.
Face reenactment is the technique that makes real-time deepfake video calls possible. Where a face swap pastes the target's identity onto existing footage, reenactment animates the target's face directly: the attacker performs, and the target's face follows.
How it works
- An identity encoder extracts the target's face from reference imagery (a single photo can be enough).
- A driving signal — the attacker's own face, captured on a webcam — provides expressions, head pose, and gaze in real time.
- A generator renders the target face performing the driver's movements. Real-time systems typically use warping- or GAN-based models; diffusion-based generators can produce higher fidelity but are generally too slow for live use.
The result: on a Zoom or Teams call, the attacker's camera shows the target's face, synchronized to the attacker's own speech and expressions.
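The three-stage pipeline above can be sketched as a loop: the identity is encoded once from a reference photo, while motion is extracted fresh from every driver frame. This is a minimal illustrative sketch with stub functions standing in for the real neural networks; all names, shapes, and the toy math inside each stub are assumptions, not any particular model's API.

```python
import numpy as np

def encode_identity(reference_image: np.ndarray) -> np.ndarray:
    """Stub identity encoder: reduce the reference image to a fixed-size
    embedding (a real system would use a face-recognition-style network)."""
    return reference_image.reshape(-1)[:128].astype(float) / 255.0

def extract_motion(driver_frame: np.ndarray) -> np.ndarray:
    """Stub motion extractor: stand-in for the landmarks or pose/expression
    coefficients pulled from the attacker's webcam frame."""
    return driver_frame.mean(axis=(0, 1)) / 255.0  # one value per channel

def generate(identity: np.ndarray, motion: np.ndarray) -> np.ndarray:
    """Stub generator: a real model renders the target identity performing
    the driver's motion; here the two signals are just combined."""
    return np.outer(identity[:64], motion)

def reenact(reference_image, driver_frames):
    """Reenactment loop: encode identity once, extract motion per frame."""
    identity = encode_identity(reference_image)  # a single photo suffices
    return [generate(identity, extract_motion(f)) for f in driver_frames]

rng = np.random.default_rng(0)
reference = rng.integers(0, 256, (64, 64, 3))              # target photo
driver = [rng.integers(0, 256, (64, 64, 3)) for _ in range(3)]  # webcam feed
frames = reenact(reference, driver)
print(len(frames), frames[0].shape)  # one rendered frame per driver frame
```

The key structural point the sketch captures is the asymmetry: identity is a one-time input, motion is a streaming input, which is why a single reference photo is enough to sustain a live call.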
The attack surface it opens
Real-time video call impersonation is the marquee threat. A cloned-voice TTS pipeline can pair with face reenactment to impersonate a known person on a video call — as in the Arup case, where a finance employee wired $25.6M after a video call with deepfaked colleagues, and in several similar incidents since.
Detection signals
Reenactment tends to leave specific tells:
- Pose limit. Most models degrade when the driver turns the head past ~45°.
- Expression repertoire gaps. Extreme expressions (full laughs, scrunched concentration) can break the generator.
- Lighting carry-over. The generated face may not update its lighting as the driver moves through different room conditions.
- Per-frame identity drift. Small, high-frequency shifts in the rendered face identity frame-to-frame.
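The last signal, per-frame identity drift, lends itself to a simple check: compare face embeddings from consecutive frames and flag clips whose frame-to-frame similarity jitters more than genuine video does. The sketch below uses synthetic embeddings to keep it self-contained; a real detector would obtain them from a face-recognition model, and the threshold and noise scales are illustrative assumptions.

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def identity_drift_score(embeddings: np.ndarray) -> float:
    """Mean (1 - cosine similarity) between consecutive frame embeddings.
    Genuine video of one person keeps this near zero; reenactment output
    tends to show small high-frequency identity shifts."""
    sims = [cosine(embeddings[i], embeddings[i + 1])
            for i in range(len(embeddings) - 1)]
    return float(np.mean([1.0 - s for s in sims]))

rng = np.random.default_rng(42)
base = rng.normal(size=128)  # the "true" identity embedding

# Genuine clip: same identity plus tiny sensor noise per frame.
real = np.stack([base + 0.01 * rng.normal(size=128) for _ in range(30)])
# Reenacted clip: larger per-frame perturbations of the identity vector.
fake = np.stack([base + 0.3 * rng.normal(size=128) for _ in range(30)])

print(identity_drift_score(real) < identity_drift_score(fake))  # True
```

In practice this signal is noisy on its own (compression and motion blur also perturb embeddings), which is why it is typically combined with the pose and lighting tells above rather than used as a standalone detector.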
Liveness challenges that require unusual movements (turning to full profile, covering part of the face, showing a specific object) exploit exactly these failure modes, and are effective as a complement to automated deepfake detection.
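A challenge-based liveness check can be sketched as: pick an unpredictable challenge, then verify the measured response against a known model weakness (for instance, a full-profile turn pushes well past the ~45° pose limit noted above). Everything here is a hypothetical sketch; the challenge names, thresholds, and the measurement callback are assumptions, with a real system running pose and occlusion estimation on the live video feed.

```python
import random

# Hypothetical challenge set; each entry maps a challenge name to a check
# over measurements taken from the caller's video. Thresholds are assumed.
CHALLENGES = {
    "turn_profile": lambda m: m.get("max_yaw_deg", 0) >= 80,   # full profile
    "cover_face":   lambda m: m.get("occlusion_frac", 0) >= 0.3,
    "show_object":  lambda m: m.get("object_shown", False),
}

def run_liveness_check(measure, rng=random):
    """Pick a random challenge (unpredictability is the point: the attacker
    cannot pre-render a response), then verify the measured reaction."""
    name = rng.choice(sorted(CHALLENGES))
    return name, bool(CHALLENGES[name](measure(name)))

# Simulated caller whose video would pass a profile turn, nothing else.
responses = {"turn_profile": {"max_yaw_deg": 85}}
name, passed = run_liveness_check(lambda n: responses.get(n, {}),
                                  rng=random.Random(0))
print(name, passed)
```

The design choice worth noting is randomization: a fixed, known challenge can be pre-rendered or trained for, while a challenge drawn at call time forces the generator to handle its weakest inputs live.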