detect·deepfakes by Resemble AI
Glossary

Liveness Detection

Also: liveness check · presentation attack detection · PAD

A class of biometric verification techniques that confirm the input comes from a live, present human — distinguishing a real face or voice from a photo, recording, mask, or deepfake presented to the sensor.

Liveness detection answers a specific question: "is the person in front of the sensor actually a live human being present at this moment?" It's the defense against presentation attacks — where an attacker shows a photo, plays a recording, wears a mask, or pipes in a deepfake to defeat biometric authentication.

Deepfakes didn't invent the presentation attack (a printed photo held up to a face recognition camera works in a pinch), but they've made scalable, high-quality presentation attacks cheap. Liveness detection is the layer that catches them.

Approaches

Liveness detection falls into three broad families:

Active liveness — the sensor asks the user to do something: blink, smile, turn their head, say a random phrase. The system verifies the specific action was performed in real time. Strong, but adds UX friction.

Passive liveness — the sensor analyzes the input for signals of a real human present: depth cues a flat photo lacks, micro-movements a still image can't reproduce, camera-specific noise patterns. The user does nothing special. Lower friction, but generally weaker than active against sophisticated attacks.

Hardware liveness — uses sensors that a simple presentation can't fake: structured-light depth (iPhone Face ID), time-of-flight, multi-spectral cameras, or physiological sensors. Strongest; hardware cost limits deployment.
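The active family above can be sketched as a challenge-response loop. This is a minimal illustration, not any vendor's implementation: the challenge names, timeout value, and the idea that a perception model reports an `observed_action` string are all assumptions made for the example.

```python
import random
import time

# Hypothetical set of actions an active-liveness system might prompt for.
CHALLENGES = ["blink", "smile", "turn_head_left", "say_phrase"]

def issue_challenge():
    """Pick a random action and record when it was issued."""
    return random.choice(CHALLENGES), time.monotonic()

def verify_response(challenge, issued_at, observed_action, timeout_s=5.0):
    """Accept only if the observed action matches the challenge and
    arrived within the timeout window. The randomness plus the deadline
    is what defeats a pre-recorded replay: the attacker can't know the
    challenge in advance, and can't take minutes to fabricate it."""
    in_time = (time.monotonic() - issued_at) <= timeout_s
    return in_time and observed_action == challenge

challenge, t0 = issue_challenge()
accepted = verify_response(challenge, t0, observed_action=challenge)
```

The UX friction mentioned above lives in that timeout: tighten it and replays get harder, but real users on slow devices start failing too.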

Voice liveness

For voice authentication, liveness means distinguishing a speaker talking into a microphone right now from a voice clone or a replay of an earlier recording. Signals include:

  • Challenge-response phrases — the system prompts a random sentence; a pre-recorded clone can't produce it.
  • Room acoustics — real voices carry consistent room tone and microphone characteristics; synthesized voices often don't.
  • Breathing and articulatory micro-timing — harder to fake in real time than in an offline synthesis pipeline.

Where liveness sits in a stack

Liveness detection is one layer in a complete identity system. A realistic deployment:

  1. Biometric match (face or voice matches the enrolled identity).
  2. Liveness check (the biometric input is from a live human, not a presentation).
  3. Deepfake detection (the biometric input hasn't been synthesized, even if it looks live).
  4. Behavioral signals (device, location, typing pattern, recency of interaction).

Each layer addresses different attack types. Deepfake detection catches attacks that defeat liveness (e.g., a real-time face-swap feed into a video call). Liveness catches attacks that defeat biometric match by presenting a recording.
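The four-layer stack above amounts to an ordered gauntlet: every layer must pass, and a failure at any layer rejects the attempt. The sketch below uses stub lambdas with invented field names (`match_score`, `live`, `synthetic_score`, `known_device`) standing in for real models; the thresholds are arbitrary.

```python
def run_identity_checks(sample, checks):
    """Run layered checks in order; any failure rejects the attempt.
    `checks` is an ordered list of (name, fn) where fn(sample) -> bool."""
    for name, check in checks:
        if not check(sample):
            return False, name  # rejected at this layer
    return True, None

# Stub layers standing in for real models (all thresholds illustrative).
checks = [
    ("biometric_match",    lambda s: s.get("match_score", 0) > 0.8),
    ("liveness",           lambda s: s.get("live", False)),
    ("deepfake_detection", lambda s: s.get("synthetic_score", 1) < 0.5),
    ("behavioral",         lambda s: s.get("known_device", False)),
]
```

Returning the name of the failing layer matters operationally: a spike in `deepfake_detection` rejections among samples that passed `liveness` is exactly the real-time face-swap scenario described above.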

Detection implications

No single technique is sufficient against modern attacks. Best-in-class identity platforms combine multiple liveness signals with deepfake detection as a parallel check.
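One common way to combine multiple signals is weighted score fusion: each detector emits a confidence in [0, 1] and a weighted average is compared against a decision threshold. The signal names, weights, and threshold below are assumptions for illustration only.

```python
def fuse_scores(scores, weights, threshold=0.7):
    """Weighted fusion of per-signal confidence scores in [0, 1].
    Combining signals means defeating any single detector is not
    enough to pass; the attack has to beat the weighted consensus."""
    total_w = sum(weights.values())
    fused = sum(scores[k] * weights[k] for k in weights) / total_w
    return fused >= threshold, fused

decision, fused = fuse_scores(
    scores={"passive_liveness": 0.9, "challenge": 0.8, "deepfake": 0.6},
    weights={"passive_liveness": 1, "challenge": 2, "deepfake": 2},
)
```

Weighting lets operators lean on the signals that are hardest to spoof in their threat model without discarding the weaker ones entirely.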

See also