What Is a Deepfake? A Clear Definition
Deepfakes are AI-generated or AI-manipulated media — audio, images, or video — made to look or sound like something that didn't happen. A plain-language primer with examples.
A deepfake is AI-generated or AI-manipulated media — audio, image, or video — that portrays an event that didn't happen, words someone never said, or a person who doesn't exist. The word combines deep learning (the kind of AI used to make them) and fake.
The term was coined in late 2017 by a Reddit user who used open-source machine-learning code to swap faces in videos. The technology has since become a multi-billion-dollar concern spanning fraud, disinformation, non-consensual imagery, and national security.
The three modalities
Deepfakes come in three main flavors, each with its own detection approach:
- Audio deepfakes — synthetic voices made by voice cloning, text-to-speech, or voice-conversion systems. The voice sounds like a real person but was generated, often from just 30 seconds of reference audio. Used in CEO-fraud calls, fake voicemails, and consent attacks.
- Image deepfakes — photos generated from text prompts by diffusion models (Stable Diffusion, Midjourney, DALL·E), or real photos modified via face swaps and inpainting. Used for non-consensual intimate imagery, disinformation, and synthetic identity documents.
- Video deepfakes — the original meaning of the term: video in which a face, a voice, or both have been swapped or synthesized. Four sub-types: face swap, lip-sync, reenactment, and fully synthetic video. See how to detect video deepfakes for details.
What isn't a deepfake
Scope matters. The word is sometimes stretched to cover a few things that probably shouldn't count:
- Filters and beauty apps. Rule-based image transformations aren't usually called deepfakes.
- CGI and visual effects. Movie VFX has used composited and synthetic faces for decades. The distinguishing feature of a deepfake is AI-driven automation at low cost.
- AI-generated art presented as art. An AI-generated portrait labeled as such for a magazine cover is synthetic media, but it's not a deepfake — there's no deceptive intent.
The common thread in "deepfake" is intent to deceive. Synthetic media without deception is just content.
Why they're a problem
Three forces converged:
- Quality went up. Face swaps in 2017 looked obviously wrong. In 2026 they fool most viewers most of the time.
- Cost went down. Making a convincing voice clone now takes ~30 seconds of reference audio and a few dollars of compute. Video deepfakes cost more, but the price is falling fast.
- Scale went up. Diffusion and voice-cloning pipelines run at scale, so attackers can probe thousands of targets concurrently.
This matters differently depending on the sector. Banking faces CEO-fraud calls; elections face narrative manipulation; insurance faces synthetic claim evidence. The threat model isn't universal.
What detection does
Detection models look at the statistical fingerprints that generation pipelines leave behind — frequency artifacts in images, phase inconsistencies in audio, temporal flickers in video — and estimate the probability that a piece of media came from a synthesis pipeline rather than a camera or microphone.
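As a toy illustration of the frequency-artifact idea, the sketch below measures how much of an image's spectral energy sits outside a central low-frequency band. Everything here is a simplification and an assumption for illustration: real detectors learn these patterns from labeled data, and the file name and the fixed band size are placeholders, not a working detector.

```python
# Minimal sketch, NOT a production detector: many diffusion and GAN pipelines
# leave characteristic energy patterns in the high-frequency bands of an
# image's 2D Fourier spectrum. This toy heuristic just measures the share of
# spectral energy outside a central low-frequency core.
import numpy as np
from PIL import Image

def high_frequency_energy_ratio(path: str, core_fraction: float = 0.25) -> float:
    """Fraction of spectral energy outside the central low-frequency band."""
    gray = np.asarray(Image.open(path).convert("L"), dtype=np.float64)
    # Shift the spectrum so the DC (zero-frequency) component sits at the center.
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(gray))) ** 2

    h, w = spectrum.shape
    ch, cw = int(h * core_fraction), int(w * core_fraction)
    core = spectrum[h // 2 - ch:h // 2 + ch, w // 2 - cw:w // 2 + cw]

    return 1.0 - core.sum() / spectrum.sum()

# "photo.jpg" is a placeholder path for illustration.
ratio = high_frequency_energy_ratio("photo.jpg")
print(f"high-frequency energy share: {ratio:.3f}")
```

A real pipeline would feed features like this, alongside many others, into a trained classifier rather than reading off a single number.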
No detector is 100% accurate. No detector can prove something is "real" in the strict sense — only that its fingerprint doesn't match known synthesis methods. The right use is as a probabilistic signal in a workflow that includes provenance, source context, and human review.
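Here is a minimal sketch of what "a probabilistic signal in a workflow" can mean in practice. The thresholds, and the choice to let a valid provenance manifest (e.g. C2PA) short-circuit a noisy score, are illustrative assumptions, not a prescribed policy.

```python
# Sketch of score-plus-context triage. Thresholds and inputs are hypothetical.
from dataclasses import dataclass

@dataclass
class Verdict:
    label: str      # "likely_synthetic" | "likely_authentic" | "needs_review"
    reasons: list

def triage(detector_score: float, has_valid_provenance: bool,
           source_trusted: bool) -> Verdict:
    reasons = [f"detector score {detector_score:.2f}"]

    # Cryptographic provenance (e.g. a C2PA manifest) outweighs a noisy score.
    if has_valid_provenance:
        reasons.append("valid provenance manifest")
        return Verdict("likely_authentic", reasons)

    if detector_score >= 0.90:
        return Verdict("likely_synthetic", reasons)
    if detector_score <= 0.10 and source_trusted:
        reasons.append("trusted source")
        return Verdict("likely_authentic", reasons)

    # Everything in between is exactly where human review belongs.
    return Verdict("needs_review", reasons)

print(triage(0.62, has_valid_provenance=False, source_trusted=True).label)
# -> needs_review
```

The design point: the detector score alone never produces a final verdict; it only moves a file between the confident buckets and the send-to-a-human bucket.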
Try detection on a real file: audio, image, or video. Free, no signup.
Or build detection into your product directly. 50 free scans a month via the Resemble AI API.
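For orientation, an API integration is an ordinary authenticated HTTP upload that returns a JSON score. The endpoint URL, header, field names, and response shape below are hypothetical placeholders, not the documented Resemble AI contract; consult the official API reference for the real one.

```python
# Illustrative only: URL, headers, and JSON fields are hypothetical
# placeholders. Check the official API docs before integrating.
import requests

API_KEY = "YOUR_API_KEY"  # issued when you sign up

def scan_file(path: str) -> dict:
    with open(path, "rb") as f:
        resp = requests.post(
            "https://api.example.com/v1/detect",   # placeholder endpoint
            headers={"Authorization": f"Bearer {API_KEY}"},
            files={"media": f},
        )
    resp.raise_for_status()
    return resp.json()  # e.g. {"score": 0.97, "label": "synthetic"}

print(scan_file("suspicious_voicemail.wav"))  # placeholder file name
```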