Deepfake Detection

This guide demonstrates how to perform Deepfake Detection with Phonexia Speech Platform 4.

The Deepfake Detection technology helps you determine whether audio data is genuine by identifying artificial voices within media files. We encourage you to read the high-level overview for Deepfake Detection to learn more about its features and capabilities.

Model version

Note that all example results were obtained with the specific model version listed below and may change in future releases.

Deepfake Detection: beta:2.3.0

In this guide, we'll use the following media files. You can download them all together in the audio_files.zip archive.

filename	LLR score
Bridget.wav	-3.5329
Graham.wav	-0.1516
Hans.wav	-1.5360

At the end of this guide, you'll find the full Python code example that combines all the steps, which are first discussed separately. This guide should give you a comprehensive understanding of how to integrate Deepfake Detection into your own projects.

Prerequisites

Follow the prerequisites for setting up the Virtual Appliance and the Python environment as described in the Task lifecycle code examples.

Run Deepfake Detection

To run Deepfake Detection for a single media file, start by sending a POST request to the /api/technology/deepfake-detection endpoint. The only mandatory parameter is file.

In Python, you can do this as follows:

import requests

VIRTUAL_APPLIANCE_ADDRESS = "http://<virtual-appliance-address>:8000"  # Replace with your address
MEDIA_FILE_BASED_ENDPOINT_URL = f"{VIRTUAL_APPLIANCE_ADDRESS}/api/technology/deepfake-detection"

media_file = "Bridget.wav"

with open(media_file, mode="rb") as file:
    files = {"file": file}
    start_task_response = requests.post(
        url=MEDIA_FILE_BASED_ENDPOINT_URL,
        files=files,
    )
print(start_task_response.status_code)  # Should print '202'

If the task has been successfully accepted, the 202 code will be returned together with a unique task ID in the response body. The task isn't processed immediately, but only scheduled for processing. You can check the current task status by polling for the result.
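
The full example at the end of this guide reads the polling URL from the Location header of the 202 response. The following is a minimal sketch of that step in isolation. The helper name get_polling_url is hypothetical and not part of the platform API; it only assumes a response object with status_code and headers attributes, such as the one returned by requests.post.

```python
def get_polling_url(start_task_response):
    """Return the URL to poll, taken from the Location header of a 202 response.

    get_polling_url is a hypothetical helper name used for illustration.
    """
    if start_task_response.status_code != 202:
        raise RuntimeError(
            f"Task was not accepted (HTTP {start_task_response.status_code})."
        )
    return start_task_response.headers["Location"]
```

With the request above, get_polling_url(start_task_response) would return the address to query in the polling step.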

Polling

To obtain the final result, periodically query the task status until the task state changes to done, failed or rejected. The general polling procedure is described in detail in the Task lifecycle code examples.
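
A polling loop without an upper time limit can wait forever if a task never reaches a final state. The sketch below adds a timeout to the procedure described above. The function name poll_until_finished, the fetch_task callable, and the default interval and timeout values are illustrative assumptions, not part of the platform.

```python
import time


def poll_until_finished(fetch_task, polling_interval=5.0, timeout=300.0):
    """Call fetch_task() repeatedly until the task reaches a final state.

    fetch_task is a hypothetical callable that returns the parsed JSON body
    of the task, for example: lambda: requests.get(polling_url).json()
    """
    final_states = {"done", "failed", "rejected"}
    deadline = time.monotonic() + timeout
    while True:
        task_json = fetch_task()
        if task_json["task"]["state"] in final_states:
            return task_json
        if time.monotonic() >= deadline:
            raise TimeoutError("Task did not finish within the timeout.")
        time.sleep(polling_interval)
```

Passing the fetch step as a callable keeps the loop independent of the HTTP library, which also makes it easy to exercise without a running Virtual Appliance.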

Result for Deepfake Detection

The result field of the task contains a channels list with an independent result for each channel of the input media file.

For our sample data, the task result should look as follows:

{
  "task": {
    "task_id": "330f9d36-04e2-4b78-b4da-79bdd61aa7db",
    "state": "done"
  },
  "result": {
    "channels": [
      {
        "channel_number": 0,
        "score": -3.5329179763793945
      }
    ]
  }
}

The score represents a log-likelihood ratio (LLR), a real number ranging from -infinity to +infinity. The decision threshold is 0. Suspicious files should have a score higher than 0, while genuine files should have a score lower than 0.
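
As an illustration of the decision rule, the sketch below labels each channel of a task result by comparing its score with the threshold. The function name label_channels and the label strings are hypothetical. The text does not say how a score exactly equal to the threshold should be treated; this sketch assumes it counts as genuine.

```python
def label_channels(task_json, threshold=0.0):
    """Map each channel number to 'suspicious' or 'genuine' based on its LLR score."""
    labels = {}
    for channel in task_json["result"]["channels"]:
        # Assumption: a score exactly at the threshold is treated as genuine.
        is_suspicious = channel["score"] > threshold
        labels[channel["channel_number"]] = "suspicious" if is_suspicious else "genuine"
    return labels


# The sample task result for Bridget.wav shown above.
sample = {
    "task": {"task_id": "330f9d36-04e2-4b78-b4da-79bdd61aa7db", "state": "done"},
    "result": {"channels": [{"channel_number": 0, "score": -3.5329179763793945}]},
}
print(label_channels(sample))  # {0: 'genuine'}
```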

The technology has a typical score range established with an evaluation dataset. While rare, scores may occasionally fall outside the typical range. Typical score ranges may change in future versions of the model. Consult the technology documentation for details.

Score evaluation

The optimal decision threshold may differ from 0 depending on your application. To achieve the desired trade-off between false positives and false negatives, you may need to adjust the threshold based on your specific dataset and requirements.
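
One way to explore this trade-off is to score a set of files whose ground truth you already know and count the errors at several candidate thresholds. The sketch below assumes such a labeled set is available. The function name error_rates is hypothetical, and the scores in the example are made up for illustration; they are not output of the model.

```python
def error_rates(scored_files, threshold):
    """Return (false_positive_rate, false_negative_rate) at the given threshold.

    scored_files is a list of (score, is_deepfake) pairs from your own labeled
    data and must contain at least one genuine and one deepfake file.
    """
    genuine = [score for score, is_deepfake in scored_files if not is_deepfake]
    deepfake = [score for score, is_deepfake in scored_files if is_deepfake]
    # A genuine file scoring above the threshold is flagged by mistake.
    false_positives = sum(score > threshold for score in genuine)
    # A deepfake scoring at or below the threshold is missed.
    false_negatives = sum(score <= threshold for score in deepfake)
    return false_positives / len(genuine), false_negatives / len(deepfake)


# Made-up scores for illustration only; they are not results of the model.
labeled = [(-3.0, False), (-1.0, False), (0.5, False), (-0.5, True), (1.0, True), (2.5, True)]
for threshold in (-1.0, 0.0, 1.0):
    fpr, fnr = error_rates(labeled, threshold)
    print(f"threshold={threshold:+.1f}  FPR={fpr:.2f}  FNR={fnr:.2f}")
```

Raising the threshold flags fewer genuine files but misses more deepfakes, and lowering it does the opposite, so the value to pick depends on which error is more costly in your application.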

Full Python code

Here is the full example of how to run the Deepfake Detection technology. The code is slightly adjusted and wrapped into functions for better readability. Refer to the Task lifecycle code examples for a generic code template, applicable to all technologies.

import json
import requests
import time

VIRTUAL_APPLIANCE_ADDRESS = "http://<virtual-appliance-address>:8000"  # Replace with your address

MEDIA_FILE_BASED_ENDPOINT_URL = f"{VIRTUAL_APPLIANCE_ADDRESS}/api/technology/deepfake-detection"


def poll_result(polling_url, polling_interval=5):
    """Poll the task endpoint until processing completes."""
    while True:
        polling_task_response = requests.get(polling_url)
        polling_task_response.raise_for_status()
        polling_task_response_json = polling_task_response.json()
        task_state = polling_task_response_json["task"]["state"]
        if task_state in {"done", "failed", "rejected"}:
            break
        time.sleep(polling_interval)
    return polling_task_response


def run_media_based_task(media_file, params=None, config=None):
    """Create a media-based task and wait for results."""
    if params is None:
        params = {}
    if config is None:
        config = {}

    with open(media_file, mode="rb") as file:
        files = {"file": file}
        start_task_response = requests.post(
            url=MEDIA_FILE_BASED_ENDPOINT_URL,
            files=files,
            params=params,
            data={"config": json.dumps(config)},
        )
        start_task_response.raise_for_status()
    polling_url = start_task_response.headers["Location"]
    task_result = poll_result(polling_url)
    return task_result.json()


# Run Deepfake Detection
media_files = ["Bridget.wav", "Graham.wav", "Hans.wav"]

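for media_file in media_files:
    print(f"Running Deepfake Detection for file {media_file}.")
    media_file_based_task = run_media_based_task(media_file)
    # Polling also stops on "failed" or "rejected", so check the state before reading the result.
    task_state = media_file_based_task["task"]["state"]
    if task_state != "done":
        print(f"Task for {media_file} ended in state '{task_state}'.")
        continue
    media_file_based_task_result = media_file_based_task["result"]
    print(json.dumps(media_file_based_task_result, indent=2))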
