Referential deepfake detection
Phonexia referential-deepfake-detection is a tool for detecting deepfake audio by analyzing speech patterns and voice characteristics. It can perform deepfake detection by comparing reference and questioned audio samples. To learn more, visit the technology's home page.
Installation
- Docker image
- Docker compose
- Helm chart
Getting the referential deepfake detection Docker image
You can easily obtain the referential deepfake detection Docker image from Docker Hub. Just run:
docker pull phonexia/referential-deepfake-detection:latest
Running the image
Docker
You can start the microservice and list all the supported options by running:
docker run --rm -it phonexia/referential-deepfake-detection:latest --help
The output should look like this:
referential_deepfake 2.0.0
referential_deepfake [OPTIONS]
OPTIONS:
-h, --help Print this help message and exit
--version Display program version information and exit
-m, --model file REQUIRED (Env:PHX_MODEL_PATH)
Path to a model file.
-k, --license_key string REQUIRED (Env:PHX_LICENSE_KEY)
License key.
-a, --listening_address address [[::]] (Env:PHX_LISTENING_ADDRESS)
Address on which the server will be listening. Address '[::]'
also accepts IPv4 connections.
-p, --port number [8080] (Env:PHX_PORT)
Port on which the server will be listening.
-l, --log_level level:{error,warning,info,debug,trace} [info] (Env:PHX_LOG_LEVEL)
Logging level. Possible values: error, warning, info, debug,
trace.
--keepalive_time_s number:[0, max_int] [60] (Env:PHX_KEEPALIVE_TIME_S)
Time between 2 consecutive keep-alive messages, that are sent if
there is no activity from the client. If set to 0, the default
gRPC configuration (2hr) will be set (note, that this may get the
microservice into unresponsive state).
--keepalive_timeout_s number:[1, max int] [20] (Env:PHX_KEEPALIVE_TIMEOUT_S)
Time to wait for keep alive acknowledgement until the connection
is dropped by the server.
--device TEXT:{cpu,cuda} [cpu] (Env:PHX_DEVICE)
Compute device used for inference.
--num_threads_per_instance NUM (Env:PHX_NUM_THREADS_PER_INSTANCE)
Number of threads per instance (applies to CPU processing only).
Use N CPU threads in the microservice for each request. Number of
threads is automatically detected if set to 0.
--num_instances_per_device NUM:UINT > 0 (Env:PHX_NUM_INSTANCES_PER_DEVICE)
Number of instances per device (both CPU and GPU processing).
Microservice can process requests concurrently if value is >1.
--device_index ID (Env:PHX_DEVICE_INDEX)
Device identifier
The model and license_key options are required. To obtain the model and license, contact Phonexia.
You can specify the options either via command line arguments or via environment variables.
Run the container with the mandatory parameters:
docker run --rm -it -v /opt/phx/models:/models -p 8080:8080 phonexia/referential-deepfake-detection:latest --model /models/referential_deepfake_detection/xl5-1.1.0.model --license_key ${license-key}
The path /opt/phx/models is the location where models sent by Phonexia are stored on the host system.
This directory is mounted as a volume to /models inside the Docker container to make the models accessible to the microservice.
Replace the /opt/phx/models, xl5-1.1.0.model and license-key with the corresponding values.
With this command, the container will start, and the microservice will be listening on port 8080 on localhost.
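As an alternative to command line arguments, the same configuration can be passed through the environment variables listed in the help output. A sketch, using the same placeholders as the command above:

```shell
docker run --rm -it \
  -v /opt/phx/models:/models \
  -p 8080:8080 \
  -e PHX_MODEL_PATH=/models/referential_deepfake_detection/xl5-1.1.0.model \
  -e PHX_LICENSE_KEY=${license-key} \
  phonexia/referential-deepfake-detection:latest
```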
Docker compose
Create a docker-compose.yml file:
version: '3'
services:
referential-deepfake-detection:
image: phonexia/referential-deepfake-detection:latest
environment:
- PHX_MODEL_PATH=/models/referential_deepfake_detection/xl5-1.1.0.model
- PHX_LICENSE_KEY=<license-key>
ports:
- 8080:8080
volumes:
- ./models:/models/
Create a models folder in the same directory as the docker-compose.yml file and place a folder with a model in it. Replace <license-key> with your license key and xl5-1.1.0.model with the actual name of a model.
The model and license_key options are required. To obtain the model and license, contact Phonexia.
You can then start the microservice by running:
$ docker compose up
The optimal way to deploy at large scale is to use a container orchestration system. Take a look at our Helm chart deployment page for deployment using Kubernetes.
Microservice communication
gRPC API
For communication, our microservices use gRPC, which is a high-performance, open-source Remote
Procedure Call (RPC) framework that enables efficient communication between distributed systems using a variety of programming languages. We use an interface definition language to specify a common interface and contracts between components. This is primarily achieved by specifying methods with parameters and return types.
Take a look at our gRPC API documentation. The referential-deepfake-detection microservice defines a ReferentialDeepfakeDetection service with the following remote procedure:
Detect: Performs deepfake detection by comparing reference and questioned audio. This procedure accepts a streamed DetectRequest containing reference and questioned data, and returns a DetectResponse with a detection score indicating the likelihood of the questioned audio being a deepfake.
Connecting to microservice
There are multiple ways to communicate with our microservices.
- Generated library
- Python client
- grpcurl client
- GUI clients
Using generated library
The most common way to communicate with the microservices is via a generated library in the programming language of your choice.
Python library
If you use Python as your programming language, you can use our official gRPC Python library.
To install the package using pip, run:
pip install phonexia-grpc
You can then import:
- Specific libraries for each microservice that provide the message wrappers.
- Stubs for the gRPC clients.
from phonexia.grpc.common.core_pb2 import Audio
from phonexia.grpc.technologies.referential_deepfake_detection.v1.referential_deepfake_detection_pb2 import DetectRequest, DetectResponse
from phonexia.grpc.technologies.referential_deepfake_detection.v1.referential_deepfake_detection_pb2_grpc import ReferentialDeepfakeDetectionStub
Generate library for programming language of your choice
For the definition of microservice interfaces, we use protocol buffers, the standard mechanism for defining gRPC services. The services, together with the procedures and messages they expose, are defined in so-called proto files.
The proto files can be used to generate client libraries in many programming languages. Take a look at protobuf tutorials for how to get started with generating the library in the languages of your choice using the protoc tool.
You can find the proto files developed by Phonexia in this repository.
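As a sketch of the generation step for Python, you can use the grpc_tools.protoc compiler. The proto_path and the .proto file location below are assumptions; adjust them to match your checkout of the Phonexia proto repository.

```shell
pip install grpcio-tools
# Generate message classes and gRPC stubs into the "generated" directory.
# The .proto file path below is an assumption based on the package names
# used by the phonexia-grpc library.
python -m grpc_tools.protoc \
  --proto_path=. \
  --python_out=generated \
  --grpc_python_out=generated \
  phonexia/grpc/technologies/referential_deepfake_detection/v1/referential_deepfake_detection.proto
```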
Phonexia Python client
The easiest way to get started with testing is to use our simple Python client. To get it, run:
pip install phonexia-referential-deepfake-detection-client
Example of usage:
phonexia_referential_deepfake_detection_client -f example.wav
The example expects the microservice to be running on localhost:8080.
The output will look like:
{
"result": {
"score": 0.85
}
}
See the help for more information about the client options:
phonexia_referential_deepfake_detection_client --help
grpcurl client
If you need a simple tool for testing the microservice on the command line, you can use grpcurl. This tool serializes and sends a JSON body that we define to an endpoint that we specify.
For deepfake detection (Detect method), use audio for both reference and questioned data:
{
"audio_reference": {
"content": "${reference_audio_encoded_in_base64}"
},
"audio_questioned": {
"content": "${questioned_audio_encoded_in_base64}"
}
}
Note that in grpcurl, the audio content must be encoded in base64.
You can create the request body with the following command:
echo -n '{"audio_reference": {"content": "'$(cat ${path_to_audio_reference} | base64 -w0)'"}, "audio_questioned": {"content": "'$(cat ${path_to_audio_questioned} | base64 -w0)'"}}' > ${path_to_body}
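If you prefer scripting over a shell one-liner, the same request body can be assembled with the Python standard library. A sketch; the file paths are placeholders:

```python
import base64
import json


def build_detect_body(reference_path: str, questioned_path: str) -> str:
    """Build the JSON body for grpcurl, with both audio files Base64-encoded."""

    def encode(path: str) -> str:
        # grpcurl expects bytes fields as Base64 strings.
        with open(path, "rb") as audio_file:
            return base64.b64encode(audio_file.read()).decode("ascii")

    return json.dumps(
        {
            "audio_reference": {"content": encode(reference_path)},
            "audio_questioned": {"content": encode(questioned_path)},
        }
    )


# Example usage (paths are placeholders):
# with open("body.json", "w") as body_file:
#     body_file.write(build_detect_body("reference.wav", "questioned.wav"))
```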
Replace the path_to_* placeholders with the corresponding values.
Now you can make the request. The microservice supports server reflection, which means you don't need to know the API in advance to make a request.
grpcurl -plaintext -use-reflection -d @ localhost:8080 phonexia.grpc.technologies.referential_deepfake_detection.v1.ReferentialDeepfakeDetection/Detect < ${path_to_body}
grpcurl will automatically serialize the response to this query into JSON.
GUI clients
If you'd prefer to use a GUI client like Postman or Warthog to test the microservice, take a look at the GUI Client page in our documentation. Note that you will still need to encode the audio in Base64 manually, as these tools do not support it by default.
Further links
- Maintained by Phonexia
- Contact us via e-mail, or using Phonexia Service Desk
- File an issue
- See list of licenses
- See terms of use
Versioning
We use Semantic Versioning.