Deepfake Detection

Phonexia has developed an advanced Deepfake Detection technology designed to identify artificial voices within audio recordings, thereby enhancing the security and reliability of speaker verification systems. This approach leverages a transformer-based model and is primarily trained on datasets that encompass a wide range of synthesized, converted, and replayed speech examples.

The technology incorporates self-supervised learning and data augmentation techniques. It has been trained on a large corpus of telephone data, which reduces false alarms on telephone recordings. The model requires a minimum of 3 seconds of speech for inference.
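
Because the model needs at least 3 seconds of speech, it can be useful to screen recordings before sending them for inference. The sketch below is a minimal, hypothetical pre-check in Python; the file name is an assumption, and note that total duration is only an upper bound on the amount of speech, so a voice activity detector would be needed for an exact measurement.

```python
import wave

MIN_SPEECH_SECONDS = 3.0  # minimum amount of speech required by the model


def recording_duration(path: str) -> float:
    """Return the total duration of a WAV recording in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


# Total duration is only an upper bound on speech length; a recording shorter
# than 3 seconds cannot contain 3 seconds of speech, so it can be skipped early.
if recording_duration("call.wav") < MIN_SPEECH_SECONDS:
    print("Recording is too short to contain 3 seconds of speech; skipping.")
```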

Possible use cases

  1. Banks and Call Centers: Enhances the security of customer interactions by ensuring that communications are with legitimate individuals, thereby preventing fraudulent activities and unauthorized access.
  2. Forensic Analysis: Assists law enforcement agencies in authenticating audio evidence, ensuring its credibility in investigations and legal proceedings.

Scoring

Typical score values range from approximately -5 to 5. The score is calibrated such that 0 corresponds to the Equal Error Rate (EER) point on our evaluation datasets. The EER is the point where the false acceptance rate and false rejection rate are the same, ensuring a balanced trade-off between them.

Depending on the characteristics of your data and specific use case, you may need to adjust the decision threshold to achieve the desired balance between false positives and false negatives.
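
As a minimal sketch of such an adjustment, the Python snippet below compares a calibrated score against a configurable threshold. The variable names and the example threshold values are hypothetical, and the score polarity (whether higher scores indicate genuine or synthetic speech) should be confirmed against the product documentation for your version; invert the comparison if it is reversed in your deployment.

```python
DEFAULT_THRESHOLD = 0.0  # 0 corresponds to the EER point on the evaluation datasets
STRICT_THRESHOLD = 1.5   # hypothetical stricter setting to trade false negatives for fewer false positives


def passes_check(score: float, threshold: float = DEFAULT_THRESHOLD) -> bool:
    """Return True when the score is on the accepting side of the threshold.

    Assumes scores above the threshold correspond to the accepting decision;
    verify the actual score polarity in the product documentation.
    """
    return score > threshold


# Example usage with a hypothetical score returned by the detection service:
score = 2.3
print(passes_check(score))                    # decision at the EER-calibrated threshold
print(passes_check(score, STRICT_THRESHOLD))  # decision at a stricter threshold
```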