Age Estimation

This guide demonstrates how to perform Age Estimation with Phonexia Speech Platform 4. You can find a high-level description in the Age Estimation article. The technology can estimate the age of a speaker from media files or from voiceprints. This guide shows you how to do both.

For testing, we'll use media files in different languages, recorded by both male and female speakers. You can download them all together in the audio_files.zip archive.

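| filename | age | filename | age | filename | age |
|---|---|---|---|---|---|
| Adedewe.wav | 46 | Lenka.wav | 32 | Tatiana.wav | 29 |
| Dina.wav | 37 | Lubica.wav | 30 | Thida.wav | 32 |
| Fadimatu.wav | 47 | Luka.wav | 31 | Tuan.wav | 29 |
| Harry.wav | 43 | Nirav.wav | 27 | Xiang.wav | 36 |
| Juan.wav | 27 | Noam.wav | 25 | Zoltan.wav | 31 |
| Julia.wav | 26 | Obioma.wav | 36 | | |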
Note: The estimated age is approximate, with a precision of +/- 10 years, i.e. age - 10 <= age_estimated <= age + 10.
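
The tolerance in the note can be expressed as a small check. The following is a minimal sketch; the function name within_tolerance and its tolerance parameter are hypothetical helpers for illustration and are not part of the Speech Platform API.

```python
def within_tolerance(true_age, estimated_age, tolerance=10):
    """Return True if the estimate lies within +/- tolerance years of the true age.

    Hypothetical helper: implements age - 10 <= age_estimated <= age + 10
    from the note, with the tolerance made configurable.
    """
    return true_age - tolerance <= estimated_age <= true_age + tolerance
```

A helper like this can be used to compare an estimate against the reference ages listed in the table above.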

At the end of this guide, you'll find a full Python code example that combines all the steps, which are first discussed separately. This guide should give you a comprehensive understanding of how to integrate Age Estimation into your own projects.

Prerequisites

Follow the prerequisites for setup of Virtual Appliance and Python environment as described in the Task lifecycle code examples.

Run Age Estimation from file

To run Age Estimation for a single media file, start by sending a POST request to the /api/technology/age-estimation endpoint. The only mandatory parameter is file. In Python, you can do this as follows:

import requests

VIRTUAL_APPLIANCE_ADDRESS = "http://<virtual-appliance-address>:8000" # Replace with your address
MEDIA_FILE_BASED_ENDPOINT_URL = f"{VIRTUAL_APPLIANCE_ADDRESS}/api/technology/age-estimation"

media_file = "Adedewe.wav"

with open(media_file, mode="rb") as file:
    files = {"file": file}
    start_task_response = requests.post(
        url=MEDIA_FILE_BASED_ENDPOINT_URL,
        files=files,
    )
print(start_task_response.status_code)  # Should print '202'

If the task has been successfully accepted, the 202 code will be returned together with a unique task ID in the response body. The task isn't processed immediately, but only scheduled for processing. You can check the current task status by polling for the result.

Polling

To obtain the final result, periodically query the task status until the task state changes to done, failed or rejected. The general polling procedure is described in detail in the Task lifecycle code examples.
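
The polling loop described above can be sketched with an upper time limit, so that a stuck task does not block the caller forever. This is only an illustration under stated assumptions: wait_for_final_state, fetch_task, timeout, and sleep are hypothetical names, and the timeout behaviour is an addition that the platform itself does not prescribe.

```python
import time

# Final task states named in this guide
FINAL_STATES = {"done", "failed", "rejected"}


def wait_for_final_state(fetch_task, polling_interval=5.0, timeout=600.0, sleep=time.sleep):
    """Call fetch_task() repeatedly until the task reaches a final state.

    fetch_task is a hypothetical callable that returns the parsed task JSON,
    for example: lambda: requests.get(polling_url).json()
    """
    deadline = time.monotonic() + timeout
    while True:
        task_json = fetch_task()
        if task_json["task"]["state"] in FINAL_STATES:
            return task_json
        if time.monotonic() >= deadline:
            raise TimeoutError("Task did not reach a final state in time")
        sleep(polling_interval)
```

Passing the fetch step in as a callable keeps the loop independent of the HTTP library and makes it easy to exercise without a running Virtual Appliance.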

Result for Age Estimation from media file

The result field of the task contains information about the individual input media channels, each identified by its channel_number. The speech_length field shows how much speech was used to produce the estimate, and the age field contains the estimated age.

The following JSON shows the result of a successful Age Estimation task for the Adedewe.wav file. The age was estimated as 46 (+/- 10) years.

{
  "task": {
    "task_id": "db414429-2b56-46b2-bf53-51e2a813b6da",
    "state": "done"
  },
  "result": {
    "channels": [
      {
        "channel_number": 0,
        "speech_length": 30.88,
        "age": 46
      }
    ]
  }
}
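
Reading the estimate out of such a response could look like the following sketch. The helper name ages_by_channel is hypothetical, and the example dictionary simply repeats the JSON shown above.

```python
def ages_by_channel(task_json):
    """Map each channel_number to its estimated age (hypothetical helper)."""
    state = task_json["task"]["state"]
    if state != "done":
        # Tasks that ended as failed or rejected are not expected to carry a usable result
        raise RuntimeError(f"Task finished in state '{state}', no result to read")
    return {
        channel["channel_number"]: channel["age"]
        for channel in task_json["result"]["channels"]
    }


# The response for Adedewe.wav shown above
example_task = {
    "task": {"task_id": "db414429-2b56-46b2-bf53-51e2a813b6da", "state": "done"},
    "result": {
        "channels": [{"channel_number": 0, "speech_length": 30.88, "age": 46}]
    },
}

print(ages_by_channel(example_task))  # {0: 46}
```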

Run Age Estimation from voiceprints

Age Estimation can be performed on voiceprints extracted from media files with the Voiceprint Extraction technology. To run Voiceprint Extraction, follow the instructions in the Speaker Search technology guide.

For testing, we'll be using voiceprints extracted from the test recordings used in the Run Age Estimation from file section. You can find the voiceprints in the voiceprints.zip archive.

To run Age Estimation for a set of voiceprints, you should start by sending a POST request to the /api/technology/age-estimation-voiceprints endpoint. The list of voiceprints is the only mandatory field in the request body. In Python, you can do this as follows (each voiceprint is stored in a separate .vp file):

import requests

VIRTUAL_APPLIANCE_ADDRESS = "http://<virtual-appliance-address>:8000" # Replace with your address
VOICEPRINT_BASED_ENDPOINT_URL = f"{VIRTUAL_APPLIANCE_ADDRESS}/api/technology/age-estimation-voiceprints"

voiceprint_files = [
    "Adedewe.vp",
    "Dina.vp",
    "Fadimatu.vp",
    "Harry.vp",
    "Juan.vp",
    "Julia.vp",
    "Lenka.vp",
    "Lubica.vp",
    "Luka.vp",
    "Nirav.vp",
    "Noam.vp",
    "Obioma.vp",
    "Tatiana.vp",
    "Thida.vp",
    "Tuan.vp",
    "Xiang.vp",
    "Zoltan.vp",
]

voiceprints = []
for voiceprint_file in voiceprint_files:
    with open(voiceprint_file) as f:
        voiceprints.append(f.read())

start_task_response = requests.post(
    url=VOICEPRINT_BASED_ENDPOINT_URL,
    json={"voiceprints": voiceprints},
)
print(start_task_response.status_code)  # Should print '202'

Polling

To obtain the final result, periodically query the task status until the task state changes to done, failed or rejected. The general polling procedure is described in detail in the Task lifecycle code examples.

Result for Age Estimation from voiceprints

The result field of the task contains the following output (shortened for readability). The voiceprint_scores come in the same order as the input voiceprints. Notice that the result for Adedewe.vp is exactly the same as the one estimated from the media file. Refer to the Task lifecycle code examples for a generic code template, applicable to all technologies.

{
  "task": {
    "task_id": "26776c99-ac92-4df0-a0f3-3beb1498f4ee",
    "state": "done"
  },
  "result": {
    "voiceprint_scores": [
      {
        "speech_length": 30.88,
        "age": 46
      },
      {
        "speech_length": 19.52,
        "age": 37
      },
      {
        "speech_length": 23.84,
        "age": 47
      },
      ...
    ]
  }
}
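
Because the scores carry no file names and are matched to the inputs only by position, it may be worth checking that the counts agree before pairing them. The sketch below assumes a hypothetical helper named scores_by_filename; the example data contains only the three entries visible in the shortened output above.

```python
def scores_by_filename(filenames, task_json):
    """Pair each input voiceprint file name with its score by position (hypothetical helper)."""
    scores = task_json["result"]["voiceprint_scores"]
    if len(scores) != len(filenames):
        raise ValueError(
            f"Expected {len(filenames)} scores, got {len(scores)}"
        )
    return dict(zip(filenames, scores))


# First three inputs and their scores, as shown in the shortened output above
example_files = ["Adedewe.vp", "Dina.vp", "Fadimatu.vp"]
example_task = {
    "task": {"state": "done"},
    "result": {
        "voiceprint_scores": [
            {"speech_length": 30.88, "age": 46},
            {"speech_length": 19.52, "age": 37},
            {"speech_length": 23.84, "age": 47},
        ]
    },
}

for name, score in scores_by_filename(example_files, example_task).items():
    print(name, score["age"])
```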

Full Python Code

Here is the full example of how to run the Age Estimation technology with both files and voiceprints as input data. The code is slightly adjusted and wrapped into functions. Refer to the Task lifecycle code examples for a generic code template, applicable to all technologies.

The scores_from_file.json and scores_from_voiceprints.json files contain the results of the Age Estimation. Notice that the results are identical except for the filename extensions (wav vs vp) and the extra channel_number information in scores_from_file.json.

import json
import requests
import time

VIRTUAL_APPLIANCE_ADDRESS = "http://<virtual-appliance-address>:8000" # Replace with your address

MEDIA_FILE_BASED_ENDPOINT_URL = f"{VIRTUAL_APPLIANCE_ADDRESS}/api/technology/age-estimation"
VOICEPRINT_BASED_ENDPOINT_URL = f"{VIRTUAL_APPLIANCE_ADDRESS}/api/technology/age-estimation-voiceprints"


def poll_result(polling_url, polling_interval=5):
    """Poll the task endpoint until processing completes."""
    while True:
        polling_task_response = requests.get(polling_url)
        polling_task_response.raise_for_status()
        polling_task_response_json = polling_task_response.json()
        task_state = polling_task_response_json["task"]["state"]
        if task_state in {"done", "failed", "rejected"}:
            break
        time.sleep(polling_interval)
    return polling_task_response


def run_media_based_task(media_file, params=None, config=None):
    """Create a media-based task and wait for results."""
    if params is None:
        params = {}
    if config is None:
        config = {}

    with open(media_file, mode="rb") as file:
        files = {"file": file}
        start_task_response = requests.post(
            url=MEDIA_FILE_BASED_ENDPOINT_URL,
            files=files,
            params=params,
            data={"config": json.dumps(config)},
        )
    start_task_response.raise_for_status()
    polling_url = start_task_response.headers["Location"]
    task_result = poll_result(polling_url)
    return task_result.json()


def run_voiceprint_based_task(json_payload):
    """Create a voiceprint-based task and wait for results."""
    start_task_response = requests.post(
        url=VOICEPRINT_BASED_ENDPOINT_URL,
        json=json_payload,
    )
    start_task_response.raise_for_status()
    polling_url = start_task_response.headers["Location"]
    task_result = poll_result(polling_url)
    return task_result.json()


# Run Age Estimation from media files
media_files = [
    "Adedewe.wav",
    "Dina.wav",
    "Fadimatu.wav",
    "Harry.wav",
    "Juan.wav",
    "Julia.wav",
    "Lenka.wav",
    "Lubica.wav",
    "Luka.wav",
    "Nirav.wav",
    "Noam.wav",
    "Obioma.wav",
    "Tatiana.wav",
    "Thida.wav",
    "Tuan.wav",
    "Xiang.wav",
    "Zoltan.wav",
]

media_file_based_results = {}
for media_file in media_files:
    print(f"Running Age Estimation for file {media_file}.")
    media_file_based_task = run_media_based_task(media_file)
    # The files are mono-channel, so we access the result in the first channel (index 0)
    media_file_based_task_result = media_file_based_task["result"]["channels"][0]
    media_file_based_results[media_file] = media_file_based_task_result
    print(f"The result for {media_file} is: {media_file_based_task_result}")

# Save the results to a file
with open("scores_from_file.json", "w") as output_file:
    json.dump(media_file_based_results, output_file, indent=2)


# Run Age Estimation from voiceprints
voiceprint_files = [
    "Adedewe.vp",
    "Dina.vp",
    "Fadimatu.vp",
    "Harry.vp",
    "Juan.vp",
    "Julia.vp",
    "Lenka.vp",
    "Lubica.vp",
    "Luka.vp",
    "Nirav.vp",
    "Noam.vp",
    "Obioma.vp",
    "Tatiana.vp",
    "Thida.vp",
    "Tuan.vp",
    "Xiang.vp",
    "Zoltan.vp",
]

voiceprints = []
for voiceprint_file in voiceprint_files:
    with open(voiceprint_file) as f:
        voiceprints.append(f.read())

print(f"Running Age Estimation for {len(voiceprint_files)} voiceprints.")
voiceprint_based_task = run_voiceprint_based_task({"voiceprints": voiceprints})
voiceprint_based_task_result = voiceprint_based_task["result"]["voiceprint_scores"]

# Map the results to input voiceprint file names
results_per_voiceprint = {
    filename: result
    for filename, result in zip(voiceprint_files, voiceprint_based_task_result)
}
print(f"The results are: {results_per_voiceprint}")

# Save the results to a file
with open("scores_from_voiceprints.json", "w") as output_file:
    json.dump(results_per_voiceprint, output_file, indent=2)
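
To check that the two result files written by the script above really do match apart from the file name extensions and the channel_number field, a comparison along the following lines may help. It is a sketch only: normalize and results_match are hypothetical helper names, and the in-memory example uses just the Adedewe values shown earlier in this guide.

```python
from pathlib import Path


def normalize(results):
    """Key results by file name without extension and drop channel_number."""
    return {
        Path(filename).stem: {
            key: value for key, value in result.items() if key != "channel_number"
        }
        for filename, result in results.items()
    }


def results_match(file_results, voiceprint_results):
    """Return True if both result sets agree after normalization."""
    return normalize(file_results) == normalize(voiceprint_results)


# Values for Adedewe as shown earlier in this guide
from_file = {"Adedewe.wav": {"channel_number": 0, "speech_length": 30.88, "age": 46}}
from_voiceprints = {"Adedewe.vp": {"speech_length": 30.88, "age": 46}}

print(results_match(from_file, from_voiceprints))  # True
```

In practice, the two dictionaries would be loaded with json.load from scores_from_file.json and scores_from_voiceprints.json.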