Age Estimation
This guide demonstrates how to perform Age Estimation with Phonexia Speech Platform 4. You can find a high-level description in the Age Estimation article. The technology can estimate the age of a speaker from media files or from voiceprints. This guide shows you how to do both.
For testing, we'll use media files in different languages and by both male and female speakers. You can download them all together in the audio_files.zip archive.
| filename | age | filename | age | filename | age |
|---|---|---|---|---|---|
| Adedewe.wav | 46 | Lenka.wav | 32 | Tatiana.wav | 29 |
| Dina.wav | 37 | Lubica.wav | 30 | Thida.wav | 32 |
| Fadimatu.wav | 47 | Luka.wav | 31 | Tuan.wav | 29 |
| Harry.wav | 43 | Nirav.wav | 27 | Xiang.wav | 36 |
| Juan.wav | 27 | Noam.wav | 25 | Zoltan.wav | 31 |
| Julia.wav | 26 | Obioma.wav | 36 | | |
Note that the estimated age is approximate. It is expected to fall within
+/- 10 years of the real age, i.e. age - 10 <= age_estimated <= age + 10.
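The tolerance interval above can be expressed as a small check. This is a minimal sketch; the helper name `is_within_tolerance` and the sample estimates 52 and 57 are hypothetical and chosen only to illustrate both sides of the boundary.

```python
def is_within_tolerance(true_age: int, estimated_age: int, tolerance: int = 10) -> bool:
    """Return True when the estimate lies in [true_age - tolerance, true_age + tolerance]."""
    return true_age - tolerance <= estimated_age <= true_age + tolerance


# The reference age of the Adedewe.wav speaker is 46, so the accepted range is 36..56.
# The estimates below are made-up values, not output of the technology.
print(is_within_tolerance(46, 52))  # True
print(is_within_tolerance(46, 57))  # False
```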
At the end of this guide, you'll find a full Python code example that combines all the steps, which are first discussed separately. This guide should give you a comprehensive understanding of how to integrate Age Estimation into your own projects.
Prerequisites
Follow the prerequisites for setup of Virtual Appliance and Python environment as described in the Task lifecycle code examples.
Run Age Estimation from file
To run Age Estimation for a single media file, you should start by sending a
POST request to the
/api/technology/age-estimation
endpoint. file is the only mandatory parameter. In Python, you can do this as
follows:
import requests

VIRTUAL_APPLIANCE_ADDRESS = "http://<virtual-appliance-address>:8000"  # Replace with your address
MEDIA_FILE_BASED_ENDPOINT_URL = f"{VIRTUAL_APPLIANCE_ADDRESS}/api/technology/age-estimation"

media_file = "Adedewe.wav"

# The file must stay open while the request is being sent
with open(media_file, mode="rb") as file:
    files = {"file": file}
    start_task_response = requests.post(
        url=MEDIA_FILE_BASED_ENDPOINT_URL,
        files=files,
    )

print(start_task_response.status_code)  # Should print '202'
If the task has been successfully accepted, the 202 code will be returned
together with a unique task ID in the response body. The task isn't
processed immediately, but only scheduled for processing. You can check the
current task status by polling for the result.
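As a sketch of how the accepted response might be handled, the helper below checks for the 202 code and reads the polling URL from the `Location` header, the same header the full example at the end of this guide uses. The helper name `get_polling_url` is hypothetical, and the stand-in response with its placeholder URL exists only so the snippet runs without a Virtual Appliance.

```python
from types import SimpleNamespace


def get_polling_url(start_task_response) -> str:
    """Return the polling URL of an accepted task, or raise if it was not accepted."""
    if start_task_response.status_code != 202:
        raise RuntimeError(
            f"Task was not accepted (HTTP {start_task_response.status_code})"
        )
    return start_task_response.headers["Location"]


# Stand-in for a real `requests.Response`; the URL is a made-up placeholder.
fake_response = SimpleNamespace(
    status_code=202,
    headers={"Location": "http://localhost:8000/api/task/example-task-id"},
)
print(get_polling_url(fake_response))
```

With a real request, you would pass `start_task_response` from the previous snippet instead of the stand-in object.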
Polling
To obtain the final result, periodically query the task status until the task
state changes to done, failed or rejected. The general polling procedure
is described in detail in the
Task lifecycle code examples.
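The polling loop can be sketched independently of HTTP as follows. This is an illustrative variant, not the reference implementation: the `get_state` callable, the `timeout` parameter, and the non-terminal state names `pending` and `running` are assumptions introduced here. Only the terminal states `done`, `failed` and `rejected` come from this guide.

```python
import time

TERMINAL_STATES = {"done", "failed", "rejected"}


def wait_for_terminal_state(get_state, polling_interval=5.0, timeout=300.0):
    """Call get_state() repeatedly until it returns a terminal state or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Task still in state '{state}' after {timeout} s")
        time.sleep(polling_interval)


# Simulated task that reports two non-terminal states before finishing.
simulated_states = iter(["pending", "running", "done"])
final_state = wait_for_terminal_state(lambda: next(simulated_states), polling_interval=0)
print(final_state)  # done
```

Against a real task, `get_state` could be something like `lambda: requests.get(polling_url).json()["task"]["state"]`. The timeout guards against waiting forever if a task never reaches a terminal state.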
Result for Age Estimation from media file
The result field of the task contains one entry per input media channel,
identified by its channel_number. The speech_length field shows how much
speech was used to produce the estimate, and the age field holds the
estimated age.
The following JSON shows the result of a successful Age Estimation task for the
Adedewe.wav file. The age was estimated as 46 (+/- 10) years.
{
  "task": {
    "task_id": "db414429-2b56-46b2-bf53-51e2a813b6da",
    "state": "done"
  },
  "result": {
    "channels": [
      {
        "channel_number": 0,
        "speech_length": 30.88,
        "age": 46
      }
    ]
  }
}
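To illustrate how this result might be consumed, the following sketch collects the estimated age for each channel. The helper name `ages_by_channel` is hypothetical; the JSON is the Adedewe.wav result shown above.

```python
import json

task_json = json.loads("""
{
  "task": {"task_id": "db414429-2b56-46b2-bf53-51e2a813b6da", "state": "done"},
  "result": {
    "channels": [
      {"channel_number": 0, "speech_length": 30.88, "age": 46}
    ]
  }
}
""")


def ages_by_channel(task_json: dict) -> dict:
    """Map each channel_number to its estimated age for a task in the 'done' state."""
    state = task_json["task"]["state"]
    if state != "done":
        raise ValueError(f"Task finished in state '{state}', no result to read")
    return {
        channel["channel_number"]: channel["age"]
        for channel in task_json["result"]["channels"]
    }


print(ages_by_channel(task_json))  # {0: 46}
```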
Run Age Estimation from voiceprints
Age Estimation can be performed on voiceprints extracted from media files with the Voiceprint Extraction technology. To run Voiceprint Extraction, follow the instructions in the Speaker Search technology guide.
For testing, we'll be using voiceprints extracted from the test recordings used in the Run Age Estimation from file section. You can find the voiceprints in the voiceprints.zip archive.
To run Age Estimation for a set of voiceprints, you should start by sending a
POST request to the
/api/technology/age-estimation-voiceprints
endpoint. The list of voiceprints is the only mandatory field in the request
body. In Python, you can do this as follows (each voiceprint is stored in a
separate .vp file):
import requests

VIRTUAL_APPLIANCE_ADDRESS = "http://<virtual-appliance-address>:8000"  # Replace with your address
VOICEPRINT_BASED_ENDPOINT_URL = f"{VIRTUAL_APPLIANCE_ADDRESS}/api/technology/age-estimation-voiceprints"

voiceprint_files = [
    "Adedewe.vp",
    "Dina.vp",
    "Fadimatu.vp",
    "Harry.vp",
    "Juan.vp",
    "Julia.vp",
    "Lenka.vp",
    "Lubica.vp",
    "Luka.vp",
    "Nirav.vp",
    "Noam.vp",
    "Obioma.vp",
    "Tatiana.vp",
    "Thida.vp",
    "Tuan.vp",
    "Xiang.vp",
    "Zoltan.vp",
]

# Read the content of each voiceprint file into a list
voiceprints = []
for voiceprint_file in voiceprint_files:
    with open(voiceprint_file) as f:
        voiceprints.append(f.read())

start_task_response = requests.post(
    url=VOICEPRINT_BASED_ENDPOINT_URL,
    json={"voiceprints": voiceprints},
)

print(start_task_response.status_code)  # Should print '202'
Polling
To obtain the final result, periodically query the task status until the task
state changes to done, failed or rejected. The general polling procedure
is described in detail in the
Task lifecycle code examples.
Result for Age Estimation from voiceprints
The result field of the task contains the following output (shortened for
readability). The voiceprint_scores come in the same order as the input
voiceprints. Notice that the result for Adedewe.vp is exactly the same as
the one estimated from the media file.
Refer to the
Task lifecycle code examples
for a generic code template, applicable to all technologies.
{
  "task": {
    "task_id": "26776c99-ac92-4df0-a0f3-3beb1498f4ee",
    "state": "done"
  },
  "result": {
    "voiceprint_scores": [
      {
        "speech_length": 30.88,
        "age": 46
      },
      {
        "speech_length": 19.52,
        "age": 37
      },
      {
        "speech_length": 23.84,
        "age": 47
      },
      ...
    ]
  }
}
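Because the scores are returned in input order, they can be paired with the file names by position. The sketch below shows one way to do this with an explicit length check, so that a mismatch is reported instead of being silently truncated. The helper name `map_scores_to_files` is hypothetical; the three scores are the ones shown in the shortened output above.

```python
def map_scores_to_files(voiceprint_files, voiceprint_scores):
    """Pair each input file name with the score at the same position."""
    if len(voiceprint_files) != len(voiceprint_scores):
        raise ValueError(
            f"Expected {len(voiceprint_files)} scores, got {len(voiceprint_scores)}"
        )
    return dict(zip(voiceprint_files, voiceprint_scores))


voiceprint_files = ["Adedewe.vp", "Dina.vp", "Fadimatu.vp"]
voiceprint_scores = [
    {"speech_length": 30.88, "age": 46},
    {"speech_length": 19.52, "age": 37},
    {"speech_length": 23.84, "age": 47},
]

for filename, score in map_scores_to_files(voiceprint_files, voiceprint_scores).items():
    print(f"{filename}: {score['age']}")
```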
Full Python Code
Here is the full example of how to run the Age Estimation technology with both files and voiceprints as input data. The code is slightly adjusted and wrapped into functions. Refer to the Task lifecycle code examples for a generic code template, applicable to all technologies.
The scores_from_file.json and
scores_from_voiceprints.json files contain the
results of the Age Estimation. Notice that the results are identical except for
the filename extensions (wav vs vp) and the extra channel_number
information in scores_from_file.json.
import json
import requests
import time
VIRTUAL_APPLIANCE_ADDRESS = "http://<virtual-appliance-address>:8000" # Replace with your address
MEDIA_FILE_BASED_ENDPOINT_URL = f"{VIRTUAL_APPLIANCE_ADDRESS}/api/technology/age-estimation"
VOICEPRINT_BASED_ENDPOINT_URL = f"{VIRTUAL_APPLIANCE_ADDRESS}/api/technology/age-estimation-voiceprints"

def poll_result(polling_url, polling_interval=5):
    """Poll the task endpoint until processing completes."""
    while True:
        polling_task_response = requests.get(polling_url)
        polling_task_response.raise_for_status()
        polling_task_response_json = polling_task_response.json()
        task_state = polling_task_response_json["task"]["state"]
        # Stop polling once the task reaches a terminal state
        if task_state in {"done", "failed", "rejected"}:
            break
        time.sleep(polling_interval)
    return polling_task_response


def run_media_based_task(media_file, params=None, config=None):
    """Create a media-based task and wait for results."""
    if params is None:
        params = {}
    if config is None:
        config = {}

    with open(media_file, mode="rb") as file:
        files = {"file": file}
        start_task_response = requests.post(
            url=MEDIA_FILE_BASED_ENDPOINT_URL,
            files=files,
            params=params,
            data={"config": json.dumps(config)},
        )
    start_task_response.raise_for_status()

    # The Location header of the accepted response holds the polling URL
    polling_url = start_task_response.headers["Location"]
    task_result = poll_result(polling_url)
    return task_result.json()


def run_voiceprint_based_task(json_payload):
    """Create a voiceprint-based task and wait for results."""
    start_task_response = requests.post(
        url=VOICEPRINT_BASED_ENDPOINT_URL,
        json=json_payload,
    )
    start_task_response.raise_for_status()

    polling_url = start_task_response.headers["Location"]
    task_result = poll_result(polling_url)
    return task_result.json()


# Run Age Estimation from media files
media_files = [
    "Adedewe.wav",
    "Dina.wav",
    "Fadimatu.wav",
    "Harry.wav",
    "Juan.wav",
    "Julia.wav",
    "Lenka.wav",
    "Lubica.wav",
    "Luka.wav",
    "Nirav.wav",
    "Noam.wav",
    "Obioma.wav",
    "Tatiana.wav",
    "Thida.wav",
    "Tuan.wav",
    "Xiang.wav",
    "Zoltan.wav",
]

media_file_based_results = {}
for media_file in media_files:
    print(f"Running Age Estimation for file {media_file}.")
    media_file_based_task = run_media_based_task(media_file)
    # The files are mono-channel, so we access the result in the first channel (index 0)
    media_file_based_task_result = media_file_based_task["result"]["channels"][0]
    media_file_based_results[media_file] = media_file_based_task_result
    print(f"The result for {media_file} is: {media_file_based_task_result}")

# Save the results to a file
with open("scores_from_file.json", "w") as output_file:
    json.dump(media_file_based_results, output_file, indent=2)

# Run Age Estimation from voiceprints
voiceprint_files = [
    "Adedewe.vp",
    "Dina.vp",
    "Fadimatu.vp",
    "Harry.vp",
    "Juan.vp",
    "Julia.vp",
    "Lenka.vp",
    "Lubica.vp",
    "Luka.vp",
    "Nirav.vp",
    "Noam.vp",
    "Obioma.vp",
    "Tatiana.vp",
    "Thida.vp",
    "Tuan.vp",
    "Xiang.vp",
    "Zoltan.vp",
]

voiceprints = []
for voiceprint_file in voiceprint_files:
    with open(voiceprint_file) as f:
        voiceprints.append(f.read())

print(f"Running Age Estimation for {len(voiceprint_files)} voiceprints.")
voiceprint_based_task = run_voiceprint_based_task({"voiceprints": voiceprints})
voiceprint_based_task_result = voiceprint_based_task["result"]["voiceprint_scores"]

# Map the results to input voiceprint file names
results_per_voiceprint = {
    filename: result
    for filename, result in zip(voiceprint_files, voiceprint_based_task_result)
}
print(f"The results are: {results_per_voiceprint}")

# Save the results to a file
with open("scores_from_voiceprints.json", "w") as output_file:
    json.dump(results_per_voiceprint, output_file, indent=2)
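
To check the statement that both output files hold the same results, one possible approach is to key the entries by file stem and ignore the channel_number field before comparing. This is a minimal sketch: the helper names `normalize` and `results_match` are hypothetical, and the sample dictionaries contain only the Adedewe values shown earlier in this guide.

```python
from pathlib import Path


def normalize(results: dict) -> dict:
    """Key results by file stem and drop the channel_number field present only for media files."""
    return {
        Path(filename).stem: {
            key: value for key, value in result.items() if key != "channel_number"
        }
        for filename, result in results.items()
    }


def results_match(file_results: dict, voiceprint_results: dict) -> bool:
    """Return True when both result sets agree after normalization."""
    return normalize(file_results) == normalize(voiceprint_results)


# Values for Adedewe taken from the result examples in this guide
from_file = {"Adedewe.wav": {"channel_number": 0, "speech_length": 30.88, "age": 46}}
from_voiceprints = {"Adedewe.vp": {"speech_length": 30.88, "age": 46}}
print(results_match(from_file, from_voiceprints))  # True
```

To compare the real outputs, load scores_from_file.json and scores_from_voiceprints.json with `json.load` and pass the two dictionaries to `results_match`.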