
Limits

The Speech Platform virtual appliance implements various internal limits to protect the internal system and to minimize the chance of issues and malfunctions. To modify the limits for your needs, you should understand the processing flow, the lifecycle of the processed media files and how the individual limits relate and affect each other.

The following schema shows the media processing flow and the parts where individual limits are applied. Further details are described below the picture.

[Figure: Virtual Appliance limits settings]

Media processing flow

Step 1

Description: Media files and microphone recordings uploaded or recorded via the web GUI are stored on your computer, in your web browser's IndexedDB internal storage (i.e., they are NOT uploaded into the virtual appliance).
To keep the local storage at a reasonable size and to avoid filling your local disk, media files bigger than maxFileSize (default 100 MB) are rejected, as are attempts to upload more than maxFilesCount (default 100) files. Similarly, microphone recordings longer than maxVoiceRecorderDuration (default 5 minutes) are rejected.

Note: Because the uploaded files are processed later, they should also respect the audio duration limit maxAudioLength (default 2 hours), so that they are not rejected during processing.
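The rules of this step can be mirrored in a small pre-check. The following is a minimal sketch that only restates the documented defaults; the names GuiLimits, check_media_file and check_recording are hypothetical, and how the web GUI actually evaluates these checks is not described on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GuiLimits:
    """Documented web GUI defaults (assumed unchanged)."""
    max_file_size: int = 100_000_000         # bytes (maxFileSize)
    max_files_count: int = 100               # files (maxFilesCount)
    max_voice_recorder_duration: int = 300   # seconds (maxVoiceRecorderDuration)
    max_audio_length: int = 7200             # seconds (maxAudioLength)


def check_media_file(size_bytes, duration_s, files_stored, limits=GuiLimits()):
    """Return the reasons a media file would be rejected (empty list if none)."""
    reasons = []
    if size_bytes > limits.max_file_size:
        reasons.append("bigger than maxFileSize")
    if files_stored + 1 > limits.max_files_count:
        reasons.append("more files than maxFilesCount")
    if duration_s > limits.max_audio_length:
        reasons.append("longer than maxAudioLength (rejected later, at conversion)")
    return reasons


def check_recording(duration_s, limits=GuiLimits()):
    """Return the reasons a microphone recording would be rejected."""
    reasons = []
    if duration_s > limits.max_voice_recorder_duration:
        reasons.append("longer than maxVoiceRecorderDuration")
    return reasons


print(check_media_file(size_bytes=150_000_000, duration_s=600, files_stored=3))
print(check_recording(duration_s=240))
```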








Step 2

Description: When a media file is uploaded to the REST API, and also when processing is started from the web GUI (where the file is retrieved from the browser storage and uploaded to the API), the file is temporarily stored in the virtual appliance input storage, which has an inputStorageSize limit (default 1 GiB). If the uploaded file does not fit into the free space of the input storage, the API request fails.
To protect the API and to prevent stuck API requests, media files bigger than singleFileUploadSize (default 500 MiB) and API upload requests taking longer than singleFileUploadTimeout (default 2 minutes) are rejected.

Note: The larger the files being processed and the more files processed in parallel, the bigger the inputStorageSize needs to be. At the same time, the size of the storages is limited by the free space on the virtual appliance data disk (/data/).
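As a rough sizing aid, the relation between file sizes, parallel uploads and the upload timeout can be estimated with simple arithmetic. This is a sketch under stated assumptions: the function names are hypothetical, all files are assumed to sit in the input storage at the same time, and the upload rate is assumed to be constant.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def fits_input_storage(file_sizes, input_storage_size=1 * GIB):
    """True if all files, stored at the same time, fit into the input storage."""
    return sum(file_sizes) <= input_storage_size


def min_upload_rate(file_size, upload_timeout=120):
    """Minimum sustained rate (bytes/s) needed to finish before singleFileUploadTimeout."""
    return file_size / upload_timeout


parallel_uploads = [300 * MIB] * 4                    # four 300 MiB files at once
print(fits_input_storage(parallel_uploads))           # False: 1200 MiB > 1024 MiB
print(fits_input_storage(parallel_uploads, 2 * GIB))  # True

# A file at the default singleFileUploadSize needs roughly this many Mbit/s
print(round(min_upload_rate(500 * MIB) * 8 / 1e6, 1))  # 35.0
```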







Step 3

Description: Source media files temporarily stored in the input storage are then converted into a unified audio format before processing.
To prevent excessive memory usage during the conversion and to keep the resulting converted files at a reasonable size, files with an audio duration longer than the maxAudioLength limit (default 2 hours) are rejected. The resulting converted files are stored in the virtual appliance internal storage, which has an internalStorageSize limit (default 2 GiB).
Source media files are automatically deleted after conversion (even if the conversion failed) to free up space in the input storage.

Note: When processing lossy-compressed source media files (MP3, MP4, etc.), the resulting converted files may become vastly bigger than the originals, possibly by an order of magnitude. Depending on the format, encoding, duration, etc. of the source files, the internalStorageSize may therefore need to be increased substantially. At the same time, the size of the storages is limited by the free space on the virtual appliance data disk (/data/).
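The size growth during conversion follows from how uncompressed audio is stored: its size depends only on duration, sample rate, channel count and sample width, not on the content. The sketch below illustrates the arithmetic; the 16 kHz, mono, 16-bit parameters and the 32 kbit/s source bitrate are assumptions chosen for illustration, not the documented parameters of the appliance's converted format.

```python
def pcm_size(duration_s, sample_rate=16_000, channels=1, bytes_per_sample=2):
    """Approximate size in bytes of uncompressed PCM audio (header ignored)."""
    return duration_s * sample_rate * channels * bytes_per_sample


def compressed_size(duration_s, bitrate_bps):
    """Approximate size in bytes of a constant-bitrate compressed file."""
    return duration_s * bitrate_bps // 8


two_hours = 7200                               # the default maxAudioLength
converted = pcm_size(two_hours)                # 230_400_000 bytes
source = compressed_size(two_hours, 32_000)    # 28_800_000 bytes
print(converted, source, converted / source)   # 230400000 28800000 8.0
```

Under these assumptions a single two-hour file already takes about a tenth of the default 2 GiB internal storage, so several such files converted in parallel can fill it quickly.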




Step 4

Description: Converted audio files are then read from the internal storage and processed by the technologies.
To prevent stuck processing requests, processing that takes longer than taskGrpcTimeout (default 5 minutes) is cancelled.
Converted audio files are automatically deleted after processing (even if the processing failed) to free up space in the internal storage.

Note: The processing time is affected by several factors:
  • the amount of speech in the audio (which is generally related to the duration of the audio)
  • the processing speed of the particular technology
  • whether the processing is done on CPU or GPU; technologies such as Speech To Text Based On Whisper and Audio Manipulation Detection in particular are very slow on CPU
Therefore, you may need to extend the taskGrpcTimeout according to your particular setup and conditions.
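One way to pick a taskGrpcTimeout is to measure how long a representative file takes on your own hardware and scale from there. The sketch below assumes a hypothetical "real-time factor" (processing seconds per second of audio) that you measure yourself; neither the function name nor the safety margin comes from the product, and processing time may not scale linearly for every technology.

```python
import math


def suggested_task_grpc_timeout(audio_seconds, realtime_factor, safety_margin=1.5):
    """Estimate a timeout in seconds from a measured real-time factor.

    realtime_factor: processing seconds needed per second of audio,
    measured on your own setup (CPU or GPU) for the technology in question.
    """
    return math.ceil(audio_seconds * realtime_factor * safety_margin)


# Example: one hour of audio, processing measured at a quarter of real time
print(suggested_task_grpc_timeout(3600, 0.25))  # 1350
```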

Further limits

API
The taskExpirationTime (default 10 minutes, maximum 60 minutes) is the period for which the API keeps information about a finished task. After that period, the information is automatically deleted, so that the internal system does not accumulate task information forever.
This means that clients should poll the API for task results at intervals shorter than the taskExpirationTime.
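A client-side polling loop that respects this rule could look like the sketch below. It is not the product's client: fetch_status is a hypothetical callable that you would implement against the API, and the status strings "done" and "failed" are placeholders for whatever the API actually returns.

```python
import time


def poll_task(fetch_status, polling_interval=1.0, polling_timeout=3600.0,
              task_expiration_time=600.0, sleep=time.sleep, clock=time.monotonic):
    """Poll until the task finishes, fails, or the polling timeout elapses.

    fetch_status is a caller-supplied function returning the current status.
    """
    if polling_interval >= task_expiration_time:
        # A finished task could expire between two polls and be missed.
        raise ValueError("polling interval must be shorter than taskExpirationTime")
    deadline = clock() + polling_timeout
    while True:
        status = fetch_status()
        if status in ("done", "failed"):   # placeholder status names
            return status
        if clock() >= deadline:
            raise TimeoutError("task did not finish within the polling timeout")
        sleep(polling_interval)


# Usage with a fake status source standing in for the real API call
fake_states = iter(["running", "running", "done"])
print(poll_task(lambda: next(fake_states), polling_interval=0.01))  # done
```

The sleep and clock parameters exist only so the loop can be exercised without real waiting.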

Web app GUI
The taskParallelism (default 4) is the maximum number of tasks running simultaneously.
The taskPollingInterval (default 1 second) is the interval at which the GUI app checks the task status.
The taskPollingTimeout (default 1 hour) is how long the GUI app keeps checking the task status before giving up.
NOTE: Logically, the taskPollingTimeout should not be shorter than the taskGrpcTimeout.

Limits overview

API limits

| Name | Unit | Default | Description |
|---|---|---|---|
| taskGrpcTimeout | seconds | 300 | Maximum time the API waits for any task to complete. If you process long audio files, you may need to increase this limit. |
| taskExpirationTime | seconds | 600 | Time after which finished tasks expire. The API holds information about finished tasks (both successfully finished and failed) and discards it after taskExpirationTime. A client usually polls on the task ID and must retrieve the task status before it expires. Maximum value is 3600. |
| inputStorageSize | variable | 1GiB | Size of the input storage. When an audio file is POSTed to the API, the whole file must be stored on the disk. If you process big files or multiple files in parallel, this limit probably needs to be increased. |
| internalStorageSize | variable | 2GiB | Size of the internal storage. Each audio file is converted into WAV format before processing, and the converted audio is stored on the disk. If you process big files or multiple files in parallel, this limit probably needs to be increased. The internalStorageSize must be greater than or equal to the inputStorageSize. |
| singleFileUploadTimeout | seconds | 120 | Maximum allowed time for uploading a single file to the API. If you process big files or have a slow network connection, this limit needs to be increased. |
| singleFileUploadSize | bytes | 524288000 (500 MiB) | Maximum allowed size of an audio file to upload. If you process big files, this limit probably needs to be increased. NOTE: When using the web app UI, make sure that this limit is not lower than the maxFileSize limit! |

Media Conversion limits

| Name | Unit | Default | Description |
|---|---|---|---|
| maxAudioLength | seconds | 7200 | Audio length limit. Processing of media files with a duration longer than this limit is rejected. |

UI limits

| Name | Unit | Default | Description |
|---|---|---|---|
| taskParallelism | | 4 | The UI posts a task to the API and polls for it until it is finished. This limit controls how many tasks can be processed in parallel by the web app. |
| taskPollingInterval | seconds | 1 | Duration between poll attempts. |
| taskPollingTimeout | seconds | 3600 | How long the UI polls for the task, i.e. how long the UI is willing to wait until the task is finished. |
| maxFileSize | bytes | 100000000 (100 MB) | Maximum allowed size of an audio file to upload. This must not be bigger than the singleFileUploadSize API limit. |
| maxFilesCount | | 100 | Maximum number of files to be uploaded. |
| maxVoiceRecorderDuration | seconds | 300 | Maximum duration of a recording captured by the voice recorder. |
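Several of the limits above constrain each other. The following sketch collects the cross-limit rules stated on this page into one check; the function names and the dictionary layout are hypothetical, and the set of size suffixes accepted by parse_size is an assumption (the page itself only shows values such as 1GiB and 2GiB).

```python
import re

# Assumed size suffixes; only "GiB" appears in the documented values.
_UNITS = {"": 1, "B": 1, "KB": 10**3, "MB": 10**6, "GB": 10**9,
          "KiB": 2**10, "MiB": 2**20, "GiB": 2**30}


def parse_size(value):
    """Convert values such as '1GiB' or 524288000 into bytes."""
    match = re.fullmatch(r"\s*(\d+)\s*([A-Za-z]*)\s*", str(value))
    if not match or match.group(2) not in _UNITS:
        raise ValueError(f"unrecognized size: {value!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def check_limits(limits):
    """Return a list of violated cross-limit rules (empty if consistent)."""
    problems = []
    if parse_size(limits["internalStorageSize"]) < parse_size(limits["inputStorageSize"]):
        problems.append("internalStorageSize must be >= inputStorageSize")
    if limits["maxFileSize"] > limits["singleFileUploadSize"]:
        problems.append("maxFileSize must not be bigger than singleFileUploadSize")
    if limits["taskExpirationTime"] > 3600:
        problems.append("taskExpirationTime must not exceed 3600")
    if limits["taskPollingTimeout"] < limits["taskGrpcTimeout"]:
        problems.append("taskPollingTimeout should not be shorter than taskGrpcTimeout")
    if limits["taskPollingInterval"] >= limits["taskExpirationTime"]:
        problems.append("taskPollingInterval should be shorter than taskExpirationTime")
    return problems


DEFAULTS = {
    "taskGrpcTimeout": 300, "taskExpirationTime": 600,
    "inputStorageSize": "1GiB", "internalStorageSize": "2GiB",
    "singleFileUploadSize": 524288000, "maxFileSize": 100000000,
    "taskPollingInterval": 1, "taskPollingTimeout": 3600,
}

print(check_limits(DEFAULTS))                               # []
print(check_limits(dict(DEFAULTS, taskGrpcTimeout=7200)))   # one violated rule
```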

How to change the limits

The limits are defined in a speech-platform-values.yaml configuration file placed in the /data/speech-platform/ directory.
To change a limit:

  1. Open the configuration file /data/speech-platform/speech-platform-values.yaml, e.g. using the File Browser
  2. Find a line with the limit you want to change (all limit definitions are located at the beginning of the file)
  3. Change the value and save the file

The system automatically recognizes that the file was updated and redeploys itself with the updated configuration.
To make sure that the configuration is valid and successfully applied, do the configuration file checks.

Example: Increasing the time to wait for a task to finish to 20 minutes (1200 seconds), which is useful for long-running tasks, e.g. Whisper-based transcription run on CPU

    ##############
    # API Config #
    ##############
    api:
      config:
        # When are finished tasks expired
        taskExpirationTime: 600
        # Maximum time in seconds to wait for the gRPC response
        # Task must be processed within limit
        taskGrpcTimeout: 1200
        # Size of input storage
        inputStorageSize: 1GiB
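After saving the file, it can be useful to confirm which value is actually set. The helper below is a minimal sketch using only a regular expression: read_limit is a hypothetical name, it returns the first uncommented "name: value" line it finds, and it does not understand YAML nesting, so it is only suitable for limit names that occur once in the file.

```python
import re


def read_limit(config_text, name):
    """Return the raw value of the first uncommented 'name: value' line, or None."""
    pattern = re.compile(rf"^[ \t]*{re.escape(name)}[ \t]*:[ \t]*(\S+)", re.MULTILINE)
    match = pattern.search(config_text)
    return match.group(1) if match else None


# A stand-in for the contents of /data/speech-platform/speech-platform-values.yaml
sample = """\
api:
  config:
    # taskGrpcTimeout: 300
    taskExpirationTime: 600
    taskGrpcTimeout: 1200
    inputStorageSize: 1GiB
"""

print(read_limit(sample, "taskGrpcTimeout"))   # 1200
print(read_limit(sample, "inputStorageSize"))  # 1GiB
```

To check the real file, read its text (for example with pathlib.Path(...).read_text()) and pass it in place of the sample.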

Advanced settings

Pods count limit

Currently, the platform is limited by the number of pods that can be created inside the Kubernetes cluster. The maximum number of pods is set to 300.

Pod count limits can be overridden by editing the /etc/rancher/k3s/config.yaml file. To override the maximum number of pods, the max-pods parameter needs to be added/edited.
Example: Increase the maximum number of pods to 350

    debug: true
    system-default-registry: airgapped.phonexia.com
    disable:
      - traefik
      - cloud-controller
    kubelet-arg:
      - "kube-reserved=cpu=500m,memory=1Gi,ephemeral-storage=2Gi"
      - "system-reserved=cpu=500m,memory=1Gi,ephemeral-storage=2Gi"
      - "eviction-hard=memory.available<500Mi,nodefs.available<10%"
      - "max-pods=350"

To apply the changes, the virtual machine needs to be restarted (stopped and started).

GPU sharing limit

This limit specifies the number of pods that can share a single GPU.
The value must be higher than, or equal to, the number of technology microservices using the GPU, and not lower than 2.
Running the configuration script with the automatic configuration option sets the value automatically.

| Name | Unit | Default | Description |
|---|---|---|---|
| replicas | count | - | Number of pods sharing a single GPU |
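The rule for a valid replicas value (at least the number of technology microservices using the GPU, and never lower than 2) reduces to a one-line calculation. The sketch below only restates that rule; the function names are hypothetical and it does not reproduce what the configuration script does internally.

```python
def min_gpu_replicas(gpu_microservices):
    """Smallest replicas value allowed for the given number of GPU microservices."""
    return max(2, gpu_microservices)


def is_valid_replicas(replicas, gpu_microservices):
    """True if the replicas value satisfies both documented conditions."""
    return replicas >= min_gpu_replicas(gpu_microservices)


print(min_gpu_replicas(1))      # 2
print(min_gpu_replicas(6))      # 6
print(is_valid_replicas(4, 6))  # False
```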

How to change GPU sharing limits

  1. Open the configuration file /data/speech-platform/nvidia-device-plugin-configs.yaml, e.g. using the File Browser
  2. Find a line with the number of replicas
  3. Change the value and save the file

The system automatically recognizes that the file was updated and redeploys itself with the updated configuration.
To make sure that the configuration is valid and successfully applied, do the configuration file checks.

Example: Increase the number of pods sharing the GPU to 6

    resources:
      - name: nvidia.com/gpu
        replicas: 6

Disabling GPU sharing

In some cases it might be handy to disable GPU sharing:

  1. Open the configuration file /data/speech-platform/nvidia-device-plugin-configs.yaml, e.g. using the File Browser
  2. Locate the key .data.default.sharing
  3. Delete all content under the .data.default.sharing key and save the file

The system automatically recognizes that the file was updated and redeploys itself with the updated configuration.
To make sure that the configuration is valid and successfully applied, do the configuration file checks.

Example: The updated file should look similar to this:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: nvidia-device-plugin-configs
      namespace: nvidia-device-plugin
    data:
      default: |-
        version: v1
        sharing: