Limits
The Speech Platform virtual appliance implements various internal limits to protect the internal system and to minimize the chance of issues and malfunctions. To modify the limits for your needs, you should understand the processing flow, the lifecycle of the processed media files and how the individual limits relate and affect each other.
The following schema shows the media processing flow and the parts where individual limits are applied. Further details are described below the picture.
Media processing flow
Step | Description | Note |
---|---|---|
1 | Media files and microphone recordings uploaded or recorded via the web GUI are actually stored on your computer in your web browser's IndexedDB internal storage (i.e., they are NOT uploaded into the virtual appliance!). To keep the local storage at a reasonable size and to not swamp your local disk, media files bigger than maxFileSize (default 100 MB) or attempts to upload more than maxFilesCount (default 100) are rejected. Similarly, microphone recordings with duration longer than maxVoiceRecorderDuration (default 5 minutes) are also rejected. | Since the uploaded files are to be also processed later, the audio duration limit maxAudioLength (default 2 hours) should be obeyed too, to prevent files being rejected later during processing. |
2 | When a media file is uploaded to the REST API (including when processing is started from the web GUI, where the file is retrieved from the browser storage and uploaded to the API), it is temporarily stored in the virtual appliance input storage, which has an inputStorageSize limit (default 1 GiB). If the uploaded file does not fit into the free space of the input storage, the API request fails. To protect the API and to prevent stuck API requests, media files bigger than singleFileUploadSize (default 500 MB) or API upload requests taking longer than singleFileUploadTimeout (default 2 minutes) are rejected. | The larger the files and the more files processed in parallel, the bigger the inputStorageSize required. At the same time, the size of the storages is limited by the free space on the virtual appliance data disk (/data/). |
3 | Source media files temporarily stored in the input storage are then converted into a unified audio format before processing. To prevent excessive memory usage during the conversion and to keep the converted files at a reasonable size, files with audio duration longer than the maxAudioLength limit (default 2 hours) are rejected. The resulting converted files are stored in the virtual appliance internal storage, which has an internalStorageSize limit (default 2 GiB). Source media files are automatically deleted after conversion (even if the conversion failed), to free up space in the input storage. | When processing lossy-compressed source media files (MP3, MP4, etc.), the converted files may become VASTLY bigger than the originals, possibly by an order of magnitude. Depending on the format, encoding, duration, etc. of the source files, the internalStorageSize may therefore need to be increased substantially. At the same time, the size of the storages is limited by the free space on the virtual appliance data disk (/data/). |
4 | Converted audio files are then read from the internal storage and processed by the technologies. To prevent stuck processing requests, processing taking longer than taskGrpcTimeout (default 5 minutes) is cancelled. Converted audio files are automatically deleted after processing (even if the processing failed), to free up space in the internal storage. | The processing time is affected by several factors, so you may need to adjust taskGrpcTimeout according to your particular setup and conditions. |
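
To get a feel for how much internal storage the converted files need, the size of uncompressed audio can be estimated from its duration. The following is a minimal sketch only: the sample rate, sample width and channel count used by the appliance's conversion are not stated here, so the 16 kHz, 16-bit, mono defaults below are assumptions, and the function names are illustrative.

```python
GIB = 1024 ** 3


def converted_wav_bytes(duration_s, sample_rate_hz=16_000,
                        bytes_per_sample=2, channels=1):
    """Approximate size of uncompressed PCM audio (WAV header ignored).

    The default PCM parameters are assumptions, not documented values.
    """
    return int(duration_s * sample_rate_hz * bytes_per_sample * channels)


def fits_internal_storage(durations_s, internal_storage_bytes, **pcm):
    """Check whether all converted files fit into the storage at once."""
    total = sum(converted_wav_bytes(d, **pcm) for d in durations_s)
    return total <= internal_storage_bytes


# A single file at the default maxAudioLength (2 hours = 7200 s)
print(converted_wav_bytes(7200))                      # 230400000 bytes
# Four such files converted in parallel against the default 2 GiB
print(fits_internal_storage([7200] * 4, 2 * GIB))     # True
# Ten such files would no longer fit
print(fits_internal_storage([7200] * 10, 2 * GIB))    # False
```

Under these assumptions, one 2-hour recording takes roughly 220 MiB after conversion regardless of how small the compressed source file was.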
Further limits
API
The taskExpirationTime (default 10 minutes, maximum 60 minutes) is the period for which the API keeps information about a finished task. After that period, the information is automatically deleted. This prevents the internal system from accumulating task information forever.
This means that clients should poll the API for task results at intervals shorter than the taskExpirationTime.
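
A client-side polling loop that respects this rule could look roughly like the sketch below. It is not the platform's client: fetch_status is a hypothetical callable standing in for your own HTTP request, and the status strings "running", "done" and "failed" (and None for an expired task) are assumptions made for illustration.

```python
import time


def poll_task(fetch_status, interval_s, timeout_s, sleep=time.sleep):
    """Poll until the task reaches a terminal state or timeout_s elapses.

    fetch_status: hypothetical callable returning "running", "done",
    "failed", or None when the task information has already expired.
    interval_s should be well below taskExpirationTime.
    """
    waited = 0.0
    while waited <= timeout_s:
        status = fetch_status()
        if status is None:
            raise LookupError("task info expired before the result was read")
        if status in ("done", "failed"):
            return status
        sleep(interval_s)
        waited += interval_s
    raise TimeoutError(f"task still running after {timeout_s} s")


# Demo with a fake status source and a no-op sleep
statuses = iter(["running", "running", "done"])
result = poll_task(lambda: next(statuses), interval_s=1, timeout_s=10,
                   sleep=lambda seconds: None)
print(result)  # done
```

Passing the sleep function as a parameter keeps the loop easy to test without waiting in real time.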
Web app GUI
The taskParallelism (default 4) is the maximum number of tasks running simultaneously.
The taskPollingInterval (default 1 second) is the interval at which the GUI app checks the task status.
The taskPollingTimeout (default 1 hour) is how long the GUI app keeps checking the task status before giving up. (NOTE: Logically, the taskPollingTimeout should not be shorter than the taskGrpcTimeout.)
Limits overview
API limits
Name | Unit | Default | Description |
---|---|---|---|
taskGrpcTimeout | seconds | 300 | Maximum time API waits for any task to complete. If you process long audios, you may need to increase this limit. |
taskExpirationTime | seconds | 600 | Time after which finished tasks expire. The API holds information about finished tasks (both successfully finished and failed) and discards it after taskExpirationTime. The client usually polls on the task ID and must retrieve the task status before it expires. Maximum value is 3600. |
inputStorageSize | variable | 1GiB | Size of the input storage. When an audio file is POSTed to the API, the whole file must be stored on the disk. If you process big files or multiple files in parallel, this limit probably needs to be increased. |
internalStorageSize | variable | 2GiB | Size of the internal storage. Each audio file is converted into WAV format before processing, and the converted audio is stored on the disk. If you process big files or multiple files in parallel, this limit probably needs to be increased. The internalStorageSize must be bigger than or equal to the inputStorageSize. |
singleFileUploadTimeout | seconds | 120 | Maximum allowed time for uploading a single file to the API. If you process big files or have a slow network connection, this limit needs to be increased. |
singleFileUploadSize | bytes | 524288000 (500 MB) | Maximum allowed size of an audio file to upload. If you process big files, this limit probably needs to be increased. NOTE: When using the web app UI, make sure that this limit is not lower than maxFileSize limit! |
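
Whether singleFileUploadTimeout is sufficient depends on the file size and the network speed. The arithmetic can be sketched as follows; the function names and the 1.5 safety factor are illustrative assumptions, and real throughput is usually lower than the nominal link speed.

```python
import math


def upload_seconds(file_bytes, bandwidth_mbit_s):
    """Ideal transfer time for a file over a link of the given speed."""
    return file_bytes * 8 / (bandwidth_mbit_s * 1_000_000)


def min_upload_timeout(file_bytes, bandwidth_mbit_s, safety_factor=1.5):
    """Suggested timeout in whole seconds, padded by an assumed safety factor."""
    return math.ceil(upload_seconds(file_bytes, bandwidth_mbit_s) * safety_factor)


max_upload = 524288000  # default singleFileUploadSize in bytes

# On a 100 Mbit/s link the default 120 s timeout leaves headroom
print(min_upload_timeout(max_upload, 100))  # 63
# On a 10 Mbit/s link the same file needs a much longer timeout
print(min_upload_timeout(max_upload, 10))   # 630
```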
Media Conversion limits
Name | Unit | Default | Description |
---|---|---|---|
maxAudioLength | seconds | 7200 | Audio length limit. Processing of media files with duration longer than this limit is rejected. |
UI limits
Name | Unit | Default | Description |
---|---|---|---|
taskParallelism | count | 4 | The UI posts a task to the API and polls for it until it is finished. This limit controls how many tasks can be processed in parallel by the web app. |
taskPollingInterval | seconds | 1 | Duration between poll attempts. |
taskPollingTimeout | seconds | 3600 | How long the UI keeps polling for a task, i.e. how long it is willing to wait for the task to finish. |
maxFileSize | bytes | 100000000 (100 MB) | Maximum allowed size of an audio file to upload. This must not be bigger than the singleFileUploadSize API limit. |
maxFilesCount | count | 100 | Maximum number of files to be uploaded. |
maxVoiceRecorderDuration | seconds | 300 | Maximum duration of a recording captured by the voice recorder. |
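
Several of the limits above constrain each other. The sketch below collects those stated relationships into a single check that can be run against a set of planned values before editing the configuration. It is an illustration only: the helper names are hypothetical, and the size parser handles just plain byte counts and binary suffixes such as GiB, which may not cover every spelling the appliance accepts.

```python
import re

# Binary size suffixes handled by this sketch (an assumption, see above)
_UNITS = {None: 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}


def parse_size(value):
    """Convert 524288000 or "1GiB" to a number of bytes."""
    if isinstance(value, int):
        return value
    match = re.fullmatch(r"(\d+)\s*([KMGT]iB)?", str(value).strip())
    if match is None:
        raise ValueError(f"unrecognized size: {value!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def check_limits(limits):
    """Return a list of violated relationships between the limits."""
    problems = []
    if parse_size(limits["internalStorageSize"]) < parse_size(limits["inputStorageSize"]):
        problems.append("internalStorageSize must be >= inputStorageSize")
    if limits["maxFileSize"] > limits["singleFileUploadSize"]:
        problems.append("maxFileSize must be <= singleFileUploadSize")
    if limits["taskPollingTimeout"] < limits["taskGrpcTimeout"]:
        problems.append("taskPollingTimeout should be >= taskGrpcTimeout")
    if limits["taskExpirationTime"] > 3600:
        problems.append("taskExpirationTime must be <= 3600")
    if limits["taskPollingInterval"] >= limits["taskExpirationTime"]:
        problems.append("taskPollingInterval must be < taskExpirationTime")
    return problems


# Default values taken from the tables above
DEFAULTS = {
    "taskGrpcTimeout": 300,
    "taskExpirationTime": 600,
    "inputStorageSize": "1GiB",
    "internalStorageSize": "2GiB",
    "singleFileUploadSize": 524288000,
    "maxFileSize": 100000000,
    "taskPollingInterval": 1,
    "taskPollingTimeout": 3600,
}

print(check_limits(DEFAULTS))  # []
```

For example, raising inputStorageSize to "4GiB" without also raising internalStorageSize would be reported as a violation.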
How to change the limits
The limits are defined in the speech-platform-values.yaml configuration file placed in the /data/speech-platform/ directory.
To change a limit:
- Open the configuration file /data/speech-platform/speech-platform-values.yaml, e.g. using the File Browser
- Find the line with the limit you want to change (all limit definitions are located at the beginning of the file)
- Change the value and save the file
The system automatically recognizes that the file was updated and redeploys itself with the updated configuration.
To make sure that the configuration is valid and successfully applied, do the configuration file checks.
Example: Increasing the time to wait for a task to finish to 20 minutes (1200 seconds), which is useful for long-running tasks, e.g. Whisper-based transcription run on CPU:

    ##############
    # API Config #
    ##############
    api:
      config:
        # Time in seconds after which finished tasks expire
        taskExpirationTime: 600
        # Maximum time in seconds to wait for the gRPC response
        # Task must be processed within this limit
        taskGrpcTimeout: 1200
        # Size of input storage
        inputStorageSize: 1GiB
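
If you prefer to script such a change instead of editing the file by hand, a text-based replacement keeps the comments and layout of the file intact. The following is a rough sketch using only the standard library; set_limit is a hypothetical helper, and it assumes the limit name appears exactly once in the file as an uncommented key (it refuses to change anything otherwise). Any trailing comment on the same line as the value would be overwritten.

```python
import re


def set_limit(yaml_text, key, value):
    """Return yaml_text with the value of `key` replaced by `value`.

    Assumes `key` occurs exactly once as an uncommented "key: value" line.
    """
    pattern = re.compile(rf"^([ \t]*{re.escape(key)}:[ \t]*)\S.*$", re.MULTILINE)
    matches = pattern.findall(yaml_text)
    if len(matches) != 1:
        raise ValueError(f"expected exactly one '{key}:' line, found {len(matches)}")
    return pattern.sub(lambda m: f"{m.group(1)}{value}", yaml_text, count=1)


sample = """api:
  config:
    # Maximum time in seconds to wait for the gRPC response
    taskGrpcTimeout: 300
    inputStorageSize: 1GiB
"""

updated = set_limit(sample, "taskGrpcTimeout", 1200)
print(updated)
```

The same function can be applied to the content read from /data/speech-platform/speech-platform-values.yaml and the result written back, after which the configuration file checks still apply.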
Advanced settings
Pods count limit
Currently, the platform is limited by the number of pods that can be created inside the Kubernetes cluster. The maximum number of pods is set to 300.
Pod count limits can be overridden by editing the /etc/rancher/k3s/config.yaml file. To override the maximum number of pods, the max-pods parameter needs to be added or edited.
Example: Increase the maximum number of pods to 350:

    debug: true
    system-default-registry: airgapped.phonexia.com
    disable:
      - traefik
      - cloud-controller
    kubelet-arg:
      - "kube-reserved=cpu=500m,memory=1Gi,ephemeral-storage=2Gi"
      - "system-reserved=cpu=500m,memory=1Gi,ephemeral-storage=2Gi"
      - "eviction-hard=memory.available<500Mi,nodefs.available<10%"
      - "max-pods=350"

To apply the changes, the virtual machine needs to be restarted (stopped and started again).
GPU sharing limit
This limit specifies the number of pods which can share a single GPU.
The value must be higher than, or equal to, the number of technology
microservices using the GPU, and not lower than 2.
Running the configuration script with the automatic configuration option sets
the value automatically.
Name | Unit | Default | Description |
---|---|---|---|
replicas | count | - | Number of pods sharing a single GPU |
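
The rule for the lowest acceptable replicas value can be written down as a one-line calculation. This is only an illustration of the rule stated above; the function name is hypothetical.

```python
def required_gpu_replicas(gpu_microservices):
    """Smallest replicas value allowed for the given number of
    technology microservices using the GPU (never lower than 2)."""
    if gpu_microservices < 0:
        raise ValueError("number of microservices cannot be negative")
    return max(2, gpu_microservices)


print(required_gpu_replicas(1))  # 2
print(required_gpu_replicas(6))  # 6
```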
How to change GPU sharing limits
- Open the configuration file /data/speech-platform/nvidia-device-plugin-configs.yaml, e.g. using the File Browser
- Find the line with the number of replicas
- Change the value and save the file
The system automatically recognizes that the file was updated and redeploys itself with the updated configuration.
To make sure that the configuration is valid and successfully applied, do the configuration file checks.
Example: Increase the number of pods sharing the GPU to 6:

    resources:
      - name: nvidia.com/gpu
        replicas: 6
Disabling GPU sharing
In some cases it might be handy to disable GPU sharing:
- Open the configuration file /data/speech-platform/nvidia-device-plugin-configs.yaml, e.g. using the File Browser
- Locate the key .data.default.sharing
- Delete all content under the .data.default.sharing key and save the file
The system automatically recognizes that the file was updated and redeploys itself with the updated configuration.
To make sure that the configuration is valid and successfully applied, do the configuration file checks.
Example: The updated file should look similar to this:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: nvidia-device-plugin-configs
      namespace: nvidia-device-plugin
    data:
      default: |-
        version: v1
        sharing: