# Emotion recognition helm chart
Phonexia emotion recognition microservice
## Maintainers
| Name | Email | Url |
|---|---|---|
| Phonexia | support@phonexia.com | https://www.phonexia.com |
## Requirements

Helm: `>= 3.2.0`
## Values
| Key | Type | Default | Description |
|---|---|---|---|
| affinity | object | {} | Affinity for pod assignment (node/pod affinity rules) |
| annotations | object | {} | Annotations of deployment (custom metadata for the deployment) |
| config.device | string | "cpu" | Compute device used for inference. Possible values: cpu, cuda. If you use cuda you have to use also image tag with gpu support |
| config.deviceIndex | string | nil | Device identifier. |
| config.instancesPerDevice | int | 1 | Number of instances per device (both CPU and GPU processing). Microservice can process requests concurrently if value is >1. |
| config.keepAliveTime | int | 60 | Time between 2 consecutive keep-alive messages, that are sent if there is no activity from the client. If set to 0, the default gRPC configuration (2hr) will be set (note, that this may get the microservice into unresponsive state). |
| config.keepAliveTimeout | int | 20 | Time to wait for keep alive acknowledgement until the connection is dropped by the server. |
| config.license.key | string | nil | License key name when using secret |
| config.license.secret | string | nil | Secret name containing the license |
| config.license.useSecret | bool | false | Get license from secret object (true) or use direct value (false) |
| config.license.value | string | "invalidLicenseKey" | Direct license key value (used when useSecret is false) |
| config.listeningAddress | string | "" | Address on which the server will be listening. Address '[::]' also accepts IPv4 connections. |
| config.logLevel | string | "" | Logging level. Possible values: error, warning, info, debug, trace. |
| config.model.file | string | "" | Name of a model file inside the volume (e.g., "generic-1.3.0.model") |
| config.model.subPath | string | "" | Subpath in volume where model is located |
| config.model.volume | object | {} | Volume configuration with Phonexia model (hostPath, PVC, etc.) |
| config.port | int | 8080 | Port where the service will listen (must match service.port) |
| config.threadsPerInstance | int | 1 | Number of threads per instance (applies to CPU processing only). Use N CPU threads in the microservice for each request. Number of threads is automatically detected if set to 0. |
| extraEnvVars | list | [] | Extra environment variables for image container |
| fullnameOverride | string | "" | String to fully override emotion-recognition.fullname template |
| global.image.registry | string | "" | Global image registry (overrides local image.registry and global.imageRegistry) For backward compatibility, if both global.imageRegistry and image.registry are set, image.registry takes precedence. |
| global.imagePullSecrets | list | [] | Global image pull secrets (overrides local imagePullSecrets) |
| global.imageRegistry | string | "" | Global image registry (overrides local image.registry) |
| image.pullPolicy | string | "IfNotPresent" | Image pull policy (Always, IfNotPresent, Never) |
| image.registry | string | "registry.cloud.phonexia.com" | Image registry URL |
| image.repository | string | "phonexia/dev/technologies/services-monorepo/emotion-recognition" | Image repository path |
| image.tag | string | "" | Image tag (defaults to appVersion from Chart.yaml) |
| imagePullSecrets | list | [] | Specify docker-registry secret names as an array |
| ingress.annotations | object | {} | Ingress annotations (e.g., nginx ingress class, TLS settings) |
| ingress.className | string | "" | Ingress class name (e.g., "nginx") |
| ingress.enabled | bool | false | Enable ingress resource creation |
| ingress.hosts | list | [{"host":"emotion-recognition.example.com","paths":[{"path":"/","pathType":"ImplementationSpecific"}]}] | Ingress host configuration |
| ingress.tls | list | [] | TLS configuration for ingress |
| initContainers | list | [] | Init containers (evaluated as template, can be used to fetch models) |
| livenessProbe | object | {"failureThreshold":3,"initialDelaySeconds":0,"periodSeconds":10,"successThreshold":1,"timeoutSeconds":1} | Liveness probe settings (checks if container is alive) |
| nameOverride | string | "" | String to partially override emotion-recognition.fullname template |
| nodeSelector | object | {} | Node labels for pod assignment (node selector) |
| onDemand.cooldownPeriod | int | 300 | Cooldown period in seconds after scaling down |
| onDemand.enabled | bool | false | Enable on-demand scaling with KEDA |
| onDemand.idleReplicaCount | int | 0 | Number of replicas when idle (usually 0 for cost savings) |
| onDemand.maxReplicaCount | int | 1 | Maximum number of replicas to scale up to |
| onDemand.minReplicaCount | int | 1 | Minimum number of replicas to maintain |
| onDemand.pollingInterval | int | 30 | How often KEDA checks metrics (seconds) |
| onDemand.trigger.activationThreshold | int | 5 | Threshold to activate scaling (minimum metric value to start scaling) |
| onDemand.trigger.query | string | "sum(increase(nginx_ingress_controller_requests{ exported_namespace=\"{{ .Release.Namespace }}\", exported_service=\"{{ include \"emotion-recognition.fullname\" . }}\", method=\"POST\"}[5m]))" | Prometheus query to determine scaling metrics |
| onDemand.trigger.serverAddress | string | "http://kube-prometheus-stack-prometheus.monitoring:9090/prometheus" | Prometheus server address for metrics collection |
| onDemand.trigger.threshold | int | 100 | Threshold value for scaling decisions |
| podAnnotations | object | {} | Annotations for pods (custom metadata for pods) |
| podSecurityContext | object | {} | Security context for pods (fsGroup, etc.) |
| readinessProbe | object | {"failureThreshold":3,"initialDelaySeconds":0,"periodSeconds":10,"successThreshold":1,"timeoutSeconds":1} | Readiness probe settings (checks if container is ready to serve traffic) |
| replicaCount | int | 1 | Number of replicas to deploy |
| resources | object | {} | The resources limits/requests for the emotion-recognition container |
| runtimeClassName | string | "" | Specify runtime class (e.g., for GPU nodes or specific container runtimes) |
| securityContext | object | {} | Security context for emotion-recognition container |
| service.clusterIP | string | "" | Service Cluster IP (use None for headless service) |
| service.port | int | 8080 | Service port (must match config.port) |
| service.type | string | "ClusterIP" | Service type (ClusterIP, NodePort, LoadBalancer) |
| serviceAccount.annotations | object | {} | Annotations to add to the service account |
| serviceAccount.create | bool | true | Specifies whether a service account should be created |
| serviceAccount.name | string | "" | The name of the service account to use |
| startupProbe | object | {"failureThreshold":3,"initialDelaySeconds":0,"periodSeconds":10,"successThreshold":1,"timeoutSeconds":1} | Startup probe settings (checks if container has started successfully) |
| tolerations | list | [] | Tolerations for pod assignment (allows pods on tainted nodes) |
| updateStrategy | object | {"type":"RollingUpdate"} | Deployment update strategy (RollingUpdate, Recreate) |
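As a worked example of combining these values: running inference on a GPU requires `config.device: cuda` together with an image tag that has GPU support, and typically a GPU resource request. The following values sketch illustrates this; the exact tag, runtime class name, and resource key are assumptions that depend on your registry and cluster setup:

```yaml
# Sketch of a GPU-enabled values override (the tag, runtimeClassName, and the
# nvidia.com/gpu resource key are placeholders for your environment)
config:
  device: "cuda"
  # Optionally pin a specific GPU by its identifier
  deviceIndex: "0"

image:
  # config.device: cuda requires an image tag with GPU support
  tag: "<version>-gpu"

# Runtime class used for GPU nodes, if your cluster defines one
runtimeClassName: "nvidia"

resources:
  limits:
    nvidia.com/gpu: 1
```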
## Installation
To install the chart successfully, you first have to obtain a license and a model. The service is unable to start without them. Feel free to contact Phonexia support to obtain a model and license for evaluation purposes.
### Model
There are two ways to pass a model to the pods:

- Pass the model via initContainer
- Pass the model via volume
#### Pass the model via initContainer
With this approach, no persistent volume is needed; an initContainer is added to the pod instead. It downloads the model from a specified location into an ephemeral volume shared with the main container. This happens each time the pod is re-deployed.

The following example shows how to do it in EKS. In the values file it looks like this:
```yaml
# Set config.model.volume to emptyDir
config:
  model:
    volume:
      emptyDir: {}
    file: "xl-5.0.0.model"

initContainers:
  - name: init-copy-model
    image: alpine
    command:
      - sh
      - -c
      - |
        set -e
        # Install aws-cli package
        apk add --no-cache aws-cli
        # Create directory for models
        mkdir -p /models
        # Download model from s3 and store it to volume
        aws s3 cp s3://some-bucket/some-path-to-model/xl-5.0.0.model ${PHX_MODEL_PATH}
    env:
      # PHX_MODEL_PATH variable must be the same as in the main container
      - name: "PHX_MODEL_PATH"
        value: "/models/{{ .Values.config.model.file }}"
      # Set AWS_* variables to make the aws cli work
      - name: "AWS_DEFAULT_REGION"
        value: "us-east-1"
      - name: "AWS_ACCESS_KEY_ID"
        value: "AKAI...CN"
      - name: "AWS_SECRET_ACCESS_KEY"
        value: "0lW...Yw"
    # Mount the empty volume to the initContainer
    volumeMounts:
      - name: '{{ include "emotion-recognition.fullname" . }}-models-volume'
        mountPath: /models
```
#### Pass the model via volume
With this approach you need to create a persistent volume, copy the model there, and mount it to the pod.

The following example shows how to do it in EKS with EBS-based dynamic provisioning.
- Create a persistentVolumeClaim:

```yaml
# filename: emotion-recognition.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: emotion-recognition
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: ebs-sc
```
and apply it:

```bash
kubectl apply -f emotion-recognition.yaml
```
- Create a job which downloads the model to the persistent volume:

```yaml
# filename: job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: emotion-recognition-download-model
spec:
  template:
    spec:
      containers:
        - name: download-model
          image: alpine
          command:
            - sh
            - -c
            - |
              set -e
              # Install aws-cli package
              apk add --no-cache aws-cli
              # Create directory for models
              mkdir -p /models
              # Download model from s3 and store it to volume
              aws s3 cp s3://some-bucket/some-path-to-model/xl-5.0.0.model ${PHX_MODEL_PATH}
          env:
            # PHX_MODEL_PATH must match .Values.config.model.file in the values file
            - name: "PHX_MODEL_PATH"
              value: "/models/xl-5.0.0.model"
            # Set AWS_* variables to make the aws cli work
            - name: "AWS_DEFAULT_REGION"
              value: "us-east-1"
            - name: "AWS_ACCESS_KEY_ID"
              value: "AKAI...CN"
            - name: "AWS_SECRET_ACCESS_KEY"
              value: "0lW...Yw"
          volumeMounts:
            - name: persistent-storage
              mountPath: /models
      volumes:
        - name: persistent-storage
          persistentVolumeClaim:
            claimName: emotion-recognition
      restartPolicy: Never
  backoffLimit: 3
```
Apply it and wait until the job is finished:

```bash
kubectl apply -f job.yaml
```
- Configure the values file to use the existing PVC:

```yaml
config:
  model:
    # Volume with Phonexia model
    volume:
      persistentVolumeClaim:
        claimName: emotion-recognition
    # Name of a model file inside the volume, for example "xl-5.0.0.model"
    file: "xl-5.0.0.model"
```
### License
There are two ways to pass the license key to the chart:

- Pass the license key directly into the values file
- Pass the license key via a kubernetes secret
#### Pass the license key directly into the values file
Use `config.license.value` to set the license key in the values file:

```yaml
config:
  license:
    useSecret: false
    value: "<license_key>"
```

Replace `<license_key>` with the license key, which is a long string, something like `eyJ...ifQ==`.
#### Pass the license key via kubernetes secret
First, create a kubernetes secret:

```bash
kubectl --namespace <my-namespace> create secret generic <my-secret> --from-literal=license=<license_key>
```

where `<my-namespace>` is the namespace where you plan to install the chart, `<my-secret>` is the name of the secret to be created, and `<license_key>` is the actual license key.

In the end it should look like this:

```bash
kubectl --namespace my-namespace create secret generic my-secret --from-literal=license=eyJ...ifQ==
```
Reference the secret in the values file:

```yaml
config:
  license:
    useSecret: true
    secret: "my-secret"
    key: "license"
```
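Putting the two pieces together, a minimal values file that references the PVC and secret created in the examples above could look like this (the claim name, model file, and secret name are the example values used earlier, not required names):

```yaml
# values.yaml -- minimal configuration combining model and license
config:
  model:
    # PVC created in the "Pass the model via volume" example
    volume:
      persistentVolumeClaim:
        claimName: emotion-recognition
    file: "xl-5.0.0.model"
  license:
    useSecret: true
    secret: "my-secret"
    key: "license"
```

Pass this file to helm with `--values values.yaml` when installing the chart.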
### Installing the Chart
Once you have configured the model and license, you can proceed with the installation itself.

Run the following command to install the chart with the release name `my-release`. Use the `--version` parameter to install a specific version. Available versions can be found on docker hub.

```bash
helm install my-release oci://registry-1.docker.io/phonexia/emotion-recognition --version 2.0.0-helm
```

This command deploys emotion-recognition on the Kubernetes cluster in the default configuration.
## Exposing the service
To expose the service outside of the kubernetes cluster, follow Using a Service to Expose Your App.
### Ingress
The emotion recognition service uses the gRPC protocol, which can be exposed by some ingress controllers; the nginx-ingress controller, for example, supports it. To expose the emotion-recognition service via ingress, use the following configuration:
```yaml
ingress:
  # Deploy ingress object
  enabled: true
  # Ingress class name
  className: "nginx"
  annotations:
    # Force redirect to SSL
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    # Tell nginx that the backend service uses gRPC
    nginx.ingress.kubernetes.io/backend-protocol: "GRPC"
  hosts:
    # Hostnames
    - host: emotion-recognition.example.com
      paths:
        - path: /
          pathType: ImplementationSpecific
  # Use tls
  tls:
    # Secret containing TLS certificate
    - secretName: emotion-recognition-tls
      # TLS hostnames
      hosts:
        - emotion-recognition.example.com
```
Use grpcurl to check that everything works as expected. The output of the following command

```bash
grpcurl --insecure emotion-recognition.example.com:443 grpc.health.v1.Health/Check
```

should be

```json
{
  "status": "SERVING"
}
```
## Uninstalling the Chart
To uninstall/delete the `my-release` release:

```bash
helm delete my-release
```
The command removes all the Kubernetes components associated with the chart and deletes the release.