Helm Installation
The current Spring Cloud Data Flow chart is based on Helm 2. The Helm project will be ending support for Helm 2 in November of 2020. At that time the Spring Cloud Data Flow chart will be based on Helm 3, dropping support for Helm 2.
Migration steps from Helm 2 to Helm 3 are required. In preparation for the migration, it is advised to read the Helm v2 to v3 Migration Guide for more information. Additionally, some helpful tips on data migration and upgrades can be found in the post migration issues article.
Spring Cloud Data Flow offers a Helm Chart for deploying the Spring Cloud Data Flow server and its required services to a Kubernetes Cluster.
The following sections cover how to initialize Helm and install Spring Cloud Data Flow on a Kubernetes cluster.
If using Minikube, see Setting Minikube Resources for details on CPU and RAM resource requirements.
Installing Helm
The Spring Cloud Data Flow Helm chart is currently tested against Helm 2.
Helm consists of two components: the client (Helm) and the server (Tiller).
The Helm client runs on your local machine and can be installed by following the instructions found here.
If Tiller has not been installed on your cluster, run the following commands to create a service account and initialize Helm:
kubectl create serviceaccount tiller -n kube-system
kubectl create clusterrolebinding tiller --clusterrole cluster-admin --serviceaccount kube-system:tiller
helm init --wait --service-account tiller
Please see the Helm documentation for additional Helm security configuration.
Next, update the Helm chart repositories:
helm repo update
To verify that the Tiller pod is running, run the following command:
kubectl get pod --namespace kube-system
You should see the Tiller pod running.
Installing the Spring Cloud Data Flow Server and Required Services
Spring Cloud Data Flow Chart
Spring Cloud Data Flow is a toolkit for microservices-based streaming and batch data processing pipelines on Cloud Foundry and Kubernetes.
Data processing pipelines consist of Spring Boot apps, built using the Spring Cloud Stream or Spring Cloud Task microservice frameworks. This makes Spring Cloud Data Flow suitable for a range of data processing use cases, from import/export to event streaming and predictive analytics.
This Helm chart is deprecated
Given the stable deprecation timeline, the Bitnami maintained Spring Cloud Data Flow Helm chart is now located at bitnami/charts.
The Bitnami repository is already included in the Hubs and we will continue providing the same cadence of updates, support, etc that we've been keeping here these years. Installation instructions are very similar, just adding the bitnami repo and using it during the installation (bitnami/<chart> instead of stable/<chart>)
$ helm repo add bitnami https://charts.bitnami.com/bitnami
$ helm install my-release bitnami/<chart> # Helm 3
$ helm install --name my-release bitnami/<chart> # Helm 2
To update an existing stable deployment with a chart hosted in the bitnami repository, you can execute:
$ helm repo add bitnami https://charts.bitnami.com/bitnami
$ helm upgrade my-release bitnami/<chart>
Issues and PRs related to the chart itself will be redirected to the bitnami/charts GitHub repository. In the same way, we'll be happy to answer questions related to this migration process in this issue created as a common place for discussion.
Chart Details
This chart will provision a fully functional and fully featured Spring Cloud Data Flow installation that can deploy and manage data processing pipelines in the cluster that it is deployed to.
Either the default MySQL deployment or an external database can be used as the data store for Spring Cloud Data Flow state and either RabbitMQ or Kafka can be used as the messaging layer for streaming apps to communicate with one another.
For more information on Spring Cloud Data Flow and its capabilities, see its documentation.
Prerequisites
This chart assumes that serviceAccount credentials are available so that the deployed Data Flow server can access the API server (this works on GKE and Minikube by default). See Configure Service Accounts for Pods.
Installing the Chart
To install the chart with the release name my-release:
$ helm install --name my-release stable/spring-cloud-data-flow
If you are using a cluster that does not have a load balancer (like Minikube), you can install using a NodePort:
$ helm install --name my-release --set server.service.type=NodePort stable/spring-cloud-data-flow
To restrict the load balancer to an IP address range:
$ helm install --name my-release --set server.service.loadBalancerSourceRanges='[10.0.0.0/8]' stable/spring-cloud-data-flow
Data Store
By default, MySQL is deployed with this chart. However, if you wish to use an external database, pass the following --set flag to the helm command to disable the MySQL deployment:
--set mysql.enabled=false
In addition, you are required to set all fields listed in External Database Configuration.
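As a rough sketch of what a complete command might look like, the following script assembles the --set flags for an external database and prints the resulting helm command instead of running it. The parameter names come from the External Database Configuration table; every value (driver class, scheme, host, port, credentials) is a placeholder assumption, not a chart default, and external_db_flags is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: build a helm install command that disables the bundled MySQL
# and points Data Flow at an external database.
# All values below are hypothetical placeholders; substitute your own.
DB_DRIVER="org.mariadb.jdbc.Driver"   # assumed JDBC driver class
DB_SCHEME="mysql"                     # assumed JDBC scheme
DB_HOST="db.example.com"              # hypothetical host name
DB_PORT="3306"
DB_USER="scdf"
DB_PASSWORD="changeme"

# Join all flags into one comma-separated value for a single --set.
external_db_flags() {
  printf '%s' "mysql.enabled=false"
  printf ',%s' \
    "database.driver=${DB_DRIVER}" \
    "database.scheme=${DB_SCHEME}" \
    "database.host=${DB_HOST}" \
    "database.port=${DB_PORT}" \
    "database.user=${DB_USER}" \
    "database.password=${DB_PASSWORD}"
}

# Print the command for review rather than executing it.
echo "helm install --name my-release --set $(external_db_flags) stable/spring-cloud-data-flow"
```

Printing the command first makes it easy to check that no database field was left unset before anything is installed.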
Messaging Layer
There are three messaging layers available in this chart:
- RabbitMQ (default)
- RabbitMQ HA
- Kafka
To change the messaging layer to a highly available (HA) version of RabbitMQ, pass the following --set flags to the helm command:
--set rabbitmq-ha.enabled=true,rabbitmq.enabled=false
Alternatively, to change the messaging layer to Kafka, pass the following --set flags to the helm command:
--set kafka.enabled=true,rabbitmq.enabled=false
Only one messaging layer can be used at a given time. If RabbitMQ and Kafka are enabled, both charts will be installed with RabbitMQ being used in the deployment.
Note that this chart pulls in many different Docker images, so it can take a while to fully install.
Feature Toggles
If you only need to deploy tasks and schedules, streams can be disabled:
--set features.streaming.enabled=false --set rabbitmq.enabled=false
If you only need to deploy streams, tasks and schedules can be disabled:
--set features.batch.enabled=false
NOTE: Both features.streaming.enabled and features.batch.enabled should not be set to false at the same time.
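The constraint in the note can be checked before calling helm. The following is a minimal sketch, assuming a hypothetical helper named check_features that is not part of Helm or the chart; it only prints the flags and rejects the combination that disables both features. When streams are disabled you would typically also add --set rabbitmq.enabled=false, as shown earlier.

```shell
#!/bin/sh
# Sketch: refuse a flag combination that disables both streams and tasks.
# check_features is a hypothetical helper, not part of Helm or the chart.
check_features() {
  streaming="$1"   # value intended for features.streaming.enabled
  batch="$2"       # value intended for features.batch.enabled
  if [ "$streaming" = "false" ] && [ "$batch" = "false" ]; then
    echo "error: streaming and batch must not both be disabled" >&2
    return 1
  fi
  echo "--set features.streaming.enabled=${streaming} --set features.batch.enabled=${batch}"
}

# Streams only: tasks and schedules are disabled.
check_features true false
```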
Streaming and batch applications can be monitored through Prometheus and Grafana. To deploy these components and enable monitoring, set the following:
--set features.monitoring.enabled=true
When using Minikube, you can obtain the Grafana URL with, for example:
minikube service my-release-grafana --url
On a platform that provides a LoadBalancer, such as GKE, run the following command until the EXTERNAL-IP field is populated with the assigned load balancer IP address:
kubectl get svc my-release-grafana
See the Grafana table below for default credentials and override parameters.
Using an Ingress
If you would like to use an Ingress instead of having the services use the LoadBalancer type there are a few things to consider.
First you need to have an Ingress Controller installed in your cluster. If you don't already have one installed, you can use the following commands to install an NGINX Ingress Controller:
kubectl create namespace nginx-ingress
helm install --name nginx-ingress --namespace nginx-ingress stable/nginx-ingress
You can look up the IP address used by the NGINX Ingress Controller with:
ingress=$(kubectl get svc nginx-ingress-controller -n nginx-ingress -ojsonpath='{.status.loadBalancer.ingress[0].ip}')
This is useful if you would like to use xip.io instead of your own DNS resolution. The following options assume that you will use xip.io, but you can replace the host values below with your own DNS hosts if you prefer.
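To make the host naming concrete, here is a small sketch of how the xip.io host names are derived from the controller's IP address. The address 192.0.2.10 is a documentation-only placeholder, and the variable names server_host and grafana_host are assumptions for illustration; in a real cluster the ingress value would come from the kubectl lookup above.

```shell
#!/bin/sh
# Sketch: derive xip.io host names from the Ingress controller's IP.
# 192.0.2.10 is a placeholder, not an address from a real cluster.
ingress="192.0.2.10"

# xip.io resolves <name>.<ip>.xip.io to <ip>, so no DNS setup is needed.
server_host="scdf.${ingress}.xip.io"
grafana_host="grafana.${ingress}.xip.io"

echo "$server_host"
echo "$grafana_host"
```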
To enable the creation of an Ingress resource and configure the services to use ClusterIP type use the following set options in your helm install command:
--set server.service.type=ClusterIP \
--set ingress.enabled=true \
--set ingress.protocol=http \
--set ingress.server.host=scdf.${ingress}.xip.io \
If you want to use an Ingress with the monitoring feature enabled, then use these options instead:
--set features.monitoring.enabled=true \
--set server.service.type=ClusterIP \
--set grafana.service.type=ClusterIP \
--set prometheus.proxy.service.type=ClusterIP \
--set ingress.enabled=true \
--set ingress.protocol=http \
--set ingress.server.host=scdf.${ingress}.xip.io \
--set ingress.grafana.host=grafana.${ingress}.xip.io \
Configuration
The following tables list the configurable parameters and their default values.
RBAC Configuration
| Parameter | Description | Default |
|---|---|---|
| rbac.create | Create RBAC configurations | true |
ServiceAccount Configuration
| Parameter | Description | Default |
|---|---|---|
| serviceAccount.create | Create ServiceAccount | true |
| serviceAccount.name | ServiceAccount name | (generated if not specified) |
Data Flow Server Configuration
| Parameter | Description | Default |
|---|---|---|
| server.version | The version/tag of the Data Flow server | 2.6.0 |
| server.imagePullPolicy | The imagePullPolicy of the Data Flow server | IfNotPresent |
| server.service.type | The service type for the Data Flow server | LoadBalancer |
| server.service.annotations | Extra annotations for service resource | {} |
| server.service.externalPort | The external port for the Data Flow server | 80 |
| server.service.labels | Extra labels for the service resource | {} |
| server.service.loadBalancerSourceRanges | A list of IP address ranges to allow through the load balancer | no restriction |
| server.platformName | The name of the configured platform account | default |
| server.configMap | Custom ConfigMap name for Data Flow server configuration | |
| server.trustCerts | Trust self signed certs | false |
| server.extraEnv | Extra environment variables to add to the server container | {} |
| server.containerConfiguration.container.registry-configurations. | The registry host to use for the profile represented by | |
| server.containerConfiguration.container.registry-configurations. | The registry authorization type to use for the profile represented by | |
Skipper Server Configuration
| Parameter | Description | Default |
|---|---|---|
| skipper.version | The version/tag of the Skipper server | 2.5.0 |
| skipper.imagePullPolicy | The imagePullPolicy of the Skipper server | IfNotPresent |
| skipper.platformName | The name of the configured platform account | default |
| skipper.service.type | The service type for the Skipper server | ClusterIP |
| skipper.service.annotations | Extra annotations for service resources | {} |
| skipper.service.labels | Extra labels for the service resource | {} |
| skipper.configMap | Custom ConfigMap name for Skipper server configuration | |
| skipper.trustCerts | Trust self signed certs | false |
| skipper.extraEnv | Extra environment variables to add to the skipper container | {} |
Spring Cloud Deployer for Kubernetes Configuration
| Parameter | Description | Default |
|---|---|---|
| deployer.resourceLimits.cpu | Deployer resource limit for cpu | 500m |
| deployer.resourceLimits.memory | Deployer resource limit for memory | 1024Mi |
| deployer.readinessProbe.initialDelaySeconds | Deployer readiness probe initial delay | 120 |
| deployer.livenessProbe.initialDelaySeconds | Deployer liveness probe initial delay | 90 |
RabbitMQ Configuration
| Parameter | Description | Default |
|---|---|---|
| rabbitmq.enabled | Enable RabbitMQ as the middleware to use | true |
| rabbitmq.rabbitmq.username | RabbitMQ user name | user |
| rabbitmq.rabbitmq.password | RabbitMQ password to encode into the secret | changeme |
RabbitMQ HA Configuration
| Parameter | Description | Default |
|---|---|---|
| rabbitmq-ha.enabled | Enable RabbitMQ HA as the middleware to use | false |
| rabbitmq-ha.rabbitmqUsername | RabbitMQ user name | user |
Kafka Configuration
| Parameter | Description | Default |
|---|---|---|
| kafka.enabled | Enable Kafka as the middleware to use | false |
| kafka.replicas | The number of Kafka replicas to use | 1 |
| kafka.configurationOverrides | Kafka deployment configuration overrides | replication.factor=1, metrics.enabled=false |
| kafka.zookeeper.replicaCount | The number of ZooKeeper replicas to use | 1 |
MySQL Configuration
| Parameter | Description | Default |
|---|---|---|
| mysql.enabled | Enable deployment of MySQL | true |
| mysql.mysqlDatabase | MySQL database name | dataflow |
External Database Configuration
| Parameter | Description | Default |
|---|---|---|
| database.driver | Database driver | nil |
| database.scheme | Database scheme | nil |
| database.host | Database host | nil |
| database.port | Database port | nil |
| database.user | Database user | scdf |
| database.password | Database password | nil |
| database.dataflow | Database name for SCDF server | dataflow |
| database.skipper | Database name for SCDF skipper | skipper |
Feature Toggles
| Parameter | Description | Default |
|---|---|---|
| features.streaming.enabled | Enables or disables streams | true |
| features.batch.enabled | Enables or disables tasks and schedules | true |
| features.monitoring.enabled | Enables or disables monitoring | false |
Ingress
| Parameter | Description | Default |
|---|---|---|
| ingress.enabled | Enables or disables ingress support | true |
| ingress.protocol | Sets the protocol used by ingress server | https |
| ingress.server.host | Sets the host used for server | data-flow.local |
| ingress.grafana.host | Sets the host used for grafana | grafana.local |
Grafana
| Parameter | Description | Default |
|---|---|---|
| grafana.service.type | Service type to use | LoadBalancer |
| grafana.admin.existingSecret | Existing Secret to use for login credentials | scdf-grafana-secret |
| grafana.admin.userKey | Secret userKey field | admin-user |
| grafana.admin.passwordKey | Secret passwordKey field | admin-password |
| grafana.admin.defaultUsername | The default base64 encoded login username used in the secret | admin |
| grafana.admin.defaultPassword | The default base64 encoded login password used in the secret | password |
| grafana.extraConfigmapMounts | ConfigMap mount for datasources | scdf-grafana-ds-cm |
| grafana.dashboardProviders | Dashboard provider for imported dashboards | default |
| grafana.dashboards | Dashboards to auto import | SCDF Apps, Streams & Tasks |
Prometheus
| Parameter | Description | Default |
|---|---|---|
| prometheus.server.global.scrape_interval | Scrape interval | 10s |
| prometheus.server.global.scrape_timeout | Scrape timeout | 9s |
| prometheus.server.global.evaluation_interval | Evaluation interval | 10s |
| prometheus.extraScrapeConfigs | Additional scrape configs for proxied applications | proxied-applications & proxies jobs |
| prometheus.podSecurityPolicy | Enable or disable PodSecurityContext | true |
| prometheus.alertmanager | Enable or disable alert manager | false |
| prometheus.kubeStateMetrics | Enable or disable kube state metrics | false |
| prometheus.nodeExporter | Enable or disable node exporter | false |
| prometheus.pushgateway | Enable or disable push gateway | false |
| prometheus.proxy.service.type | Service type to use | LoadBalancer |
Expected output
After issuing the helm install command, you should see output similar to the following:
NAME: my-release
LAST DEPLOYED: Sat Mar 10 11:33:29 2018
NAMESPACE: default
STATUS: DEPLOYED
RESOURCES:
==> v1/Secret
NAME TYPE DATA AGE
my-release-mysql Opaque 2 1s
my-release-data-flow Opaque 2 1s
my-release-rabbitmq Opaque 2 1s
==> v1/ConfigMap
NAME DATA AGE
my-release-data-flow-server 1 1s
my-release-data-flow-skipper 1 1s
==> v1/PersistentVolumeClaim
NAME STATUS VOLUME CAPACITY ACCESSMODES STORAGECLASS AGE
my-release-rabbitmq Bound pvc-e9ed7f55-2499-11e8-886f-08002799df04 8Gi RWO standard 1s
my-release-mysql Pending standard 1s
==> v1/ServiceAccount
NAME SECRETS AGE
my-release-data-flow 1 1s
==> v1/Service
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
my-release-mysql 10.110.98.253 <none> 3306/TCP 1s
my-release-data-flow-server 10.105.216.155 <pending> 80:32626/TCP 1s
my-release-rabbitmq 10.106.76.215 <none> 4369/TCP,5672/TCP,25672/TCP,15672/TCP 1s
my-release-data-flow-skipper 10.100.28.64 <none> 80/TCP 1s
==> v1beta1/Deployment
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
my-release-mysql 1 1 1 0 1s
my-release-rabbitmq 1 1 1 0 1s
my-release-data-flow-skipper 1 1 1 0 1s
my-release-data-flow-server 1 1 1 0 1s
Get the Spring Cloud Data Flow's application URL by running these commands:
export SERVICE_IP=$(kubectl get svc --namespace default my-release-data-flow-server -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo http://$SERVICE_IP:80
It may take a few minutes for the LoadBalancer IP to be available.
You can watch the status of the server by running kubectl get svc -w my-release-data-flow-server
If you are using Minikube, you can use the following command to get the URL for the server:
minikube service --url my-release-data-flow-server
You have just created a new release in the default namespace of your Kubernetes cluster.
It takes a couple of minutes for the application and its required services to start.
You can check on the status by issuing a kubectl get pod -w command.
You need to wait for the READY column to show 1/1 for all pods.
When all pods are ready, you can access the Spring Cloud Data Flow dashboard by accessing http://<SERVICE_ADDRESS>/dashboard where <SERVICE_ADDRESS> is the address returned by either the kubectl or minikube commands above.
To see what Helm releases of Spring Cloud Data Flow you have running, you can use the helm list command.
When it is time to delete the previously installed SCDF release, run helm delete my-release.
This command removes any resources created for the release but keeps release information so that you can roll back any changes by using a helm rollback my-release 1 command.
To completely delete the release and purge any release metadata, you can use helm delete my-release --purge.
Secret management
There is an issue with generated secrets that are used for the required services getting rotated on chart upgrades. To avoid this issue, set the password for these services when installing the chart. You can use the following command to do so:
helm install --name my-release \
--set rabbitmq.rabbitmqPassword=rabbitpwd \
--set mysql.mysqlRootPassword=mysqlpwd incubator/spring-cloud-data-flow
Version Compatibility
The following listing shows Spring Cloud Data Flow’s version compatibility with the respective Helm Chart releases:
| SCDF Version | Chart Version |
|---|---|
| SCDF-K8S-Server 1.7.x | 1.0.x |
| SCDF-K8S-Server 2.0.x | 2.2.x |
| SCDF-K8S-Server 2.1.x | 2.3.x |
| SCDF-K8S-Server 2.2.x | 2.4.x |
| SCDF-K8S-Server 2.3.x | 2.5.x |
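If you script your installs, the table can be encoded as a simple lookup. This is only a sketch: chart_version_for is a hypothetical helper name, and it knows nothing beyond the pairs listed in the table above.

```shell
#!/bin/sh
# Sketch: look up the chart release line for a given SCDF server version,
# using only the pairs listed in the compatibility table.
# chart_version_for is a hypothetical helper name.
chart_version_for() {
  case "$1" in
    1.7.*) echo "1.0.x" ;;
    2.0.*) echo "2.2.x" ;;
    2.1.*) echo "2.3.x" ;;
    2.2.*) echo "2.4.x" ;;
    2.3.*) echo "2.5.x" ;;
    *) echo "unknown"; return 1 ;;   # version not covered by the table
  esac
}

chart_version_for 2.3.1
```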
Register prebuilt applications
All the prebuilt streaming applications:
- Are available as Apache Maven artifacts or Docker images.
- Use RabbitMQ or Apache Kafka.
- Support monitoring via Prometheus and InfluxDB.
- Contain metadata for application properties used in the UI and code completion in the shell.
Applications can be registered individually using the app register functionality or as a group using the app import functionality.
There are also dataflow.spring.io links that represent the group of prebuilt applications for a specific release which is useful for getting started.
You can register applications using the UI or the shell. Even though we are only using two prebuilt applications, we will register the full set of prebuilt applications.
The easiest way to install Data Flow on Kubernetes is using the Helm chart that uses RabbitMQ as the default messaging middleware. The command to import the Kafka version of the applications is:
dataflow:>app import --uri https://dataflow.spring.io/kafka-docker-latest
Change kafka to rabbitmq in the above URL if you use RabbitMQ as the messaging middleware, which is the case unless you set kafka.enabled=true in the Helm chart or followed the manual kubectl-based installation instructions for installing Data Flow on Kubernetes and chose to use Kafka.
Only applications registered with a --uri property pointing to a Docker resource are supported by the Data Flow Server for Kubernetes. However, we do support Maven resources for the --metadata-uri property, which is used to list the properties supported by each application. For example, the following application registration is valid:
app register --type source --name time --uri docker://springcloudstream/time-source-rabbit:{docker-time-source-rabbit-version} --metadata-uri maven://org.springframework.cloud.stream.app:time-source-rabbit:jar:metadata:{docker-time-source-rabbit-version}
Any application registered with a Maven, HTTP, or File resource or the executable jar (by using a --uri property prefixed with maven://, http:// or file://) is not supported.
Application and Server Properties
This section covers how you can customize the deployment of your applications. You can use a number of properties to influence settings for the applications that are deployed. Properties can be applied on a per-application basis or in the appropriate server configuration for all deployed applications.
Properties set on a per-application basis always take precedence over properties set as the server configuration. This arrangement lets you override global server level properties on a per-application basis.
Properties to be applied for all deployed Tasks are defined in the
src/kubernetes/server/server-config.yaml file and for Streams
in src/kubernetes/skipper/skipper-config-(binder).yaml. Replace
(binder) with the messaging middleware you are using — for example,
rabbit or kafka.
Memory and CPU Settings
Applications are deployed with default memory and CPU settings. If
needed, these values can be adjusted. The following example shows how to
set Limits to 1000m for CPU and 1024Mi for memory and Requests
to 800m for CPU and 640Mi for memory:
deployer.<app>.kubernetes.limits.cpu=1000m
deployer.<app>.kubernetes.limits.memory=1024Mi
deployer.<app>.kubernetes.requests.cpu=800m
deployer.<app>.kubernetes.requests.memory=640Mi
Those values result in the following container settings being used:
Limits:
  cpu: 1
  memory: 1Gi
Requests:
  cpu: 800m
  memory: 640Mi
You can also control the default values to which to set the cpu and memory globally.
The following example shows how to set the CPU and memory for streams and tasks:
data:
  application.yaml: |-
    spring:
      cloud:
        skipper:
          server:
            platform:
              kubernetes:
                accounts:
                  default:
                    limits:
                      memory: 640Mi
                      cpu: 500m

data:
  application.yaml: |-
    spring:
      cloud:
        dataflow:
          task:
            platform:
              kubernetes:
                accounts:
                  default:
                    limits:
                      memory: 640Mi
                      cpu: 500m
The settings we have used so far only affect the settings for the container. They do not affect the memory setting for the JVM process in the container. If you would like to set JVM memory settings, you can provide an environment variable to do so. See the next section for details.
Environment Variables
To influence the environment settings for a given application, you can use the spring.cloud.deployer.kubernetes.environmentVariables deployer property. For example, a common requirement in production settings is to influence the JVM memory arguments. You can do so by using the JAVA_TOOL_OPTIONS environment variable, as the following example shows:
deployer.<app>.kubernetes.environmentVariables=JAVA_TOOL_OPTIONS=-Xmx1024m
This overrides the JVM memory setting for the desired <app> (replace <app> with the name of your application).
The environmentVariables property accepts a comma-delimited string. If an environment variable contains a value which is also a comma-delimited string, it must be enclosed in single quotation marks, for example:
spring.cloud.deployer.kubernetes.environmentVariables=spring.cloud.stream.kafka.binder.brokers='somehost:9092, anotherhost:9093'
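The quoting rule for comma-containing values can be sketched as a small helper that builds the property value. env_pair is a hypothetical name introduced here for illustration; it wraps a value in single quotes only when the value contains a comma, following the rule described above.

```shell
#!/bin/sh
# Sketch: assemble the comma-delimited environmentVariables value.
# env_pair is a hypothetical helper: it wraps a value in single quotes
# when the value itself contains a comma.
env_pair() {
  name="$1"
  value="$2"
  case "$value" in
    *,*) printf "%s='%s'" "$name" "$value" ;;
    *)   printf "%s=%s" "$name" "$value" ;;
  esac
}

first=$(env_pair JAVA_TOOL_OPTIONS -Xmx1024m)
second=$(env_pair spring.cloud.stream.kafka.binder.brokers "somehost:9092,anotherhost:9093")

# Pairs are joined with commas; quoted values keep their inner commas.
echo "deployer.<app>.kubernetes.environmentVariables=${first},${second}"
```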
Liveness and Readiness Probes
The liveness and readiness probes use paths called /health and
/info, respectively. They use a delay of 10 for both and a
period of 60 and 10 respectively. You can change these defaults
when you deploy the stream by using deployer properties. Liveness and
readiness probes are only applied to streams.
The following example changes the liveness probe (replace <app> with
the name of your application) by setting deployer properties:
deployer.<app>.kubernetes.livenessProbePath=/health
deployer.<app>.kubernetes.livenessProbeDelay=120
deployer.<app>.kubernetes.livenessProbePeriod=20
You can declare the same as part of the server global configuration for streams, as the following example shows:
data:
  application.yaml: |-
    spring:
      cloud:
        skipper:
          server:
            platform:
              kubernetes:
                accounts:
                  default:
                    livenessProbePath: /health
                    livenessProbeDelay: 120
                    livenessProbePeriod: 20
Similarly, you can swap liveness for readiness to override the default readiness settings.
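As an illustration of that swap, the sketch below prints the three probe properties for either probe kind. probe_props is a hypothetical helper, the application name time is borrowed from the registration example earlier, and the delay and period values are simply carried over from the liveness example rather than recommended settings.

```shell
#!/bin/sh
# Sketch: print probe deployer properties for a given probe kind.
# probe_props is a hypothetical helper; arguments are
#   app name, probe kind (liveness|readiness), path, delay, period.
probe_props() {
  app="$1"; kind="$2"; path="$3"; delay="$4"; period="$5"
  printf 'deployer.%s.kubernetes.%sProbePath=%s\n' "$app" "$kind" "$path"
  printf 'deployer.%s.kubernetes.%sProbeDelay=%s\n' "$app" "$kind" "$delay"
  printf 'deployer.%s.kubernetes.%sProbePeriod=%s\n' "$app" "$kind" "$period"
}

# Readiness variant of the liveness example, using the /info path.
probe_props time readiness /info 120 20
```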
By default, port 8080 is used as the probe port. You can change the
defaults for both liveness and readiness probe ports by using
deployer properties, as the following example shows:
deployer.<app>.kubernetes.readinessProbePort=7000
deployer.<app>.kubernetes.livenessProbePort=7000
You can declare the same as part of the global configuration for streams, as the following example shows:
data:
  application.yaml: |-
    spring:
      cloud:
        skipper:
          server:
            platform:
              kubernetes:
                accounts:
                  default:
                    readinessProbePort: 7000
                    livenessProbePort: 7000
By default, the liveness and readiness probe paths use Spring Boot 2.x+ actuator endpoints. To use Spring Boot 1.x actuator endpoint paths, you must adjust the liveness and readiness values, as the following example shows (replace <app> with the name of your application):
deployer.<app>.kubernetes.livenessProbePath=/health
deployer.<app>.kubernetes.readinessProbePath=/info
To automatically set both liveness and readiness endpoints on a per-application basis to the default Spring Boot 1.x paths, you can set the following property:
deployer.<app>.kubernetes.bootMajorVersion=1
You can access secured probe endpoints by using credentials stored in a Kubernetes secret. You can use an existing secret, provided the credentials are contained under the credentials key name of the secret’s data block. You can configure probe authentication on a per-application basis. When enabled, it is applied to both the liveness and readiness probe endpoints by using the same credentials and authentication type. Currently, only Basic authentication is supported.
To create a new secret:
- Generate the base64 string with the credentials used to access the secured probe endpoints. Basic authentication encodes a username and password as a base64 string in the format of username:password. The following example (which includes output and in which you should replace user and pass with your values) shows how to generate a base64 string:
  echo -n "user:pass" | base64
  dXNlcjpwYXNz
- With the encoded credentials, create a file (for example, myprobesecret.yml) with the following contents:
  apiVersion: v1
  kind: Secret
  metadata:
    name: myprobesecret
  type: Opaque
  data:
    credentials: GENERATED_BASE64_STRING
- Replace GENERATED_BASE64_STRING with the base64-encoded value generated earlier.
- Create the secret by using kubectl, as the following example shows:
  kubectl create -f ./myprobesecret.yml
  secret "myprobesecret" created
- Set the following deployer property to use authentication when accessing probe endpoints:
  deployer.<app>.kubernetes.probeCredentialsSecret=myprobesecret
  Replace <app> with the name of the application to which to apply authentication.
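The encoding and manifest steps can be combined into one script. This is a minimal sketch that reuses the placeholder credentials user and pass and the secret name myprobesecret from the steps; it prints the manifest to standard output instead of writing a file.

```shell
#!/bin/sh
# Sketch: encode probe credentials and print a Secret manifest for them.
# "user:pass" is the placeholder credential pair from the steps above.
credentials=$(printf '%s' "user:pass" | base64)

# Print the manifest; redirect the output to myprobesecret.yml to save it.
cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: myprobesecret
type: Opaque
data:
  credentials: ${credentials}
EOF
```

Using printf '%s' avoids a trailing newline in the encoded value, the same effect as echo -n. The output can also be piped straight into kubectl create -f - rather than saved first.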
Using SPRING_APPLICATION_JSON
You can use a SPRING_APPLICATION_JSON environment variable to set Data
Flow server properties (including the configuration of maven repository
settings) that are common across all of the Data Flow server
implementations. These settings go at the server level in the container
env section of a deployment YAML. The following example shows how to
do so:
env:
- name: SPRING_APPLICATION_JSON
  value: |-
    {
      "maven": {
        "local-repository": null,
        "remote-repositories": {
          "repo1": {
            "url": "https://repo.spring.io/libs-snapshot"
          }
        }
      }
    }
Private Docker Registry
You can pull Docker images from a private registry on a per-application basis. First, you must create a secret in the cluster. Follow the Pull an Image from a Private Registry guide to create the secret.
Once you have created the secret, you can use the imagePullSecret
property to set the secret to use, as the following example shows:
deployer.<app>.kubernetes.imagePullSecret=mysecret
Replace <app> with the name of your application and mysecret with the name of the secret you created earlier.
You can also configure the image pull secret at the global server level.
The following example shows how to do so for streams and tasks:
data:
  application.yaml: |-
    spring:
      cloud:
        skipper:
          server:
            platform:
              kubernetes:
                accounts:
                  default:
                    imagePullSecret: mysecret

data:
  application.yaml: |-
    spring:
      cloud:
        dataflow:
          task:
            platform:
              kubernetes:
                accounts:
                  default:
                    imagePullSecret: mysecret
Replace mysecret with the name of the secret you created earlier.
Volume Mounted Secrets
Data Flow uses the application metadata stored in a container image label. To access the metadata labels in a private registry, you have to extend the Data Flow deployment configuration and mount the registry secrets as a Secrets PropertySource:
spec:
  containers:
  - name: scdf-server
    ...
    volumeMounts:
    - name: mysecret
      mountPath: /etc/secrets/mysecret
      readOnly: true
  ...
  volumes:
  - name: mysecret
    secret:
      secretName: mysecret
Annotations
You can add annotations to Kubernetes objects on a per-application
basis. The supported object types are pod Deployment, Service, and
Job. Annotations are defined in a key:value format, allowing for
multiple annotations separated by a comma. For more information and use
cases on annotations, see
Annotations.
The following example shows how you can configure applications to use annotations:
deployer.<app>.kubernetes.podAnnotations=annotationName:annotationValue
deployer.<app>.kubernetes.serviceAnnotations=annotationName:annotationValue,annotationName2:annotationValue2
deployer.<app>.kubernetes.jobAnnotations=annotationName:annotationValue
Replace <app> with the name of your application and the value of your annotations.
Entry Point Style
An entry point style affects how application properties are passed to the container to be deployed. Currently, three styles are supported:
- exec (default): Passes all application properties and command line arguments in the deployment request as container arguments. Application properties are transformed into the format of --key=value.
- shell: Passes all application properties and command line arguments as environment variables. Each of the application and command line argument properties is transformed into an uppercase string and . characters are replaced with _.
- boot: Creates an environment variable called SPRING_APPLICATION_JSON that contains a JSON representation of all application properties. Command line arguments from the deployment request are set as container args.
In all cases, environment variables defined at the server-level configuration and on a per-application basis are set onto the container as is.
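The exec and shell transformations can be sketched in a few lines. to_exec_arg and to_shell_env are hypothetical helper names used only for illustration, and the real deployer may treat characters beyond what is described here differently.

```shell
#!/bin/sh
# Sketch of the exec and shell transformations described above.
# to_exec_arg and to_shell_env are hypothetical helper names.

# exec style: a property becomes a --key=value container argument.
to_exec_arg() {
  printf '%s\n' "--$1=$2"
}

# shell style: the key is upper-cased and "." is replaced with "_".
to_shell_env() {
  key=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]' | tr '.' '_')
  printf '%s=%s\n' "$key" "$2"
}

to_exec_arg server.port 8080
to_shell_env server.port 8080
```

For the property server.port=8080, the first call prints --server.port=8080 and the second prints SERVER_PORT=8080.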
You can configure applications as follows:
deployer.<app>.kubernetes.entryPointStyle=<Entry Point Style>
Replace <app> with the name of your application and <Entry Point Style> with your desired entry point style.
You can also configure the entry point style at the global server level.
The following example shows how to do so for streams:
data:
  application.yaml: |-
    spring:
      cloud:
        skipper:
          server:
            platform:
              kubernetes:
                accounts:
                  default:
                    entryPointStyle: entryPointStyle
The following example shows how to do so for tasks:
data:
  application.yaml: |-
    spring:
      cloud:
        dataflow:
          task:
            platform:
              kubernetes:
                accounts:
                  default:
                    entryPointStyle: entryPointStyle
Replace entryPointStyle with the desired entry point style.
You should choose an Entry Point Style of either exec or shell, to correspond to how the ENTRYPOINT syntax is defined in the container’s Dockerfile. For more information and use cases on exec versus shell, see the ENTRYPOINT section of the Docker documentation.
Using the boot entry point style corresponds to using the exec style
ENTRYPOINT. Command line arguments from the deployment request are
passed to the container, with the addition of application properties
being mapped into the SPRING_APPLICATION_JSON environment variable
rather than command line arguments.
When you use the boot Entry Point Style, the deployer.<app>.kubernetes.environmentVariables property must not
contain SPRING_APPLICATION_JSON.
Deployment Service Account
You can configure a custom service account for application deployments
through properties. You can use an existing service account or create a
new one. One way to create a service account is by using kubectl, as
the following example shows:
kubectl create serviceaccount myserviceaccountname
serviceaccount "myserviceaccountname" created
Then you can configure individual applications as follows:
deployer.<app>.kubernetes.deploymentServiceAccountName=myserviceaccountname
Replace <app> with the name of your application and myserviceaccountname with your service account name.
You can also configure the service account name at the global server level.
The following example shows how to do so for streams:
data:
  application.yaml: |-
    spring:
      cloud:
        skipper:
          server:
            platform:
              kubernetes:
                accounts:
                  default:
                    deploymentServiceAccountName: myserviceaccountname
The following example shows how to do so for tasks:
data:
  application.yaml: |-
    spring:
      cloud:
        dataflow:
          task:
            platform:
              kubernetes:
                accounts:
                  default:
                    deploymentServiceAccountName: myserviceaccountname
Replace myserviceaccountname with the service account name to be applied to all deployments.
Image Pull Policy
An image pull policy defines when a Docker image should be pulled to the local registry. Currently, three policies are supported:
IfNotPresent (default): Do not pull an image if it already exists.
Always: Always pull the image regardless of whether it already exists.
Never: Never pull an image. Use only an image that already exists.
The following example shows how you can individually configure applications:
deployer.<app>.kubernetes.imagePullPolicy=Always
Replace <app> with the name of your application and Always with your desired image pull policy.
You can configure an image pull policy at the global server level.
The following example shows how to do so for streams:
data:
  application.yaml: |-
    spring:
      cloud:
        skipper:
          server:
            platform:
              kubernetes:
                accounts:
                  default:
                    imagePullPolicy: Always
The following example shows how to do so for tasks:
data:
  application.yaml: |-
    spring:
      cloud:
        dataflow:
          task:
            platform:
              kubernetes:
                accounts:
                  default:
                    imagePullPolicy: Always
Replace Always with your desired image pull policy.
Deployment Labels
You can set custom labels on objects related to Deployment.
See Labels for more information on labels. Labels are specified in key:value format.
The following example shows how you can individually configure applications:
deployer.<app>.kubernetes.deploymentLabels=myLabelName:myLabelValue
Replace <app> with the name of your application, myLabelName with your label name, and myLabelValue with the value of your label.
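To make the key:value format concrete, the following Python sketch splits a deploymentLabels value into individual labels. It is only an illustration of the format, not how the deployer parses the property. The function name parse_deployment_labels is hypothetical, and the sketch assumes that several labels are separated by commas.

```python
def parse_deployment_labels(value):
    """Split a 'key:value,key2:value2' string into a dict of labels."""
    labels = {}
    for pair in value.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate a trailing comma or empty input
        name, separator, label_value = pair.partition(":")
        if not separator or not name.strip():
            raise ValueError(f"expected key:value, got {pair!r}")
        labels[name.strip()] = label_value.strip()
    return labels
```

Calling parse_deployment_labels("myLabelName:myLabelValue") returns a dictionary with the single entry myLabelName mapped to myLabelValue.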
Additionally, you can apply multiple labels, as the following example shows:
deployer.<app>.kubernetes.deploymentLabels=myLabelName:myLabelValue,myLabelName2:myLabelValue2
NodePort
Applications are deployed with a Service type of ClusterIP, which is the default Kubernetes Service type unless defined otherwise.
ClusterIP services are reachable only from within the cluster itself.
To make the deployed application available externally, one option is to use NodePort.
See the NodePort documentation for more information.
The following example shows how you can individually configure applications using Kubernetes assigned ports:
deployer.<app>.kubernetes.createNodePort=true
Replace <app> with the name of your application.
Additionally, you can define the port to use for the NodePort Service as shown below:
deployer.<app>.kubernetes.createNodePort=31101
Replace <app> with the name of your application and the value of 31101 with your desired port.
When defining the port manually, the port must not already be in use and must fall within the defined NodePort range.
According to the NodePort documentation, the default port range is 30000-32767.
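The two forms of the createNodePort value can be checked before deployment with a short Python sketch. It is a hypothetical helper, not part of Data Flow: the function name interpret_create_node_port is invented for illustration, and the range used is the default one quoted above, which a cluster may override.

```python
DEFAULT_NODE_PORT_RANGE = (30000, 32767)  # default range; clusters may differ


def interpret_create_node_port(value, port_range=DEFAULT_NODE_PORT_RANGE):
    """Return None when Kubernetes assigns the port, or the explicit port."""
    text = str(value).strip().lower()
    if text == "true":
        return None  # Kubernetes picks a free port from its range
    if text.isdigit():
        port = int(text)
        low, high = port_range
        if not low <= port <= high:
            raise ValueError(f"port {port} is outside the range {low}-{high}")
        # Whether the port is already in use can only be checked
        # against the cluster itself, so it is not verified here.
        return port
    raise ValueError(f"unsupported createNodePort value: {value!r}")
```

A value of 31101 passes this check, while a value such as 8080 is rejected because it lies outside the default range.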
Monitoring
To learn more about the monitoring experience in Data Flow using Prometheus running on Kubernetes, please refer to the Stream Monitoring feature guide.