Kubernetes-master charm
This charm encapsulates the Kubernetes master processes and the operations needed to run them on any cloud for the entire lifecycle of the cluster.
This charm is built from other charm layers using the Juju reactive framework. The other layers each focus on a specific subset of operations, making this layer specific to the operations of the Kubernetes master processes.
Deployment
This charm is not fully functional when deployed by itself. It requires other charms to model a complete Kubernetes cluster. A Kubernetes cluster needs a distributed key-value store such as etcd and the kubernetes-worker charm, which delivers the Kubernetes node services. A cluster also requires a Software Defined Network (SDN), a container runtime such as containerd, and Transport Layer Security (TLS) so that the components in the cluster communicate securely.
Please take a look at the Charmed Kubernetes or the Kubernetes core bundles for examples of complete models of Kubernetes clusters.
Resources
The kubernetes-master charm takes advantage of the Juju Resources feature to deliver the Kubernetes software.
In deployments on public clouds, the Charm Store provides the resource to the charm automatically with no user intervention. Some environments with strict firewall rules may not be able to contact the Charm Store. In these network-restricted environments, the resource can be uploaded to the model by the Juju operator.
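For example, a resource that has been downloaded out-of-band can be attached to the model with juju attach-resource. The resource name kube-apiserver and the local file name below are illustrative; list the charm's actual resource names with juju resources kubernetes-master:
juju attach-resource kubernetes-master kube-apiserver=./kube-apiserver.snap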
Snap Refresh
The Kubernetes resources used by this charm are snap packages. When not specified during deployment, these resources come from the public store. By default, the snapd daemon will refresh all snaps installed from the store four (4) times per day. A charm configuration option is provided for operators to control this refresh frequency.
NOTE: this is a global configuration option and will affect the refresh time for all snaps installed on a system.
Examples:
## refresh kubernetes-master snaps every tuesday
juju config kubernetes-master snapd_refresh="tue"
## refresh snaps at 11pm on the last (5th) friday of the month
juju config kubernetes-master snapd_refresh="fri5,23:00"
## delay the refresh as long as possible
juju config kubernetes-master snapd_refresh="max"
## use the system default refresh timer
juju config kubernetes-master snapd_refresh=""
For more information, see the snap documentation.
Configuration
This charm supports some configuration options to set up a Kubernetes cluster that works in your environment, detailed in the section below.
For some specific Kubernetes service configuration tasks, please also see the section on configuring K8s services.
name | type | Default | Description |
---|---|---|---|
allow-privileged | string | auto | See notes |
api-extra-args | string | | See notes |
audit-policy | string | See notes | Audit policy passed to kube-apiserver via --audit-policy-file. For more info, please refer to the upstream documentation at https://kubernetes.io/docs/tasks/debug-application-cluster/audit/ |
audit-webhook-config | string | | Audit webhook config passed to kube-apiserver via --audit-webhook-config-file. For more info, please refer to the upstream documentation at https://kubernetes.io/docs/tasks/debug-application-cluster/audit/ |
authn-webhook-endpoint | string | | See notes |
authorization-mode | string | Node,RBAC | Comma-separated authorization modes. Allowed values are "RBAC", "Node", "Webhook", "ABAC", "AlwaysDeny" and "AlwaysAllow". |
cephfs-mounter | string | default | The client driver used for CephFS-based storage. Options are "fuse", "kernel" and "default". |
channel | string | 1.22/stable | Snap channel to install Kubernetes master services from |
client_password | string | | Password to be used for the admin user (leave empty for a random password). |
controller-manager-extra-args | string | | See notes |
dashboard-auth | string | auto | See notes |
default-cni | string | | See notes |
default-storage | string | auto | The storage class to make the default storage class. Allowed values are "auto", "none", "ceph-xfs", "ceph-ext4", "cephfs". Note: Only works in Kubernetes >= 1.10 |
dns-provider | string | auto | See notes |
dns_domain | string | cluster.local | The local domain for cluster DNS |
enable-dashboard-addons | boolean | True | Deploy the Kubernetes Dashboard |
enable-keystone-authorization | boolean | False | If true and the Keystone charm is related, users will authorize against the Keystone server. Note that if related, users will always authenticate against Keystone. |
enable-metrics | boolean | True | If true, the metrics server for Kubernetes will be deployed onto the cluster. |
enable-nvidia-plugin | string | auto | Load the NVIDIA device plugin daemonset. Supported values are "auto" and "false". When "auto", the daemonset will be loaded only if GPUs are detected. When "false", the NVIDIA device plugin will not be loaded. |
extra_packages | string | | Space-separated list of extra deb packages to install. |
extra_sans | string | | Space-separated list of extra SAN entries to add to the x509 certificate created for the master nodes. |
ha-cluster-dns | string | | DNS entry to use with the HA Cluster subordinate charm. Mutually exclusive with ha-cluster-vip. |
ha-cluster-vip | string | | Virtual IP for the charm to use with the HA Cluster subordinate charm. Mutually exclusive with ha-cluster-dns. Multiple virtual IPs are separated by spaces. |
image-registry | string | See notes | Container image registry to use for CDK. This includes addons like the Kubernetes dashboard, metrics server, ingress, and DNS, along with non-addon images including the pause container and default backend image. |
install_keys | string | | See notes |
install_sources | string | | See notes |
keystone-policy | string | See notes | Policy for Keystone authorization. This is used when a Keystone charm is related to kubernetes-master in order to provide authorization for Keystone users on the Kubernetes cluster. |
keystone-ssl-ca | string | | Keystone certificate authority encoded in base64 for securing communications to Keystone. For example: juju config kubernetes-master keystone-ssl-ca=$(base64 /path/to/ca.crt) |
loadbalancer-ips | string | | See notes |
nagios_context | string | juju | See notes |
nagios_servicegroups | string | | A comma-separated list of Nagios servicegroups. If left empty, the nagios_context will be used as the servicegroup. |
package_status | string | install | The status of service-affecting packages will be set to this value in the dpkg database. Valid values are "install" and "hold". |
proxy-extra-args | string | | See notes |
require-manual-upgrade | boolean | True | When true, master nodes will not be upgraded until the user triggers it manually by running the upgrade action. |
scheduler-extra-args | string | | See notes |
service-cidr | string | 10.152.183.0/24 | CIDR to use for Kubernetes services. After deployment it is only possible to increase the size of the IP range. It is not possible to change or shrink the address range after deployment. |
snapd_refresh | string | max | See notes |
storage-backend | string | auto | The storage backend for kube-apiserver persistence. Can be "etcd2", "etcd3", or "auto". Auto mode will select etcd3 on new installations, or etcd2 on upgrades. |
sysctl | string | See notes | See notes |
allow-privileged
Allow kube-apiserver to run in privileged mode. Supported values are "true", "false", and "auto". If "true", kube-apiserver will run in privileged mode by default. If "false", kube-apiserver will never run in privileged mode. If "auto", kube-apiserver will not run in privileged mode by default, but will switch to privileged mode if GPU hardware is detected on a worker node.
api-extra-args
Space-separated list of flags and key=value pairs that will be passed as arguments to kube-apiserver. For example, a value like this:
runtime-config=batch/v2alpha1=true profiling=true
will result in kube-apiserver being run with the following options: --runtime-config=batch/v2alpha1=true --profiling=true
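These options are set through the charm configuration, for example:
juju config kubernetes-master api-extra-args="runtime-config=batch/v2alpha1=true profiling=true"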
audit-policy
The default value is:
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Don't log read-only requests from the apiserver
  - level: None
    users: ["system:apiserver"]
    verbs: ["get", "list", "watch"]
  # Don't log kube-proxy watches
  - level: None
    users: ["system:kube-proxy"]
    verbs: ["watch"]
    resources:
      - resources: ["endpoints", "services"]
  # Don't log nodes getting their own status
  - level: None
    userGroups: ["system:nodes"]
    verbs: ["get"]
    resources:
      - resources: ["nodes"]
  # Don't log kube-controller-manager and kube-scheduler getting endpoints
  - level: None
    users: ["system:unsecured"]
    namespaces: ["kube-system"]
    verbs: ["get"]
    resources:
      - resources: ["endpoints"]
  # Log everything else at the Request level.
  - level: Request
    omitStages:
      - RequestReceived
authn-webhook-endpoint
Custom endpoint to check when authenticating kube-apiserver requests. This must be an HTTPS URL accessible from the kubernetes-master units. For example:
https://your.server:8443/authenticate
When a JSON-serialized TokenReview object is POSTed to this endpoint, it must respond with appropriate authentication details. For more info, please refer to the upstream documentation at https://kubernetes.io/docs/reference/access-authn-authz/authentication/#webhook-token-authentication
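For illustration, a minimal successful response from such a webhook might look like this (the user details are placeholders):
{
  "apiVersion": "authentication.k8s.io/v1",
  "kind": "TokenReview",
  "status": {
    "authenticated": true,
    "user": {
      "username": "jane@example.com",
      "uid": "42",
      "groups": ["developers"]
    }
  }
}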
controller-manager-extra-args
Space-separated list of flags and key=value pairs that will be passed as arguments to kube-controller-manager. For example, a value like this:
runtime-config=batch/v2alpha1=true profiling=true
will result in kube-controller-manager being run with the following options: --runtime-config=batch/v2alpha1=true --profiling=true
dashboard-auth
Method of authentication for the Kubernetes dashboard. Allowed values are "auto", "basic", and "token". If set to "auto", basic auth is used unless Keystone is related to kubernetes-master, in which case token auth is used.
DEPRECATED: this option has no effect on Kubernetes 1.19 and above.
default-cni
Default CNI network to use when multiple CNI subordinates are related.
The value of this config should be the application name of a related CNI subordinate. For example:
juju config kubernetes-master default-cni=flannel
If unspecified, then the default CNI network is chosen alphabetically.
dns-provider
DNS provider addon to use. Can be "auto", "core-dns", "kube-dns", or "none".
CoreDNS is only supported on Kubernetes 1.14+.
When set to "auto", the behavior is as follows:
- New deployments of Kubernetes 1.14+ will use CoreDNS
- New deployments of Kubernetes 1.13 or older will use KubeDNS
- Upgraded deployments will continue to use whichever provider was previously used.
image-registry
The default registry is rocks.canonical.com:443/cdk.
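To point CDK at a different registry, set the option to your registry's address (the registry below is a placeholder):
juju config kubernetes-master image-registry=registry.example.com:5000/cdk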
install_keys
List of signing keys for install_sources package sources, per charmhelpers standard format (a YAML list of strings encoded as a string). The keys should be the full ASCII-armoured GPG public keys. While GPG key IDs are also supported and looked up on a keyserver, operators should be aware that this mechanism is insecure. null can be used if a standard package signing key is already installed on the machine, or for PPA sources, where the package signing key is securely retrieved from Launchpad.
install_sources
List of extra apt sources, per charm-helpers standard format (a YAML list of strings encoded as a string). Each source may be either a line that can be added directly to sources.list(5), or in the form ppa:<user>/<ppa-name>.
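For example, to add a hypothetical PPA (the PPA name is a placeholder; no install_keys entry is needed for PPAs, since their signing keys are retrieved from Launchpad):
juju config kubernetes-master install_sources="['ppa:example/kubernetes']"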
keystone-policy
The default value is:
apiVersion: v1
kind: ConfigMap
metadata:
  name: k8s-auth-policy
  namespace: kube-system
  labels:
    k8s-app: k8s-keystone-auth
data:
  policies: |
    [
      {
        "resource": {
          "verbs": ["get", "list", "watch"],
          "resources": ["*"],
          "version": "*",
          "namespace": "*"
        },
        "match": [
          {
            "type": "role",
            "values": ["k8s-viewers"]
          },
          {
            "type": "project",
            "values": ["k8s"]
          }
        ]
      },
      {
        "resource": {
          "verbs": ["*"],
          "resources": ["*"],
          "version": "*",
          "namespace": "default"
        },
        "match": [
          {
            "type": "role",
            "values": ["k8s-users"]
          },
          {
            "type": "project",
            "values": ["k8s"]
          }
        ]
      },
      {
        "resource": {
          "verbs": ["*"],
          "resources": ["*"],
          "version": "*",
          "namespace": "*"
        },
        "match": [
          {
            "type": "role",
            "values": ["k8s-admins"]
          },
          {
            "type": "project",
            "values": ["k8s"]
          }
        ]
      }
    ]
loadbalancer-ips
Space-separated list of IP addresses of load balancers in front of the control plane. These can be either virtual IP addresses that have been floated in front of the control plane or the IP of a load balancer appliance such as an F5. Workers will alternate IP addresses from this list to distribute load; for example, if you have 2 IPs and 4 workers, each IP will be used by 2 workers. Note that this only works if kubeapi-load-balancer is not in use and there is a relation between kubernetes-master:kube-api-endpoint and kubernetes-worker:kube-api-endpoint. If using kubeapi-load-balancer, see the loadbalancer-ips configuration variable on the kubeapi-load-balancer charm.
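For example, with two placeholder virtual IPs:
juju config kubernetes-master loadbalancer-ips="10.0.0.100 10.0.0.101"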
nagios_context
Used by the nrpe subordinate charms. A string that will be prepended to the instance name to set the host name in Nagios. For instance, the host name would be something like:
juju-myservice-0
If you're running multiple environments with the same services in them, this allows you to differentiate between them.
proxy-extra-args
Space-separated list of flags and key=value pairs that will be passed as arguments to kube-proxy. For example, a value like this:
runtime-config=batch/v2alpha1=true profiling=true
will result in kube-proxy being run with the following options: --runtime-config=batch/v2alpha1=true --profiling=true
scheduler-extra-args
Space-separated list of flags and key=value pairs that will be passed as arguments to kube-scheduler. For example, a value like this:
runtime-config=batch/v2alpha1=true profiling=true
will result in kube-scheduler being run with the following options: --runtime-config=batch/v2alpha1=true --profiling=true
snapd_refresh
How often snapd handles updates for installed snaps. Setting an empty string will use the system default of four refreshes per day. Set to "max" to delay the refresh as long as possible. You may also set a custom string as described in the 'refresh.timer' section here:
https://forum.snapcraft.io/t/system-options/87
DEPRECATED in 1.19: Manage installed snap versions with the snap-store-proxy model config. See: https://snapcraft.io/snap-store-proxy and https://juju.is/docs/offline-mode-strategies#heading--snap-specific-proxy
sysctl
The default value is:
{ net.ipv4.conf.all.forwarding : 1, net.ipv4.neigh.default.gc_thresh1 : 128, net.ipv4.neigh.default.gc_thresh2 : 28672, net.ipv4.neigh.default.gc_thresh3 : 32768, net.ipv6.neigh.default.gc_thresh1 : 128, net.ipv6.neigh.default.gc_thresh2 : 28672, net.ipv6.neigh.default.gc_thresh3 : 32768, fs.inotify.max_user_instances : 8192, fs.inotify.max_user_watches : 1048576, kernel.panic : 10, kernel.panic_on_oops: 1, vm.overcommit_memory : 1 }
YAML-formatted associative array of sysctl values, e.g. '{kernel.pid_max : 4194303}'. Note that kube-proxy handles the conntrack settings; the proper way to alter them is to use the proxy-extra-args config to set them, e.g.:
juju config kubernetes-master proxy-extra-args="conntrack-min=1000000 conntrack-max-per-core=250000"
juju config kubernetes-worker proxy-extra-args="conntrack-min=1000000 conntrack-max-per-core=250000"
The proxy-extra-args conntrack-min and conntrack-max-per-core can be set to 0 to ignore kube-proxy's settings and use the sysctl settings instead. Note the fundamental difference between conntrack-max-per-core, which kube-proxy multiplies by the number of CPU cores, and nf_conntrack_max, which is an absolute value.
Configuring K8s services
Charmed Kubernetes ships with sensible, tested default configurations to ensure a reliable Kubernetes experience, but of course these can be changed to reflect the purpose and resources of your cluster. The configuration section above details all available configuration options; this section deals with specific, commonly used settings. You may also wish to read the Addons page for information on the extra services installed with Charmed Kubernetes.
IPVS (IP Virtual Server)
IPVS implements transport-layer load balancing as part of the Linux kernel, and can be used by the kube-proxy service to handle service routing. By default, kube-proxy uses a solution based on iptables, but this can cause a lot of overhead in systems with large numbers of nodes. There is more information on this in the upstream Kubernetes IPVS deep dive documentation.
IPVS is an extra option for kube-proxy, and can be enabled by changing the configuration:
juju config kubernetes-master proxy-extra-args="proxy-mode=ipvs"
It is also necessary to change this configuration option on the worker:
juju config kubernetes-worker proxy-extra-args="proxy-mode=ipvs"
Admission controls
As with other aspects of the Kubernetes API, admission controls can be enabled by adding extra values to the charm's api-extra-args configuration.
For admission controls, it may be useful to refer to the Kubernetes blog for more information on the options. For example, to add the PodSecurityPolicy admission controller:
- Check any current config settings for api-extra-args (there are none by default):
juju config kubernetes-master api-extra-args
- Append the desired config option to the previous output and apply:
juju config kubernetes-master api-extra-args="enable-admission-plugins=PodSecurityPolicy"
Note that prior to Kubernetes 1.16 (kubernetes-master revision 778), the config setting was admission-control, rather than enable-admission-plugins.
Adding SANs and certificate regeneration
As explained in the Certificates and trust overview, the extra_sans configuration setting can be used to add SANs and regenerate the x509 certificate(s) for the API server running on the Kubernetes master node(s), and for the load balancer. When this configuration is changed, the master node(s) will regenerate their certificate and restart the API server to update the certificate used for communication. Note: this is disruptive and restarts the API server.
The process is the same for both kubernetes-master and kubeapi-load-balancer. The configuration option takes a space-separated list of extra entries:
juju config kubernetes-master extra_sans="master.mydomain.com lb.mydomain.com"
juju config kubeapi-load-balancer extra_sans="master.mydomain.com lb.mydomain.com"
To clear the entries out of the certificate, use an empty string:
juju config kubernetes-master extra_sans=""
juju config kubeapi-load-balancer extra_sans=""
DNS for the cluster
The DNS add-on allows pods to have DNS names in addition to IP addresses. The Kubernetes cluster DNS server (based on the SkyDNS library) supports forward lookups (A records), service lookups (SRV records) and reverse IP address lookups (PTR records). More information about DNS can be obtained from the Kubernetes DNS admin guide.
Actions
You can run an action with the following command:
juju run-action kubernetes-master/0 ACTION [parameters] [--wait]
apply-manifest
Apply a JSON-formatted Kubernetes manifest to the cluster
This action has the following parameters:
json
The content of the manifest to deploy in JSON format
Default:
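For example, to apply a manifest that creates a namespace (the manifest content is illustrative):
juju run-action kubernetes-master/0 apply-manifest --wait json='{"apiVersion": "v1", "kind": "Namespace", "metadata": {"name": "example"}}'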
cis-benchmark
Run the CIS Kubernetes Benchmark against snap-based components.
This action has the following parameters:
apply
Apply remediations to address benchmark failures. The default, 'none', will not attempt to fix any reported failures. Set to 'conservative' to resolve simple failures. Set to 'dangerous' to attempt to resolve all failures. Note: Applying any remediation may result in an unusable cluster.
Default: none
config
Archive containing configuration files to use when running kube-bench. The default value is known to be compatible with snap components. When using a custom URL, append '#<hash_type>=<checksum>' to verify the archive integrity when downloaded.
Default: https://github.com/charmed-kubernetes/kube-bench-config/archive/cis-1.5.zip#sha1=cb8e78712ee5bfeab87d0ed7c139a83e88915530
release
Set the kube-bench release to run. If set to 'upstream', the action will compile and use a local kube-bench binary built from the master branch of the upstream repository: https://github.com/aquasecurity/kube-bench. This value may also be set to an accessible archive containing a pre-built kube-bench binary, for example: https://github.com/aquasecurity/kube-bench/releases/download/v0.0.34/kube-bench_0.0.34_linux_amd64.tar.gz#sha256=f96d1fcfb84b18324f1299db074d41ef324a25be5b944e79619ad1a079fca077
Default: https://github.com/aquasecurity/kube-bench/releases/download/v0.2.3/kube-bench_0.2.3_linux_amd64.tar.gz#sha256=429a1db271689aafec009434ded1dea07a6685fee85a1deea638097c8512d548
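For example, to run the benchmark with its default settings:
juju run-action kubernetes-master/0 cis-benchmark --wait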
create-rbd-pv
Create a RADOS Block Device (RBD) volume in Ceph and create a PersistentVolume. Note this is deprecated on Kubernetes >= 1.10 in favor of CSI, where PersistentVolumes are created dynamically to back PersistentVolumeClaims.
This action has the following parameters:
filesystem
File system type to format the volume.
Default: xfs
mode
Access mode for the persistent volume.
Default: ReadWriteOnce
name
Name the persistent volume.
Default:
size
Size in MB of the RBD volume.
Default:
skip-size-check
Allow creation of overprovisioned RBD.
Default: False
debug
Collect debug data
get-kubeconfig
Retrieve Kubernetes cluster config, including credentials
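For example (the kubeconfig contents are returned in the action results):
juju run-action kubernetes-master/0 get-kubeconfig --wait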
namespace-create
Create new namespace
This action has the following parameters:
name
Namespace name, e.g. staging
Default:
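For example, to create a namespace called staging:
juju run-action kubernetes-master/0 namespace-create name=staging --wait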
namespace-delete
Delete namespace
This action has the following parameters:
name
Namespace name, e.g. staging
Default:
namespace-list
List existing k8s namespaces
restart
Restart the Kubernetes master services on demand.
upgrade
Upgrade the Kubernetes snaps
This action has the following parameters:
fix-cluster-name
If using the OpenStack cloud provider, whether to fix the cluster name sent to it to include the cluster tag. This fixes an issue with load balancers conflicting with other clusters in the same project but will cause new load balancers to be created which will require manual intervention to resolve.
Default: True
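For example, to trigger the upgrade on the first master unit:
juju run-action kubernetes-master/0 upgrade --wait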