Argus works by configuring a custom Kubernetes resource that defines the paths and events you want to be notified about for your current deployments. This custom resource, in conjunction with a cluster controller that listens for lifecycle events, is responsible for maintaining a source of truth between the state of the cluster and the daemons watching for filesystem events on each node.
Prerequisites
Running these components requires a Kubernetes cluster at v1.9 or above. Depending on your environment, there may be additional requirements for running the daemon with Linux capabilities and for running the controller with an appropriate level of access to receive cluster events.
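As a quick sanity check before installing, you can confirm the cluster meets the version requirement with a standard `kubectl` call:

```sh
# the reported Server Version should be v1.9 or above
kubectl version
```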
Since procfs `/proc/[pid]` subdirectories are owned by the effective user and group of the process, we require escalated capabilities so the daemon can watch for filesystem events of any process running in your cluster, directly on the host.
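In pod-spec terms this roughly means host PID access plus added Linux capabilities on the daemon container. The sketch below is illustrative only; the capability names and mount paths are assumptions, and the hosted configurations referenced in the Quickstart contain the authoritative settings.

```yaml
# Illustrative excerpt of a daemon pod spec (not the shipped manifest):
spec:
  hostPID: true                       # share the host PID namespace
  containers:
  - name: argusd
    securityContext:
      capabilities:
        add: ["SYS_PTRACE", "DAC_READ_SEARCH"]   # example capabilities only
    volumeMounts:
    - name: procfs
      mountPath: /host/proc           # hypothetical mount point for host procfs
      readOnly: true
  volumes:
  - name: procfs
    hostPath:
      path: /proc
```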
If you are running in a multi-tenant environment, you almost certainly will not be able to obtain the kind of escalated privileges needed to gain access to the host filesystem, for obvious reasons.
Quickstart
We provide multiple paths of configuration for both vanilla Kubernetes and OpenShift, which has additional security measures in place.
- To deploy Argus on a vanilla Kubernetes environment, simply run an `apply` on the following hosted configuration:

  ```sh
  kubectl apply -f \
    https://raw.githubusercontent.com/clustergarage/argus/master/configs/argus-k8s.yaml
  ```

  This will create a `Namespace` argus under which all the components will be organized.
- OpenShift has a more opinionated security model by default. We assume a normal OpenShift install, which requires a few more components to be created. This can be done with the following command:

  ```sh
  oc apply -f \
    https://raw.githubusercontent.com/clustergarage/argus/master/configs/argus-openshift.yaml
  ```

  This will create an OpenShift `Project` argus under which all components will be organized. In addition to creating a `Namespace`, an OpenShift project needs various `RoleBindings` for building and pulling images via an `ImageStream` and for our `ServiceAccount` to have an admin role reference. OpenShift also requires `SecurityContextConstraints` to properly run containers as certain users, with Linux capabilities, and to gain access to the host PID namespace and volume (a rough sketch of such an object follows this list).

- To install using a Helm chart, we provide a couple of ways to do this. The simplest is an archive file included in each release:

  ```sh
  helm install \
    https://github.com/clustergarage/argus/releases/download/v0.4.0/argus-0.4.0.tgz
  ```

  The other way is to add a Helm repository to your cluster and update/install using these mechanisms:

  ```sh
  helm repo add clustergarage \
    https://raw.githubusercontent.com/clustergarage/argus/master/helm/
  helm repo update
  helm install clustergarage/argus
  ```

  Either way, this will create a `Namespace` argus under which all the components will be organized.
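For reference, an OpenShift `SecurityContextConstraints` object of the kind mentioned in the OpenShift step above looks roughly like the following. The values here are examples only; the hosted `argus-openshift.yaml` contains the authoritative definition.

```yaml
# Illustrative SCC granting a ServiceAccount host PID access, hostPath
# volumes, and added capabilities; names and values are examples only.
apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
  name: argus-scc                          # hypothetical name
allowHostPID: true
allowHostDirVolumePlugin: true
allowedCapabilities:
- SYS_PTRACE                               # example capability only
runAsUser:
  type: RunAsAny
seLinuxContext:
  type: RunAsAny
fsGroup:
  type: RunAsAny
supplementalGroups:
  type: RunAsAny
users:
- system:serviceaccount:argus:argus        # hypothetical ServiceAccount name
```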
Under the argus namespace is a `ServiceAccount` used to run all items of Argus; this service account will also get a `ClusterRoleBinding` with settings that allow the controller and daemon to run with their required privileges.
A `CustomResourceDefinition` is included to define a custom `ArgusWatcher` type housing the pod selector, paths, events, and optional flags for the watcher.
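As a sketch of what such a watcher might look like, the resource pairs a pod selector with the paths and events to watch. The API group/version, field names, and event values below are assumptions for illustration; consult the `examples/` folder in the repository for the authoritative schema.

```yaml
# Hypothetical ArgusWatcher: watch the log directory of nginx pods.
apiVersion: arguscontroller.clustergarage.io/v1alpha1   # assumed group/version
kind: ArgusWatcher
metadata:
  name: argus-example
  namespace: default
spec:
  selector:
    matchLabels:
      app: nginx                 # pods whose filesystems should be watched
  subjects:
  - paths:
    - /var/log/nginx             # paths inside the watched containers
    events:                      # event names follow inotify conventions
    - close_write
    - moved_to
```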
Finally, the `arguscontroller` `Deployment` and `argusd` `DaemonSet` are the core of the product. There is a headless `Service` used to communicate between the controller and all instances of the daemons.
You can verify that everything installed properly by inspecting the created resources:

```sh
kubectl -n argus get all
```

All pods should eventually converge to the `Running` state. The daemon pods in particular should be running on each node that runs workloads you wish to be notified on. A single controller replica is suggested; since all method calls to the daemon are idempotent, however, it can still function properly with more than one replica.
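To confirm the daemon landed on every node you care about, listing the pods with their node assignments is usually enough:

```sh
# show which node each argus pod is scheduled on;
# expect one argusd pod per node that runs watched workloads
kubectl -n argus get pods -o wide
```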
Secure Communication
To secure the gRPC communication between the controller and daemon, we provide secure variations of each configuration described above.
You must provide your own certificates; creating them is not described in this documentation. The daemon will be spun up with a certificate/private key pair and an optional CA certificate as the gRPC server(s); the controller will act as a client, so it has additional TLS flags that can be passed in.
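Both platform steps below expect a ConfigMap shaped roughly like this sketch. The name `argus-config` and the PEM placeholders are assumptions; the `examples/` folder in the repository is the reference.

```yaml
# Placeholder ConfigMap carrying the TLS material for controller and daemon;
# paste your own PEM contents under each key.
apiVersion: v1
kind: ConfigMap
metadata:
  name: argus-config            # assumed name, matching argus-config.yaml below
  namespace: argus
data:
  ca.pem: |
    -----BEGIN CERTIFICATE-----
    ...
    -----END CERTIFICATE-----
  cert.pem: |
    -----BEGIN CERTIFICATE-----
    ...
    -----END CERTIFICATE-----
  key.pem: |
    -----BEGIN PRIVATE KEY-----
    ...
    -----END PRIVATE KEY-----
```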
- With your SSL certificates, create a ConfigMap with keys as demonstrated in the `examples/` folder. This includes a `ca.pem` for the root CA certificate, and `cert.pem` and `key.pem` for the certificate/private key pair. Apply the ConfigMap and the secure variation of the configuration files.

  ```sh
  kubectl apply -f \
    https://raw.githubusercontent.com/clustergarage/argus/master/configs/argus-k8s-secure.yaml

  # add your keys to a ConfigMap
  kubectl apply -f argus-config.yaml
  ```
- With your SSL certificates, create a ConfigMap with keys as demonstrated in the `examples/` folder. This includes a `ca.pem` for the root CA certificate, and `cert.pem` and `key.pem` for the certificate/private key pair. Apply the ConfigMap and the secure variation of the configuration files.

  ```sh
  oc apply -f \
    https://raw.githubusercontent.com/clustergarage/argus/master/configs/argus-openshift-secure.yaml

  # add your keys to a ConfigMap
  oc apply -f argus-config.yaml
  ```
- There are various limitations with Helm when providing SSL certs from outside the provided chart; in particular, any files that are read need to be included inside the chart directory itself. Until this feature is added to Helm, we have to do some hacky bits first:

  ```sh
  # download clustergarage/argus helm chart locally
  cp -r /path/to/ssl/certs /path/to/argus/
  helm install ./argus --set tls=true
  ```
Limitations and Caveats
`argusd` should be scheduled on all nodes that run workloads you want to monitor. If your nodes have any taints, an explicit toleration will need to be set on the DaemonSet in order for these pods to be scheduled onto them. It is up to you to add the appropriate toleration to the configuration.
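As an illustration, if a node carried a hypothetical `dedicated=monitoring:NoSchedule` taint, the DaemonSet's pod template would need a toleration along these lines:

```yaml
# Example toleration for a hypothetical taint; adjust key, value, and
# effect to match the taints actually present on your nodes.
spec:
  template:
    spec:
      tolerations:
      - key: dedicated
        operator: Equal
        value: monitoring
        effect: NoSchedule
```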
Certain images (such as the official nginx) will symlink to special devices, such as `/var/log/nginx/access.log -> /dev/stdout`. These devices are streams which are symlinks to pseudo-terminals on the system and are not files that can be monitored by `inotify`. For the same reason, they are also not candidates for the practice of file integrity monitoring.
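A quick way to tell whether a path you intend to watch is a regular file or one of these device symlinks is to list it inside a running pod (the pod name below is hypothetical):

```sh
# regular files show ordinary permissions, while the nginx image's logs
# show as symlinks pointing at /dev/stdout and /dev/stderr
kubectl exec my-nginx-pod -- ls -l /var/log/nginx
```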
Need Help?
If you have trouble with any of the steps to get this running in your cluster, first check whether the problem is already covered in our GitHub issues. If you suspect you are hitting a problem not recorded there, open a detailed issue with all steps and pertinent information about your cluster setup.
If you wish to contact us directly, use the form located on the Contact page.