revisit checks by butonic · Pull Request #2518 · opencloud-eu/opencloud

I saw that pods were marked as not ready in k8s when running load tests, for so long that Kubernetes decided to kill them. That is a problem if the shutdown procedure does not persist all data. See #2282

It might be that the share persisting in opencloud-eu/reva#567 eats too many CPU cycles? Other services are running in the same process and also eat CPU cycles. Maybe that starves the check handlers? Or some HTTP or gRPC queue might be full?

Health checks should only verify that the process is alive. A simple 200 OK response is enough for that.

The readiness check can check whether the process is ready to serve traffic.

Under high load, Kubernetes will stop sending traffic to a pod if its readiness probe fails, but it will only kill the pod if the health (liveness) probe fails.
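To make the split between the two checks concrete, here is a minimal Go sketch. It is not the code in this PR: the `/healthz` and `/readyz` paths, the `ready` flag and the `newMux` and `probe` helpers are all hypothetical names chosen for illustration. The health handler does nothing but answer 200 OK, while the readiness handler answers 503 until the service declares itself ready.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// ready is a hypothetical flag that the service flips to true once it can
// serve traffic (and back to false when it no longer can).
var ready atomic.Bool

// healthz is the health (liveness) check: it does no work beyond replying,
// so it only fails when the process cannot answer HTTP requests at all.
func healthz(w http.ResponseWriter, _ *http.Request) {
	w.WriteHeader(http.StatusOK)
}

// readyz is the readiness check: it replies 503 while the service is not
// ready, which tells Kubernetes to stop routing traffic to the pod.
func readyz(w http.ResponseWriter, _ *http.Request) {
	if !ready.Load() {
		http.Error(w, "not ready", http.StatusServiceUnavailable)
		return
	}
	w.WriteHeader(http.StatusOK)
}

// newMux registers both checks under assumed paths.
func newMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/healthz", healthz)
	mux.HandleFunc("/readyz", readyz)
	return mux
}

// probe sends an in-memory GET request to h and returns the status code,
// so the example runs without binding a port.
func probe(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	mux := newMux()

	// Before startup has finished: alive, but not ready.
	fmt.Println(probe(mux, "/healthz"), probe(mux, "/readyz")) // 200 503

	// After startup has finished: alive and ready.
	ready.Store(true)
	fmt.Println(probe(mux, "/healthz"), probe(mux, "/readyz")) // 200 200
}
```

Keeping the health handler free of any dependency checks means it can only fail when the process itself stops answering. That still would not help if the handler goroutines are starved of CPU, which is the open question above.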

This needs more investigation.