bug: workspace object deletion seems to lag behind logicalcluster deletion by a considerable margin #4012

@SimonTheLeg

Description

Describe the bug

Especially on larger setups, you can run into situations where, even 5-10 minutes after the logicalcluster is already gone, kubectl get workspace still reports the workspace as Ready.

Here is an example. We can see in monitoring that all logicalclusters have been gone for several minutes at this point.

[screenshot: monitoring graph showing the logicalcluster count at zero]

However, if we query for them using kubectl, the workspace still shows up and is reported as Ready:

❯ k get ws
loadtest-999   organization            Ready   https://frontproxy.kcp-loadtests.kcp.shoot.canary.k8s-hana.ondemand.com:443/clusters/root:loadtest-999   35m

The workspace content (or, more precisely, the logicalcluster content) cannot be accessed anymore:

❯ k --server https://frontproxy.kcp-loadtests.kcp.shoot.canary.k8s-hana.ondemand.com:443/clusters/root:loadtest-999 get ns bla
Error from server (Forbidden): forbidden: User "" cannot get path "/clusters/root:loadtest-999/api/v1/namespaces/bla": workspace access not permitted

Steps To Reproduce

  1. Set up kcp with metrics
  2. Create a couple of thousand workspaces
  3. Delete all the workspaces
  4. Note how the metrics show the logicalclusters are gone, while k get workspaces still reports the workspaces as Ready
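A minimal sketch of the reproduction loop, assuming the kubectl create workspace / delete workspace verbs from kcp's kubectl plugin are available and the kubeconfig points at the parent workspace; the workspace names, the count, and the RUN toggle are illustrative, not from the original report:

```shell
#!/usr/bin/env sh
# By default this only prints the commands it would run (dry run);
# set RUN=1 to execute them against a real kcp setup.
COUNT="${COUNT:-3}"

run() {
  # Print the command; execute it only when RUN=1.
  echo "$@"
  if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

reproduce() {
  i=1
  while [ "$i" -le "$COUNT" ]; do
    run kubectl create workspace "loadtest-$i"   # step 2: create many workspaces
    i=$((i + 1))
  done
  i=1
  while [ "$i" -le "$COUNT" ]; do
    run kubectl delete workspace "loadtest-$i"   # step 3: delete them all
    i=$((i + 1))
  done
  run kubectl get workspaces                     # step 4: compare with the metrics
}

reproduce
```

Comparing the logicalcluster count in the metrics against the output of the final kubectl get workspaces is what exposes the lag described above.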

Expected Behaviour

After a logicalcluster is gone, the workspace object should disappear swiftly as well.

Additional Context

No response

Labels

kind/bug
