feat: implement working reconciliation logic and documentation

Initial commit of working reconciliation logic. No automated tests yet; manually tested only for now.
2026-01-12 22:57:22 +08:00
committed by GitHub
parent b638d72402
commit 86e92c5e72
18 changed files with 810 additions and 655 deletions

View File

@@ -2,9 +2,10 @@ name: Build and Push Docker Image
on:
push:
branches: [ "main", "master" ]
branches: ["main", "master"]
pull_request:
branches: [ "main", "master" ]
branches: ["main", "master"]
workflow_dispatch:
env:
REGISTRY: ghcr.io

View File

@@ -20,4 +20,4 @@ jobs:
- name: Run linter
uses: golangci/golangci-lint-action@v8
with:
version: v2.1.0
version: v2.7.2

View File

@@ -1,32 +0,0 @@
name: E2E Tests
on:
push:
pull_request:
jobs:
test-e2e:
name: Run on Ubuntu
runs-on: ubuntu-latest
steps:
- name: Clone the code
uses: actions/checkout@v4
- name: Setup Go
uses: actions/setup-go@v5
with:
go-version-file: go.mod
- name: Install the latest version of kind
run: |
curl -Lo ./kind https://kind.sigs.k8s.io/dl/latest/kind-linux-amd64
chmod +x ./kind
sudo mv ./kind /usr/local/bin/kind
- name: Verify kind installation
run: kind version
- name: Running Test e2e
run: |
go mod tidy
make test-e2e

View File

@@ -20,4 +20,4 @@ jobs:
- name: Running Tests
run: |
go mod tidy
make test
make test ENVTEST_K8S_VERSION=1.31

View File

@@ -242,7 +242,7 @@ CONTROLLER_TOOLS_VERSION ?= v0.18.0
ENVTEST_VERSION ?= $(shell go list -m -f "{{ .Version }}" sigs.k8s.io/controller-runtime | awk -F'[v.]' '{printf "release-%d.%d", $$2, $$3}')
#ENVTEST_K8S_VERSION is the version of Kubernetes to use for setting up ENVTEST binaries (i.e. 1.31)
ENVTEST_K8S_VERSION ?= $(shell go list -m -f "{{ .Version }}" k8s.io/api | awk -F'[v.]' '{printf "1.%d", $$3}')
GOLANGCI_LINT_VERSION ?= v2.1.0
GOLANGCI_LINT_VERSION ?= v2.7.2
.PHONY: kustomize
kustomize: $(KUSTOMIZE) ## Download kustomize locally if necessary.

225
README.md
View File

@@ -1,82 +1,187 @@
# Overview
# Gitea Runner Operator
Operator to manage gitea Act runner on Kubernetes
A Kubernetes Operator to manage ephemeral Gitea Act runners. This operator automatically spawns runner pods based on queued jobs and supports global, org/user, and repo-level runners. Definitely vibe-coded (don't worry, I know what I am doing).
# How it works?
## Features
1. It installs a set of CRDs: `kind: RunnerGroup` in Kubernetes
- **Ephemeral Runners**: Each job gets a fresh runner which is destroyed after execution.
- **Multiple Scopes**: Support for `global`, `org`, `user`, and `repo` level runners.
- **Auto-Scaling**: Automatically scales runners up to a configured maximum based on queued jobs.
- **Label Matching**: Matches Gitea job labels (e.g., `ubuntu-latest`) to runner capabilities.
## Prerequisites
- **Kubernetes Cluster**: v1.23+
- **Gitea**: v1.25.0+ (with Actions enabled)
## Installation (Helm Chart)
### Incoming
## Installation (Manual)
### 1. Deploy the Operator
You can deploy the operator using the provided manifests.
```bash
# Clone the repository
git clone https://github.com/bapung/gitea-runner-operator.git
cd gitea-runner-operator
# Install CRDs
make install
# Deploy the controller to the cluster
make deploy IMG=ghcr.io/bapung/gitea-runner-operator:latest
```
### 2. Create Credentials Secret
Create a secret containing the Gitea Registration Token and an API Auth Token.
1. **Registration Token**: Get this from Gitea Admin -> Actions -> Runners -> Create new Runner (or Org/Repo settings).
2. **Auth Token**: Generate a token in Gitea User Settings -> Applications. It needs `read:repository`, `read:user` permissions.
```yaml
apiVersion: v1
kind: Secret
metadata:
name: gitea-runner-secret
namespace: gitea-runner-operator-system
type: Opaque
stringData:
registrationToken: "<YOUR_REGISTRATION_TOKEN>"
authToken: "<YOUR_API_TOKEN>"
```
Apply it:
```bash
kubectl apply -f secret.yaml
```
## Configuration
The core resource is the `RunnerGroup`. Below are examples for different scopes.
### 1. Repository Scope
Spawns runners only for jobs in a specific repository.
```yaml
apiVersion: gitea.bpg.pw/v1alpha1
kind: RunnerGroup
metadata:
name: my-repo-runner-1
namespace: gitea-runner-system
name: my-repo-runner
namespace: gitea-runner-operator-system
spec:
scope: repo
org: myorg # optional; omitted if scope == global
repo: myreponame # optional; omitted if scope == org || scope == global
gitea:
url: https://gitea.bpg.pw
org: myorg
repo: myrepo
giteaURL: https://gitea.example.com
maxActiveRunners: 5
labels:
- default
- app:infra
maxActiveRunners: 5 #
registrationToken: # registration token for runner
- "ubuntu-latest"
- "custom-label"
registrationToken:
secretRef:
name: gitea-runner-secret-0
name: gitea-runner-secret
key: registrationToken
authToken: # token to get list of job status
authToken:
secretRef:
name: gitea-runner-secret-0
name: gitea-runner-secret
key: authToken
```
2. The RunnerGroup controller will continuously watch for queued jobs based on its scope: `global`, `org`, or `repo`. If a new workflow run is detected with `status: queued`, based on the RunnerGroup's labels, the controller will spawn a new ephemeral runner as a Job.
### 2. Organization Scope
Spawns runners for any repository within the organization.
```yaml
apiVersion: batch/v1
kind: Job
apiVersion: gitea.bpg.pw/v1alpha1
kind: RunnerGroup
metadata:
name: my-repo-runner-1-275f1b8f
labels:
app: my-repo-runner-1
# tags to determine that this resource is managed by the Operator
name: my-org-runner
namespace: gitea-runner-operator-system
spec:
# Optional: Automatically clean up the job after it finishes (e.g., 600 seconds)
ttlSecondsAfterFinished: 600
template:
metadata:
labels:
app: act-my-repo-runner-1
spec:
restartPolicy: OnFailure
securityContext:
fsGroup: 1000
volumes:
- name: runner-data
persistentVolumeClaim:
claimName: act-runner-vol
containers:
- name: runner
image: gitea/act_runner:nightly-dind-rootless
imagePullPolicy: Always
env:
- name: DOCKER_HOST
value: tcp://localhost:2376
- name: DOCKER_CERT_PATH
value: /certs/client
- name: DOCKER_TLS_VERIFY
value: "1"
- name: GITEA_INSTANCE_URL
value: https://gitea.bpg.pw
- name: GITEA_RUNNER_EPHEMERAL # always ephemeral
value: "1"
- name: GITEA_RUNNER_REGISTRATION_TOKEN
valueFrom:
secretKeyRef:
name: gitea-runner-secret-0
key: registrationToken
securityContext:
privileged: true
scope: org
org: myorg
# repo is omitted
giteaURL: https://gitea.example.com
maxActiveRunners: 10
# ... (tokens)
```
### 3. User Scope
Spawns runners for any repository owned by the specified user.
```yaml
apiVersion: gitea.bpg.pw/v1alpha1
kind: RunnerGroup
metadata:
name: my-user-runner
namespace: gitea-runner-operator-system
spec:
scope: user
user: myusername
# org and repo are omitted
giteaURL: https://gitea.example.com
maxActiveRunners: 3
# ... (tokens)
```
### 4. Global Scope
Spawns runners for any job in the Gitea instance (Admin level).
```yaml
apiVersion: gitea.bpg.pw/v1alpha1
kind: RunnerGroup
metadata:
name: global-runner
namespace: gitea-runner-operator-system
spec:
scope: global
# org, user, and repo are omitted
giteaURL: https://gitea.example.com
maxActiveRunners: 20
# ... (tokens)
```
## How it works
1. The **Controller** polls the Gitea API (using the `authToken`) to check for queued jobs matching the scope and labels.
2. If a matching queued job is found, and the current active runner count is below `maxActiveRunners`, the Controller creates a Kubernetes `Job`.
3. The `Job` pod starts an `act_runner` instance, registers itself using the `registrationToken` (as ephemeral), picks up the job, executes it, and then terminates.
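The capacity rule in step 2 can be sketched as a small pure function. This is a minimal illustration, not the operator's code: `runnersToSpawn` and its parameter names are hypothetical, and it assumes one runner is wanted per queued job.

```go
package main

import "fmt"

// runnersToSpawn returns how many new runner Jobs fit under the cap.
// queued is the number of matching queued Gitea jobs, active is the number
// of runner Jobs that have not finished, and maxActive mirrors maxActiveRunners.
func runnersToSpawn(queued, active, maxActive int) int {
	slots := maxActive - active
	if slots <= 0 || queued <= 0 {
		return 0
	}
	if queued < slots {
		return queued
	}
	return slots
}

func main() {
	fmt.Println(runnersToSpawn(8, 3, 5)) // prints 2: only two slots are free
	fmt.Println(runnersToSpawn(1, 0, 5)) // prints 1: one job, plenty of slots
	fmt.Println(runnersToSpawn(4, 5, 5)) // prints 0: already at the cap
}
```

Keeping the rule free of Kubernetes types makes it easy to unit test separately from the reconcile loop.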
## Troubleshooting
### Runners are not starting
1. **Check Controller Logs**:
```bash
kubectl logs -n gitea-runner-operator-system -l control-plane=controller-manager -f
```
Look for errors regarding API authentication or connectivity.
2. **Check Permissions**:
Ensure the `authToken` has sufficient permissions (`read:repository`, etc.) to query actions.
3. **Check Labels**:
Enable debug logging in the controller to see label matching logic. If your Gitea job requires `ubuntu-latest` but your RunnerGroup defines `centos`, it won't match.
### Docker Daemon Issues
This is the default rootless Job template from the Gitea docs, and it has issues with the Docker daemon. I still can't get it working with the `docker` command; other containers work fine if you set the correct labels.
Per Gemini:
The default runner image uses `dind-rootless`. This requires the pod to run with `privileged: true`. Ensure your cluster policies (PSP/PSA) allow privileged pods in the operator namespace.
## Roadmap / Wishlist
- Helm Chart
- Custom Runner Job Spec definition
- Push mode using Webhook trigger

View File

@@ -32,14 +32,16 @@ const (
RunnerGroupScopeGlobal RunnerGroupScope = "global"
// RunnerGroupScopeOrg means the runner group is scoped to an organization
RunnerGroupScopeOrg RunnerGroupScope = "org"
// RunnerGroupScopeUser means the runner group is scoped to a user
RunnerGroupScopeUser RunnerGroupScope = "user"
// RunnerGroupScopeRepo means the runner group is scoped to a repository
RunnerGroupScopeRepo RunnerGroupScope = "repo"
)
// RunnerGroupSpec defines the desired state of RunnerGroup.
type RunnerGroupSpec struct {
// Scope defines the scope of the runner (global, org, repo)
// +kubebuilder:validation:Enum=global;org;repo
// Scope defines the scope of the runner (global, org, user, repo)
// +kubebuilder:validation:Enum=global;org;user;repo
// +kubebuilder:validation:Required
Scope RunnerGroupScope `json:"scope"`
@@ -47,6 +49,10 @@ type RunnerGroupSpec struct {
// +optional
Org string `json:"org,omitempty"`
// User is required if scope is 'user'
// +optional
User string `json:"user,omitempty"`
// Repo is required if scope is 'repo'
// +optional
Repo string `json:"repo,omitempty"`

View File

@@ -107,12 +107,17 @@ spec:
description: Repo is required if scope is 'repo'
type: string
scope:
description: Scope defines the scope of the runner (global, org, repo)
description: Scope defines the scope of the runner (global, org, user,
repo)
enum:
- global
- org
- user
- repo
type: string
user:
description: User is required if scope is 'user'
type: string
required:
- authToken
- giteaURL

View File

@@ -0,0 +1,10 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: controller-manager
namespace: system
spec:
template:
spec:
imagePullSecrets:
- name: ghcr-secret

View File

@@ -1,2 +1,11 @@
resources:
- manager.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
images:
- name: controller
newName: ghcr.io/bapung/gitea-runner-operator
newTag: sha-b33c78b
patchesStrategicMerge:
- image_pull_secret_patch.yaml

View File

@@ -10,6 +10,6 @@ roleRef:
kind: ClusterRole
name: manager-role
subjects:
- kind: ServiceAccount
name: controller-manager
namespace: system
- kind: ServiceAccount
name: controller-manager
namespace: gitea-runner-operator-system

View File

@@ -1,3 +1,16 @@
apiVersion: v1
kind: Secret
metadata:
name: gitea-credentials
labels:
app.kubernetes.io/name: gitea-runner-operator
app.kubernetes.io/managed-by: kustomize
stringData:
# The Gitea API Token (for the Operator to poll for jobs)
auth-token: "<YOUR_API_TOKEN>"
# The Runner Registration Token (for the Runner to register itself)
registration-token: "<YOUR_REGISTRATION_TOKEN>"
---
apiVersion: gitea.bpg.pw/v1alpha1
kind: RunnerGroup
metadata:
@@ -6,4 +19,29 @@ metadata:
app.kubernetes.io/managed-by: kustomize
name: runnergroup-sample
spec:
# TODO(user): Add fields here
# The base URL of your Gitea instance
giteaURL: "https://gitea.bpg.pw"
# Scope of the runners (global, org, user, or repo)
scope: "org"
#org: "bapungorg" # Required if scope is 'org' or 'repo'; cannot be used with user
user: "bapung" # Required if scope is 'user' or 'repo'; cannot be used with org
#repo: "dummy-service-workflow" # Required if scope is 'repo'
# Labels to identify this runner group
labels:
- "linux"
- "amd64"
# Maximum number of runners to spawn concurrently
maxActiveRunners: 5
# Reference to the Secret containing the API token
authToken:
name: gitea-credentials
key: auth-token
# Reference to the Secret containing the Registration token
registrationToken:
name: gitea-credentials
key: registration-token

View File

@@ -30,18 +30,23 @@ type RunnerGroupScope string
const (
RunnerGroupScopeGlobal RunnerGroupScope = "global"
RunnerGroupScopeOrg RunnerGroupScope = "org"
RunnerGroupScopeUser RunnerGroupScope = "user"
RunnerGroupScopeRepo RunnerGroupScope = "repo"
)
type RunnerGroupSpec struct {
// Scope defines the scope of the runner (global, org, repo)
// +kubebuilder:validation:Enum=global;org;repo
// Scope defines the scope of the runner (global, org, user, repo)
// +kubebuilder:validation:Enum=global;org;user;repo
Scope RunnerGroupScope `json:"scope"`
// Org is required if scope is 'org'
// +optional
Org string `json:"org,omitempty"`
// User is required if scope is 'user'
// +optional
User string `json:"user,omitempty"`
// Repo is required if scope is 'repo'
// +optional
Repo string `json:"repo,omitempty"`
@@ -49,7 +54,8 @@ type RunnerGroupSpec struct {
// GiteaURL is the base URL of the Gitea instance
GiteaURL string `json:"giteaURL"`
// Labels to assign to the runner
// Labels to assign to the runner.
// Defaults (e.g. ubuntu-latest) are merged automatically by the controller.
// +optional
Labels []string `json:"labels,omitempty"`
@@ -79,154 +85,103 @@ type RunnerGroupStatus struct {
## 4. Controller Implementation (`internal/controller/runnergroup_controller.go`)
The controller handles the reconciliation loop.
The controller handles the reconciliation loop and manages the lifecycle of ephemeral runners.
### 4.1 RBAC Permissions
### 4.1 Struct Definition
Add markers to generate RBAC roles:
The reconciler includes a thread-safe map to cache spawned jobs and prevent duplicate scheduling.
```go
// +kubebuilder:rbac:groups=gitea.bpg.pw,resources=runnergroups,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=gitea.bpg.pw,resources=runnergroups/status,verbs=get;update;patch
// +kubebuilder:rbac:groups=batch,resources=jobs,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups="",resources=secrets,verbs=get;list;watch
type RunnerGroupReconciler struct {
client.Client
Scheme *runtime.Scheme
GiteaClient gitea.Client
SpawnedJobsCache sync.Map // Stores [int64]time.Time (JobID -> SpawnTime)
}
```
### 4.2 Reconcile Logic
The `Reconcile` function should follow this flow:
The `Reconcile` function follows this flow:
1. **Fetch RunnerGroup**: Get the `RunnerGroup` CR instance. If not found, ignore (deleted).
2. **List Jobs**: List all `batchv1.Job` resources in the same namespace that are owned by this RunnerGroup.
- Filter by label `gitea.bpg.pw/runnergroup-name=<runnergroup-name>`.
3. **Update Status**: Update `status.activeRunners` with the count of non-completed jobs.
4. **Capacity Check**:
- If `activeRunners >= spec.maxActiveRunners`, stop and requeue.
5. **Poll Gitea**:
- Retrieve the Auth Token from the Secret referenced in `spec.authToken`.
- Instantiate a Gitea API Client.
- Query for queued workflow runs matching the scope and labels.
6. **Scale Up**:
- Calculate `needed = count(queued_jobs)`.
- Calculate `available_slots = spec.maxActiveRunners - activeRunners`.
- `to_spawn = min(needed, available_slots)`.
- Loop `to_spawn` times:
- Create a new `batchv1.Job`.
7. **Requeue**: Return `ctrl.Result{RequeueAfter: 10 * time.Second}` to ensure continuous polling.
1. **Fetch RunnerGroup**: Get the `RunnerGroup` CR instance.
2. **List Jobs**: List all `batchv1.Job` resources owned by this CR to calculate `activeRunners`.
3. **Update Status**: Update `status.activeRunners`.
4. **Capacity Check**: Stop scaling if `activeRunners >= spec.maxActiveRunners`.
5. **Label Calculation**: Call `getEffectiveLabels` to merge `spec.labels` with hardcoded Gitea defaults (e.g., `ubuntu-latest:docker://node:16-bullseye`).
6. **Poll Gitea**:
- Retrieve Auth Token.
- Call `GiteaClient.GetRunnerStats` with the effective labels.
- This returns a list of `QueuedJobs`.
7. **Scale Up & Deduplication**:
- Iterate through `stats.QueuedJobs`.
- **Check Cache**: If Job ID exists in `SpawnedJobsCache`:
- If TTL (< 5 min) is valid: **Skip** (already handled).
- If TTL expired: **Retry** (assume previous runner failed).
- If Job ID not in cache or expired:
- Check `availableSlots`.
- Retrieve Registration Token (if not yet fetched).
- **Spawn Job**: Create `batchv1.Job`.
- **Update Cache**: Store Job ID in `SpawnedJobsCache`.
- Decrement `availableSlots`.
8. **Cache Cleanup**: Remove IDs from `SpawnedJobsCache` if they are not present in the latest `QueuedJobs` list from Gitea.
9. **Requeue**: Return `ctrl.Result{RequeueAfter: 10 * time.Second}`.
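The deduplication and cleanup steps above can be sketched in isolation. This is a hedged illustration under stated assumptions: `spawnCache`, `shouldSpawn`, `markSpawned`, `prune`, and the `minute` helper are hypothetical names, and the time is passed in explicitly so the behaviour can be checked without waiting five minutes.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// spawnTTL mirrors the 5-minute retry window described in the flow above.
const spawnTTL = 5 * time.Minute

// spawnCache is a hypothetical wrapper around sync.Map.
// Keys are Gitea job IDs (int64); values are the time a runner was spawned.
type spawnCache struct {
	ttl     time.Duration
	entries sync.Map
}

// shouldSpawn reports whether a runner should be created for jobID at time now.
func (c *spawnCache) shouldSpawn(jobID int64, now time.Time) bool {
	value, loaded := c.entries.Load(jobID)
	if !loaded {
		return true // never seen: spawn
	}
	spawnedAt, ok := value.(time.Time)
	if !ok {
		return true // unexpected entry type: treat as unseen
	}
	// Inside the TTL a runner is assumed to be starting; after it, retry.
	return now.Sub(spawnedAt) >= c.ttl
}

// markSpawned records that a runner was created for jobID.
func (c *spawnCache) markSpawned(jobID int64, at time.Time) {
	c.entries.Store(jobID, at)
}

// prune drops entries whose job is no longer queued in Gitea.
func (c *spawnCache) prune(queued map[int64]bool) {
	c.entries.Range(func(key, _ any) bool {
		if id, ok := key.(int64); ok && !queued[id] {
			c.entries.Delete(key)
		}
		return true
	})
}

// minute returns a fixed reference time plus m minutes (example helper only).
func minute(m int) time.Time {
	return time.Unix(0, 0).Add(time.Duration(m) * time.Minute)
}

func main() {
	c := &spawnCache{ttl: spawnTTL}
	c.markSpawned(1, minute(9)) // spawned one minute before "now"
	c.markSpawned(2, minute(0)) // spawned ten minutes before "now"
	now := minute(10)

	fmt.Println(c.shouldSpawn(1, now)) // false: still inside the TTL
	fmt.Println(c.shouldSpawn(2, now)) // true: TTL expired, retry
	fmt.Println(c.shouldSpawn(3, now)) // true: never seen

	c.prune(map[int64]bool{2: true})   // job 1 left the queue
	fmt.Println(c.shouldSpawn(1, now)) // true: entry was removed
}
```

One design note: because the cache lives in memory, it is lost when the controller restarts, so a restart may spawn a second runner for a job that is still queued.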
### 4.3 Job Construction
### 4.3 Helper Functions
Helper function to create the Job object:
#### getEffectiveLabels
```go
func (r *RunnerGroupReconciler) constructJobForRunnerGroup(runnerGroup *giteav1alpha1.RunnerGroup, registrationToken string) (*batchv1.Job, error) {
// Generate random suffix for name
name := fmt.Sprintf("%s-%s", runnerGroup.Name, randString(5))
Merges user-defined labels with Gitea defaults. If a user defines `ubuntu-latest`, it overrides the default `ubuntu-latest:docker://...`.
// Construct Env Vars
envVars := []corev1.EnvVar{
{Name: "GITEA_INSTANCE_URL", Value: runnerGroup.Spec.GiteaURL},
{Name: "GITEA_RUNNER_REGISTRATION_TOKEN", Value: registrationToken},
{Name: "GITEA_RUNNER_EPHEMERAL", Value: "true"},
{Name: "DOCKER_HOST", Value: "tcp://localhost:2376"},
// ... other envs from README
}
#### constructJobForRunnerGroup
if len(runnerGroup.Spec.Labels) > 0 {
labelsStr := strings.Join(runnerGroup.Spec.Labels, ",")
envVars = append(envVars, corev1.EnvVar{Name: "GITEA_RUNNER_LABELS", Value: labelsStr})
}
Creates the Job object with:
// Construct Job
job := &batchv1.Job{
ObjectMeta: metav1.ObjectMeta{
Name: name,
Namespace: runnerGroup.Namespace,
Labels: map[string]string{
"app": runnerGroup.Name,
"gitea.bpg.pw/runnergroup-name": runnerGroup.Name,
"gitea.bpg.pw/managed-by": "gitea-runner-operator",
},
},
Spec: batchv1.JobSpec{
TTLSecondsAfterFinished: pointer.Int32(600),
Template: corev1.PodTemplateSpec{
Spec: corev1.PodSpec{
RestartPolicy: corev1.RestartPolicyOnFailure,
Containers: []corev1.Container{
{
Name: "runner",
Image: "gitea/act_runner:nightly-dind-rootless",
ImagePullPolicy: corev1.PullAlways,
SecurityContext: &corev1.SecurityContext{Privileged: pointer.Bool(true)},
Env: envVars,
VolumeMounts: []corev1.VolumeMount{
{Name: "runner-data", MountPath: "/data"},
},
},
},
Volumes: []corev1.Volume{
{
Name: "runner-data",
VolumeSource: corev1.VolumeSource{
PersistentVolumeClaim: &corev1.PersistentVolumeClaimVolumeSource{
ClaimName: "act-runner-vol", // Note: Consider making this configurable or EmptyDir
},
},
},
},
},
},
},
}
// Set Controller Reference
if err := ctrl.SetControllerReference(runnerGroup, job, r.Scheme); err != nil {
return nil, err
}
return job, nil
}
```
- **Name**: `{runnergroup-name}-{random-suffix}`
- **Env**:
- `GITEA_RUNNER_NAME`: Set to the Job name.
- `GITEA_RUNNER_LABELS`: Comma-separated effective labels.
- Standard runner envs (`GITEA_INSTANCE_URL`, etc).
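A minimal sketch of the environment assembly described above, using a local `envVar` struct as a stand-in for `corev1.EnvVar` so that it runs without Kubernetes dependencies. The `runnerEnv` helper is hypothetical, and the registration token and Docker variables are left out for brevity.

```go
package main

import (
	"fmt"
	"strings"
)

// envVar is a stand-in for corev1.EnvVar (assumption made for this sketch).
type envVar struct {
	Name  string
	Value string
}

// runnerEnv assembles the environment for one runner pod.
// jobName doubles as the runner name so a runner can be traced back to its Job.
func runnerEnv(jobName, giteaURL string, labels []string) []envVar {
	env := []envVar{
		{Name: "GITEA_INSTANCE_URL", Value: giteaURL},
		{Name: "GITEA_RUNNER_EPHEMERAL", Value: "true"},
		{Name: "GITEA_RUNNER_NAME", Value: jobName},
	}
	// Only set the labels variable when there is something to advertise.
	if len(labels) > 0 {
		env = append(env, envVar{
			Name:  "GITEA_RUNNER_LABELS",
			Value: strings.Join(labels, ","),
		})
	}
	return env
}

func main() {
	labels := []string{"linux", "ubuntu-latest:docker://node:16-bullseye"}
	for _, e := range runnerEnv("my-repo-runner-ab12cd34", "https://gitea.example.com", labels) {
		fmt.Printf("%s=%s\n", e.Name, e.Value)
	}
}
```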
## 5. Gitea Client (`internal/gitea/client.go`)
A simple HTTP client wrapper to interact with Gitea.
A specialized client to interact with Gitea's Actions API.
### 5.1 Interface
```go
type RunnerStats struct {
QueuedJobs []ActionWorkflowJob
Running int
}
type Client interface {
GetQueuedRuns(ctx context.Context, scope RunnerGroupScope, owner, repo string, labels []string) (int, error)
GetRunnerStats(ctx context.Context, giteaURL, authToken string, scope RunnerGroupScope, org, user, repo string, labels []string) (*RunnerStats, error)
}
```
### 5.2 Implementation Details
### 5.2 Logic
- **Endpoint**: `/api/v1/repos/{owner}/{repo}/actions/runs`
- **Query Params**: `status=queued`
- **Filtering**:
- The API might return all queued runs.
- The client must filter these runs locally to ensure they match the `labels` defined in the RunnerGroup CR.
- _Note_: Gitea API might not support filtering by labels directly in the list endpoint, so client-side filtering is necessary.
1. **Endpoints**:
- Repo/Org/Global: Uses `/actions/jobs` endpoints.
- User: Fetches repos via `/users/{user}/repos`, then queries `/actions/jobs` for each repo.
2. **Fetching**:
- Fetches jobs whose status is `queued`, `waiting`, or `pending`.
- Handles pagination (fetches all pages).
3. **Filtering**:
- Iterates through fetched jobs.
- **Matches Labels**: Checks if the job's required labels are a subset of the runner's supported labels (effective labels).
- Supports exact match (`linux` == `linux`)
- Supports schema match (`ubuntu-latest` matches `ubuntu-latest:docker://...`)
- Returns only matching jobs in `QueuedJobs`.
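The subset and schema matching rules above can be sketched as follows. This is an illustration only: `labelName` and `matchesLabels` are hypothetical names, and treating a job with no required labels as a match is an assumption rather than confirmed behaviour.

```go
package main

import (
	"fmt"
	"strings"
)

// labelName strips an optional ":schema" suffix, e.g.
// "ubuntu-latest:docker://node:16-bullseye" becomes "ubuntu-latest".
func labelName(label string) string {
	name, _, _ := strings.Cut(label, ":")
	return name
}

// matchesLabels reports whether every label a job requires is offered by
// the runner. Offered labels are compared by name, so a schema suffix on the
// runner side does not prevent a match.
func matchesLabels(required, offered []string) bool {
	names := make(map[string]struct{}, len(offered))
	for _, l := range offered {
		names[labelName(l)] = struct{}{}
	}
	for _, r := range required {
		if _, ok := names[r]; !ok {
			return false
		}
	}
	// Assumption: an empty requirement list matches any runner.
	return true
}

func main() {
	offered := []string{"linux", "ubuntu-latest:docker://node:16-bullseye"}

	fmt.Println(matchesLabels([]string{"linux"}, offered))                   // true: exact match
	fmt.Println(matchesLabels([]string{"ubuntu-latest"}, offered))           // true: schema match
	fmt.Println(matchesLabels([]string{"ubuntu-latest", "centos"}, offered)) // false: centos missing
}
```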
## 6. Configuration & Deployment
## 6. Testing Strategy
### 6.1 Dockerfile
Standard Operator SDK Dockerfile. Ensure the base image is minimal (e.g., `gcr.io/distroless/static:nonroot`).
### 6.2 Kustomize
Update `config/default/kustomization.yaml` to include the CRD and RBAC configurations.
## 7. Testing Strategy
1. **Unit Tests**:
- Test `constructJobForRunnerGroup` to ensure Env vars and Labels are set correctly.
- Test Gitea Client response parsing.
2. **Integration Tests (EnvTest)**:
- Spin up a local k8s control plane.
- Create a `RunnerGroup` CR.
- Verify the controller creates a `Job` when the mocked Gitea client returns queued jobs.
- Verify the controller respects `MaxActiveRunners`.
1. **Unit Tests (`internal/gitea/client_test.go`)**:
- Mock Gitea API server.
- Verify `GetRunnerStats` correctly parses JSON and handles pagination.
- Verify label matching logic (subset, schema matching).
2. **Controller Tests**:
- Verify `SpawnedJobsCache` prevents double scheduling.
- Verify TTL logic allows retries for stuck jobs.
- Verify `getEffectiveLabels` merging logic.
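The pagination check in the unit-test plan could be exercised against an in-process fake server along these lines. Everything here is an assumption made for illustration: the `jobs` field in the JSON body, the `page` and `limit` query parameters, the shortened `/actions/jobs` path, and the helper names `newFakeGitea` and `fetchAllJobs`. The real client and the Gitea response schema may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strconv"
)

// job and jobsPage are assumed shapes for this sketch, not Gitea's full schema.
type job struct {
	ID int64 `json:"id"`
}

type jobsPage struct {
	Jobs []job `json:"jobs"`
}

// newFakeGitea serves `total` jobs, paginated by the page and limit parameters.
func newFakeGitea(total int) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		page, _ := strconv.Atoi(r.URL.Query().Get("page"))
		limit, _ := strconv.Atoi(r.URL.Query().Get("limit"))
		if page < 1 || limit < 1 {
			http.Error(w, "bad paging parameters", http.StatusBadRequest)
			return
		}
		out := jobsPage{Jobs: []job{}}
		for i := (page - 1) * limit; i < page*limit && i < total; i++ {
			out.Jobs = append(out.Jobs, job{ID: int64(i + 1)})
		}
		w.Header().Set("Content-Type", "application/json")
		_ = json.NewEncoder(w).Encode(out)
	}))
}

// fetchAllJobs requests pages until one comes back with fewer than limit jobs.
func fetchAllJobs(baseURL string, limit int) ([]job, error) {
	var all []job
	for page := 1; ; page++ {
		url := fmt.Sprintf("%s/actions/jobs?status=queued&page=%d&limit=%d", baseURL, page, limit)
		resp, err := http.Get(url)
		if err != nil {
			return nil, err
		}
		if resp.StatusCode != http.StatusOK {
			resp.Body.Close()
			return nil, fmt.Errorf("unexpected status %s", resp.Status)
		}
		var body jobsPage
		err = json.NewDecoder(resp.Body).Decode(&body)
		resp.Body.Close()
		if err != nil {
			return nil, err
		}
		all = append(all, body.Jobs...)
		if len(body.Jobs) < limit {
			return all, nil // short page: nothing further to fetch
		}
	}
}

func main() {
	srv := newFakeGitea(5)
	defer srv.Close()

	jobs, err := fetchAllJobs(srv.URL, 2)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(jobs)) // prints 5: pages of 2, 2, and 1
}
```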

View File

@@ -21,6 +21,7 @@ import (
"fmt"
"math/rand"
"strings"
"sync"
"time"
batchv1 "k8s.io/api/batch/v1"
@@ -40,8 +41,9 @@ import (
// RunnerGroupReconciler reconciles a RunnerGroup object
type RunnerGroupReconciler struct {
client.Client
Scheme *runtime.Scheme
GiteaClient gitea.Client
Scheme *runtime.Scheme
GiteaClient gitea.Client
SpawnedJobsCache sync.Map
}
// +kubebuilder:rbac:groups=gitea.bpg.pw,resources=runnergroups,verbs=get;list;watch;create;update;patch;delete
@@ -117,56 +119,93 @@ func (r *RunnerGroupReconciler) Reconcile(ctx context.Context, req ctrl.Request)
logger.Info("Checking Gitea for queued jobs", "url", runnerGroup.Spec.GiteaURL, "scope", runnerGroup.Spec.Scope)
// Calculate effective labels (spec labels + defaults)
effectiveLabels := r.getEffectiveLabels(runnerGroup.Spec.Labels)
// Query for queued workflow runs
queuedJobs, err := r.GiteaClient.GetQueuedRuns(
stats, err := r.GiteaClient.GetRunnerStats(
ctx,
runnerGroup.Spec.GiteaURL,
authToken,
runnerGroup.Spec.Scope,
runnerGroup.Spec.Org,
runnerGroup.Spec.User,
runnerGroup.Spec.Repo,
runnerGroup.Spec.Labels,
effectiveLabels,
)
if err != nil {
logger.Error(err, "Failed to query Gitea for queued runs")
logger.Error(err, "Failed to query Gitea for runner stats")
return ctrl.Result{RequeueAfter: 10 * time.Second}, err
}
logger.Info("Gitea query result", "queuedJobs", queuedJobs)
logger.Info("Gitea query result", "queuedJobs", len(stats.QueuedJobs))
// 6. Scale Up
// 6. Scale Up and Cache Management
availableSlots := runnerGroup.Spec.MaxActiveRunners - activeRunners
toSpawn := min(queuedJobs, availableSlots)
if toSpawn > 0 {
logger.Info("Spawning runners",
"queuedJobs", queuedJobs,
"availableSlots", availableSlots,
"toSpawn", toSpawn)
// Track current queued IDs for cache cleanup
currentQueuedIDs := make(map[int64]bool)
// Retrieve Registration Token from Secret
registrationToken, err := r.getSecretValue(ctx, runnerGroup.Namespace, runnerGroup.Spec.RegistrationTokenRef)
// Retrieve Registration Token from Secret (only if we need to spawn)
var registrationToken string
tokenFetched := false
for _, giteaJob := range stats.QueuedJobs {
currentQueuedIDs[giteaJob.ID] = true
if availableSlots <= 0 {
continue
}
// Check if we already spawned a runner for this job
if value, loaded := r.SpawnedJobsCache.Load(giteaJob.ID); loaded {
spawnTime := value.(time.Time)
if time.Since(spawnTime) < 5*time.Minute {
// Already handling this job recently
continue
}
// TTL expired (runner likely failed to start), retry spawning
logger.Info("Job stuck in queue for too long, retrying runner spawn", "giteaJobID", giteaJob.ID)
}
// Need to spawn a runner
if !tokenFetched {
registrationToken, err = r.getSecretValue(ctx, runnerGroup.Namespace, runnerGroup.Spec.RegistrationTokenRef)
if err != nil {
logger.Error(err, "Failed to get registration token from secret")
return ctrl.Result{}, err
}
tokenFetched = true
}
job, err := r.constructJobForRunnerGroup(runnerGroup, registrationToken, effectiveLabels)
if err != nil {
logger.Error(err, "Failed to get registration token from secret")
logger.Error(err, "Failed to construct Job")
return ctrl.Result{}, err
}
// Spawn jobs
for i := 0; i < toSpawn; i++ {
job, err := r.constructJobForRunnerGroup(runnerGroup, registrationToken)
if err != nil {
logger.Error(err, "Failed to construct Job")
return ctrl.Result{}, err
}
if err := r.Create(ctx, job); err != nil {
logger.Error(err, "Failed to create Job", "jobName", job.Name)
return ctrl.Result{}, err
}
logger.Info("Created Job", "jobName", job.Name)
if err := r.Create(ctx, job); err != nil {
logger.Error(err, "Failed to create Job", "jobName", job.Name)
return ctrl.Result{}, err
}
logger.Info("Created Job for Gitea Run", "jobName", job.Name, "giteaJobID", giteaJob.ID)
// Mark as spawned
r.SpawnedJobsCache.Store(giteaJob.ID, time.Now())
availableSlots--
}
// Cleanup cache: remove jobs that are no longer queued in Gitea
r.SpawnedJobsCache.Range(func(key, value any) bool {
jobID := key.(int64)
if !currentQueuedIDs[jobID] {
// Job is no longer in the queue (running, completed, or cancelled)
r.SpawnedJobsCache.Delete(key)
}
return true
})
// 7. Requeue for continuous polling
return ctrl.Result{RequeueAfter: 10 * time.Second}, nil
}
@@ -191,8 +230,43 @@ func (r *RunnerGroupReconciler) getSecretValue(ctx context.Context, namespace st
return string(value), nil
}
// getEffectiveLabels merges spec labels with default labels
func (r *RunnerGroupReconciler) getEffectiveLabels(specLabels []string) []string {
defaultLabels := []string{
"ubuntu-latest:docker://node:16-bullseye",
"ubuntu-22.04:docker://node:16-bullseye",
"ubuntu-20.04:docker://node:16-bullseye",
"ubuntu-18.04:docker://node:16-buster",
}
effectiveLabels := make([]string, len(specLabels))
copy(effectiveLabels, specLabels)
for _, defaultLabel := range defaultLabels {
// Check if this default label key is already overridden in specLabels
// defaultLabel format is "key:schema"
parts := strings.SplitN(defaultLabel, ":", 2)
key := parts[0]
found := false
for _, specLabel := range specLabels {
// Spec label can be "key" or "key:schema"
if specLabel == key || strings.HasPrefix(specLabel, key+":") {
found = true
break
}
}
if !found {
effectiveLabels = append(effectiveLabels, defaultLabel)
}
}
return effectiveLabels
}
// constructJobForRunnerGroup creates a Job object for the RunnerGroup
func (r *RunnerGroupReconciler) constructJobForRunnerGroup(runnerGroup *giteav1alpha1.RunnerGroup, registrationToken string) (*batchv1.Job, error) {
func (r *RunnerGroupReconciler) constructJobForRunnerGroup(runnerGroup *giteav1alpha1.RunnerGroup, registrationToken string, labels []string) (*batchv1.Job, error) {
// Generate random suffix for name
name := fmt.Sprintf("%s-%s", runnerGroup.Name, randString(8))
@@ -201,13 +275,14 @@ func (r *RunnerGroupReconciler) constructJobForRunnerGroup(runnerGroup *giteav1a
{Name: "GITEA_INSTANCE_URL", Value: runnerGroup.Spec.GiteaURL},
{Name: "GITEA_RUNNER_REGISTRATION_TOKEN", Value: registrationToken},
{Name: "GITEA_RUNNER_EPHEMERAL", Value: "true"},
{Name: "GITEA_RUNNER_NAME", Value: name},
{Name: "DOCKER_HOST", Value: "tcp://localhost:2376"},
{Name: "DOCKER_CERT_PATH", Value: "/certs/client"},
{Name: "DOCKER_TLS_VERIFY", Value: "1"},
}
if len(runnerGroup.Spec.Labels) > 0 {
labelsStr := strings.Join(runnerGroup.Spec.Labels, ",")
if len(labels) > 0 {
labelsStr := strings.Join(labels, ",")
envVars = append(envVars, corev1.EnvVar{Name: "GITEA_RUNNER_LABELS", Value: labelsStr})
}
@@ -276,14 +351,6 @@ func randString(length int) string {
return string(b)
}
// min returns the minimum of two integers
func min(a, b int) int {
if a < b {
return a
}
return b
}
// SetupWithManager sets up the controller with the Manager.
func (r *RunnerGroupReconciler) SetupWithManager(mgr ctrl.Manager) error {
return ctrl.NewControllerManagedBy(mgr).

View File

@@ -25,11 +25,19 @@ import (
"k8s.io/apimachinery/pkg/types"
"sigs.k8s.io/controller-runtime/pkg/reconcile"
corev1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
giteav1alpha1 "github.com/bapung/gitea-runner-operator/api/v1alpha1"
"github.com/bapung/gitea-runner-operator/internal/gitea"
)
type fakeGiteaClient struct{}
func (c *fakeGiteaClient) GetRunnerStats(ctx context.Context, giteaURL, authToken string, scope giteav1alpha1.RunnerGroupScope, org string, user string, repo string, labels []string) (*gitea.RunnerStats, error) {
return &gitea.RunnerStats{QueuedJobs: []gitea.ActionWorkflowJob{}}, nil
}
var _ = Describe("RunnerGroup Controller", func() {
Context("When reconciling a resource", func() {
const resourceName = "test-resource"
@@ -43,6 +51,21 @@ var _ = Describe("RunnerGroup Controller", func() {
runnergroup := &giteav1alpha1.RunnerGroup{}
BeforeEach(func() {
By("creating the secret")
secret := &corev1.Secret{
ObjectMeta: metav1.ObjectMeta{
Name: "gitea-secret",
Namespace: "default",
},
Data: map[string][]byte{
"token": []byte("dummy"),
"auth": []byte("dummy"),
},
}
if err := k8sClient.Create(ctx, secret); err != nil && !errors.IsAlreadyExists(err) {
Expect(err).To(Succeed())
}
By("creating the custom resource for the Kind RunnerGroup")
err := k8sClient.Get(ctx, typeNamespacedName, runnergroup)
if err != nil && errors.IsNotFound(err) {
@@ -51,7 +74,19 @@ var _ = Describe("RunnerGroup Controller", func() {
Name: resourceName,
Namespace: "default",
},
// TODO(user): Specify other spec details if needed.
Spec: giteav1alpha1.RunnerGroupSpec{
Scope: giteav1alpha1.RunnerGroupScopeGlobal,
GiteaURL: "https://gitea.example.com",
MaxActiveRunners: 1,
RegistrationTokenRef: corev1.SecretKeySelector{
LocalObjectReference: corev1.LocalObjectReference{Name: "gitea-secret"},
Key: "token",
},
AuthTokenRef: corev1.SecretKeySelector{
LocalObjectReference: corev1.LocalObjectReference{Name: "gitea-secret"},
Key: "auth",
},
},
}
Expect(k8sClient.Create(ctx, resource)).To(Succeed())
}
@@ -69,8 +104,9 @@ var _ = Describe("RunnerGroup Controller", func() {
It("should successfully reconcile the resource", func() {
By("Reconciling the created resource")
controllerReconciler := &RunnerGroupReconciler{
Client: k8sClient,
Scheme: k8sClient.Scheme(),
Client: k8sClient,
Scheme: k8sClient.Scheme(),
GiteaClient: &fakeGiteaClient{},
}
_, err := controllerReconciler.Reconcile(ctx, reconcile.Request{

View File

@@ -31,17 +31,22 @@ import (
// Client defines the interface for interacting with Gitea API
type Client interface {
// GetQueuedRuns queries Gitea for queued workflow runs matching the scope and labels
// Returns the count of queued jobs that match the criteria
GetQueuedRuns(
// GetRunnerStats queries Gitea for queued workflow runs matching the scope and labels
GetRunnerStats(
ctx context.Context,
giteaURL string,
authToken string,
scope v1alpha1.RunnerGroupScope,
org string,
user string,
repo string,
labels []string,
) (int, error)
) (*RunnerStats, error)
}
// RunnerStats contains lists of jobs in different states
type RunnerStats struct {
QueuedJobs []ActionWorkflowJob
}
// HTTPClient is the default implementation of the Gitea Client interface
@@ -107,153 +112,163 @@ type ActionWorkflowJob struct {
RunnerName string `json:"runner_name"`
}
// GetQueuedRuns implements the Client interface
func (c *HTTPClient) GetQueuedRuns(
// GetRunnerStats implements the Client interface
func (c *HTTPClient) GetRunnerStats(
ctx context.Context,
giteaURL string,
authToken string,
scope v1alpha1.RunnerGroupScope,
org string,
user string,
repo string,
labels []string,
) (int, error) {
) (*RunnerStats, error) {
switch scope {
case v1alpha1.RunnerGroupScopeRepo:
return c.getQueuedRunsForRepo(ctx, giteaURL, authToken, org, repo, labels)
return c.getRunnerStatsForRepo(ctx, giteaURL, authToken, org, repo, labels)
case v1alpha1.RunnerGroupScopeOrg:
return c.getQueuedRunsForOrg(ctx, giteaURL, authToken, org, labels)
return c.getRunnerStatsForOrg(ctx, giteaURL, authToken, org, labels)
case v1alpha1.RunnerGroupScopeUser:
return c.getRunnerStatsForUser(ctx, giteaURL, authToken, user, labels)
case v1alpha1.RunnerGroupScopeGlobal:
return c.getQueuedRunsGlobal(ctx, giteaURL, authToken, labels)
return c.getRunnerStatsGlobal(ctx, giteaURL, authToken, labels)
default:
return 0, fmt.Errorf("unknown scope: %s", scope)
return nil, fmt.Errorf("unknown scope: %s", scope)
}
}
// getQueuedRunsForRepo fetches queued runs for a specific repository
func (c *HTTPClient) getQueuedRunsForRepo(ctx context.Context, giteaURL, authToken, owner, repo string, labels []string) (int, error) {
// Use jobs endpoint since it contains the runner labels we need for filtering
// getRunnerStatsForRepo fetches queued runs for a specific repository
func (c *HTTPClient) getRunnerStatsForRepo(ctx context.Context, giteaURL, authToken, owner, repo string, labels []string) (*RunnerStats, error) {
endpoint := fmt.Sprintf("%s/api/v1/repos/%s/%s/actions/jobs", strings.TrimSuffix(giteaURL, "/"), owner, repo)
return c.fetchWorkflowJobs(ctx, endpoint, authToken, labels)
return c.fetchRunnerStats(ctx, endpoint, authToken, labels)
}
// getQueuedRunsForOrg fetches queued runs for all repos under an organization
func (c *HTTPClient) getQueuedRunsForOrg(ctx context.Context, giteaURL, authToken, org string, labels []string) (int, error) {
// Use direct org-level jobs endpoint for better performance
// getRunnerStatsForOrg fetches queued runs for all repos under an organization
func (c *HTTPClient) getRunnerStatsForOrg(ctx context.Context, giteaURL, authToken, org string, labels []string) (*RunnerStats, error) {
endpoint := fmt.Sprintf("%s/api/v1/orgs/%s/actions/jobs", strings.TrimSuffix(giteaURL, "/"), org)
return c.fetchWorkflowJobs(ctx, endpoint, authToken, labels)
return c.fetchRunnerStats(ctx, endpoint, authToken, labels)
}
// getQueuedRunsGlobal fetches queued runs using admin-level API for global scope
func (c *HTTPClient) getQueuedRunsGlobal(ctx context.Context, giteaURL, authToken string, labels []string) (int, error) {
// Use admin-level jobs endpoint which provides global view of all queued jobs
// getRunnerStatsForUser fetches queued runs for all repos owned by a user
func (c *HTTPClient) getRunnerStatsForUser(ctx context.Context, giteaURL, authToken, user string, labels []string) (*RunnerStats, error) {
repos, err := c.fetchReposForUser(ctx, giteaURL, authToken, user)
if err != nil {
return nil, err
}
var allQueuedJobs []ActionWorkflowJob
for _, repo := range repos {
endpoint := fmt.Sprintf("%s/api/v1/repos/%s/%s/actions/jobs", strings.TrimSuffix(giteaURL, "/"), repo.Owner.Login, repo.Name)
stats, err := c.fetchRunnerStats(ctx, endpoint, authToken, labels)
if err != nil {
return nil, err
}
allQueuedJobs = append(allQueuedJobs, stats.QueuedJobs...)
}
return &RunnerStats{
QueuedJobs: allQueuedJobs,
}, nil
}
// getRunnerStatsGlobal fetches queued runs using admin-level API for global scope
func (c *HTTPClient) getRunnerStatsGlobal(ctx context.Context, giteaURL, authToken string, labels []string) (*RunnerStats, error) {
endpoint := fmt.Sprintf("%s/api/v1/admin/actions/jobs", strings.TrimSuffix(giteaURL, "/"))
return c.fetchWorkflowJobs(ctx, endpoint, authToken, labels)
return c.fetchRunnerStats(ctx, endpoint, authToken, labels)
}
func (c *HTTPClient) fetchRunnerStats(ctx context.Context, endpoint, authToken string, labels []string) (*RunnerStats, error) {
queuedJobs, err := c.fetchWorkflowJobs(ctx, endpoint, authToken, labels, []string{"queued", "waiting", "pending"})
if err != nil {
return nil, err
}
return &RunnerStats{
QueuedJobs: queuedJobs,
}, nil
}
// fetchWorkflowJobs fetches workflow jobs from a given endpoint with label filtering and pagination
func (c *HTTPClient) fetchWorkflowJobs(ctx context.Context, endpoint, authToken string, labels []string) (int, error) {
totalCount := 0
page := 1
limit := 50 // Default page size
func (c *HTTPClient) fetchWorkflowJobs(ctx context.Context, endpoint, authToken string, labels []string, statuses []string) ([]ActionWorkflowJob, error) {
var allJobs []ActionWorkflowJob
for {
u, err := url.Parse(endpoint)
if err != nil {
return 0, err
}
q := u.Query()
q.Set("status", "queued")
q.Set("page", fmt.Sprintf("%d", page))
q.Set("limit", fmt.Sprintf("%d", limit))
u.RawQuery = q.Encode()
for _, status := range statuses {
page := 1
limit := 50 // Default page size
req, err := http.NewRequestWithContext(ctx, "GET", u.String(), nil)
if err != nil {
return 0, err
}
for {
u, err := url.Parse(endpoint)
if err != nil {
return nil, err
}
q := u.Query()
q.Set("status", status)
q.Set("page", fmt.Sprintf("%d", page))
q.Set("limit", fmt.Sprintf("%d", limit))
u.RawQuery = q.Encode()
req.Header.Set("Authorization", "token "+authToken)
req.Header.Set("Accept", "application/json")
fmt.Printf("DEBUG: Fetching jobs from %s\n", u.String())
resp, err := c.httpClient.Do(req)
if err != nil {
return 0, err
}
req, err := http.NewRequestWithContext(ctx, "GET", u.String(), nil)
if err != nil {
return nil, err
}
req.Header.Set("Authorization", "token "+authToken)
req.Header.Set("Accept", "application/json")
resp, err := c.httpClient.Do(req)
if err != nil {
fmt.Printf("DEBUG: Request failed: %v\n", err)
return nil, err
}
fmt.Printf("DEBUG: Response status: %s\n", resp.Status)
if resp.StatusCode != http.StatusOK {
body, _ := io.ReadAll(resp.Body)
_ = resp.Body.Close()
fmt.Printf("DEBUG: Error body: %s\n", string(body))
return nil, c.handleHTTPError(resp.StatusCode, body, "fetch workflow jobs")
}
if resp.StatusCode != http.StatusOK {
body, _ := io.ReadAll(resp.Body)
resp.Body.Close()
return 0, c.handleHTTPError(resp.StatusCode, body, "fetch workflow jobs")
_ = resp.Body.Close()
fmt.Printf("DEBUG: Response body: %s\n", string(body))
var result ActionWorkflowJobsResponse
if err := json.Unmarshal(body, &result); err != nil {
fmt.Printf("DEBUG: Failed to decode response: %v\n", err)
return nil, err
}
fmt.Printf("DEBUG: Found %d jobs, total in Gitea: %d\n", len(result.Jobs), result.TotalCount)
// Filter and collect matching jobs for this page
matchedJobs := c.filterQueuedJobs(result.Jobs, labels)
fmt.Printf("DEBUG: %d jobs matched labels %v\n", len(matchedJobs), labels)
allJobs = append(allJobs, matchedJobs...)
// Break if we've fetched all available results
if len(result.Jobs) < limit {
break
}
page++
}
var result ActionWorkflowJobsResponse
if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
resp.Body.Close()
return 0, err
}
resp.Body.Close()
// Filter and count matching jobs for this page
pageCount := c.filterQueuedJobs(result.Jobs, labels)
totalCount += pageCount
// Break if we've fetched all available results
if len(result.Jobs) < limit {
break
}
page++
}
return totalCount, nil
return allJobs, nil
}
// fetchWorkflowRuns fetches workflow runs from a given endpoint (deprecated - use jobs for label filtering)
func (c *HTTPClient) fetchWorkflowRuns(ctx context.Context, endpoint, authToken string) ([]ActionWorkflowRun, error) {
// Add status=queued query parameter
u, err := url.Parse(endpoint)
if err != nil {
return nil, err
}
q := u.Query()
q.Set("status", "queued")
u.RawQuery = q.Encode()
req, err := http.NewRequestWithContext(ctx, "GET", u.String(), nil)
if err != nil {
return nil, err
}
req.Header.Set("Authorization", "token "+authToken)
req.Header.Set("Accept", "application/json")
resp, err := c.httpClient.Do(req)
if err != nil {
return nil, err
}
defer resp.Body.Close()
if resp.StatusCode != http.StatusOK {
body, _ := io.ReadAll(resp.Body)
return nil, c.handleHTTPError(resp.StatusCode, body, "fetch workflow runs")
}
var result ActionWorkflowRunsResponse
if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
return nil, err
}
return result.WorkflowRuns, nil
}
// fetchOrgRepos fetches all repositories under an organization with pagination
func (c *HTTPClient) fetchOrgRepos(ctx context.Context, giteaURL, authToken, org string) ([]Repository, error) {
// fetchReposForUser fetches all repositories owned by a specific user with pagination
func (c *HTTPClient) fetchReposForUser(ctx context.Context, giteaURL, authToken, username string) ([]Repository, error) {
var allRepos []Repository
page := 1
limit := 50
for {
endpoint := fmt.Sprintf("%s/api/v1/orgs/%s/repos", strings.TrimSuffix(giteaURL, "/"), org)
endpoint := fmt.Sprintf("%s/api/v1/users/%s/repos", strings.TrimSuffix(giteaURL, "/"), username)
u, err := url.Parse(endpoint)
if err != nil {
return nil, err
@@ -263,6 +278,8 @@ func (c *HTTPClient) fetchOrgRepos(ctx context.Context, giteaURL, authToken, org
q.Set("limit", fmt.Sprintf("%d", limit))
u.RawQuery = q.Encode()
fmt.Printf("DEBUG: Fetching repos for user %s from %s\n", username, u.String())
req, err := http.NewRequestWithContext(ctx, "GET", u.String(), nil)
if err != nil {
return nil, err
@@ -273,131 +290,28 @@ func (c *HTTPClient) fetchOrgRepos(ctx context.Context, giteaURL, authToken, org
resp, err := c.httpClient.Do(req)
if err != nil {
fmt.Printf("DEBUG: Request failed: %v\n", err)
return nil, err
}
fmt.Printf("DEBUG: Response status: %s\n", resp.Status)
if resp.StatusCode != http.StatusOK {
body, _ := io.ReadAll(resp.Body)
resp.Body.Close()
_ = resp.Body.Close()
fmt.Printf("DEBUG: Error body: %s\n", string(body))
return nil, c.handleHTTPError(resp.StatusCode, body, "fetch user repos")
}
var repos []Repository
if err := json.NewDecoder(resp.Body).Decode(&repos); err != nil {
resp.Body.Close()
return nil, err
}
resp.Body.Close()
allRepos = append(allRepos, repos...)
if len(repos) < limit {
break
}
page++
}
return allRepos, nil
}
// fetchAllOrgs fetches all organizations visible to the authenticated user with pagination
func (c *HTTPClient) fetchAllOrgs(ctx context.Context, giteaURL, authToken string) ([]Organization, error) {
var allOrgs []Organization
page := 1
limit := 50
for {
endpoint := fmt.Sprintf("%s/api/v1/user/orgs", strings.TrimSuffix(giteaURL, "/"))
u, err := url.Parse(endpoint)
if err != nil {
return nil, err
}
q := u.Query()
q.Set("page", fmt.Sprintf("%d", page))
q.Set("limit", fmt.Sprintf("%d", limit))
u.RawQuery = q.Encode()
req, err := http.NewRequestWithContext(ctx, "GET", u.String(), nil)
if err != nil {
return nil, err
}
req.Header.Set("Authorization", "token "+authToken)
req.Header.Set("Accept", "application/json")
resp, err := c.httpClient.Do(req)
if err != nil {
return nil, err
}
if resp.StatusCode != http.StatusOK {
body, _ := io.ReadAll(resp.Body)
resp.Body.Close()
return nil, c.handleHTTPError(resp.StatusCode, body, "fetch org repos")
}
var orgs []Organization
if err := json.NewDecoder(resp.Body).Decode(&orgs); err != nil {
resp.Body.Close()
return nil, err
}
resp.Body.Close()
allOrgs = append(allOrgs, orgs...)
if len(orgs) < limit {
break
}
page++
}
return allOrgs, nil
}
// fetchUserRepos fetches all repositories owned by the authenticated user with pagination
func (c *HTTPClient) fetchUserRepos(ctx context.Context, giteaURL, authToken string) ([]Repository, error) {
var allRepos []Repository
page := 1
limit := 50
for {
endpoint := fmt.Sprintf("%s/api/v1/user/repos", strings.TrimSuffix(giteaURL, "/"))
u, err := url.Parse(endpoint)
if err != nil {
return nil, err
}
q := u.Query()
q.Set("page", fmt.Sprintf("%d", page))
q.Set("limit", fmt.Sprintf("%d", limit))
u.RawQuery = q.Encode()
req, err := http.NewRequestWithContext(ctx, "GET", u.String(), nil)
if err != nil {
return nil, err
}
req.Header.Set("Authorization", "token "+authToken)
req.Header.Set("Accept", "application/json")
resp, err := c.httpClient.Do(req)
if err != nil {
return nil, err
}
if resp.StatusCode != http.StatusOK {
body, _ := io.ReadAll(resp.Body)
resp.Body.Close()
return nil, c.handleHTTPError(resp.StatusCode, body, "fetch user orgs")
}
body, _ := io.ReadAll(resp.Body)
_ = resp.Body.Close()
// fmt.Printf("DEBUG: Response body: %s\n", string(body))
var repos []Repository
if err := json.NewDecoder(resp.Body).Decode(&repos); err != nil {
resp.Body.Close()
if err := json.Unmarshal(body, &repos); err != nil {
fmt.Printf("DEBUG: Failed to decode response: %v\n", err)
return nil, err
}
resp.Body.Close()
allRepos = append(allRepos, repos...)
@@ -412,44 +326,42 @@ func (c *HTTPClient) fetchUserRepos(ctx context.Context, giteaURL, authToken str
}
// filterQueuedJobs filters workflow jobs by labels
func (c *HTTPClient) filterQueuedJobs(jobs []ActionWorkflowJob, requiredLabels []string) int {
if len(requiredLabels) == 0 {
// No label filtering required, return all queued jobs
return len(jobs)
}
count := 0
func (c *HTTPClient) filterQueuedJobs(jobs []ActionWorkflowJob, runnerLabels []string) []ActionWorkflowJob {
var matched []ActionWorkflowJob
for _, job := range jobs {
if c.jobMatchesLabels(job.Labels, requiredLabels) {
count++
match := c.jobMatchesLabels(job.Labels, runnerLabels)
fmt.Printf("DEBUG: Job %d (Status: %s, Labels: %v) matches runner capabilities %v? %v\n", job.ID, job.Status, job.Labels, runnerLabels, match)
if match {
matched = append(matched, job)
}
}
return count
return matched
}
// jobMatchesLabels checks if a job's labels match the required labels
func (c *HTTPClient) jobMatchesLabels(jobLabels, requiredLabels []string) bool {
// Convert job labels to map for faster lookup
labelSet := make(map[string]bool)
for _, label := range jobLabels {
labelSet[label] = true
// jobMatchesLabels checks if a job's requirements are satisfied by the runner's supported labels
func (c *HTTPClient) jobMatchesLabels(jobLabels, supportedLabels []string) bool {
if len(jobLabels) == 0 {
return true
}
// Check if all required labels are present
for _, required := range requiredLabels {
if !labelSet[required] {
// For each label required by the job, check if the runner supports it
for _, req := range jobLabels {
found := false
for _, supp := range supportedLabels {
// Check for exact match or schema match (label:schema)
// e.g. Job asks for "ubuntu-latest", Runner has "ubuntu-latest:docker://..."
if req == supp || strings.HasPrefix(supp, req+":") {
found = true
break
}
}
if !found {
return false
}
}
return true
}
// filterQueuedRuns filters workflow runs by labels (deprecated - use filterQueuedJobs)
func (c *HTTPClient) filterQueuedRuns(runs []ActionWorkflowRun, labels []string) int {
// Legacy method - jobs should be used for label filtering
return len(runs)
}
// handleHTTPError provides specific error handling for different HTTP status codes
func (c *HTTPClient) handleHTTPError(statusCode int, body []byte, operation string) error {
switch statusCode {

View File

@@ -27,16 +27,17 @@ import (
"github.com/bapung/gitea-runner-operator/api/v1alpha1"
)
func TestHTTPClient_GetQueuedRuns(t *testing.T) {
func TestHTTPClient_GetRunnerStats(t *testing.T) {
tests := []struct {
name string
scope v1alpha1.RunnerGroupScope
org string
repo string
labels []string
mockResponse ActionWorkflowJobsResponse
expectedCount int
expectedError bool
name string
scope v1alpha1.RunnerGroupScope
org string
user string
repo string
labels []string
mockResponse ActionWorkflowJobsResponse
expectedQueued int
expectedError bool
}{
{
name: "repo scope with matching labels",
@@ -51,38 +52,55 @@ func TestHTTPClient_GetQueuedRuns(t *testing.T) {
{ID: 2, Status: "queued", Labels: []string{"linux", "arm64"}},
},
},
expectedCount: 1,
expectedError: false,
expectedQueued: 1, // Job 1 matches
expectedError: false,
},
{
name: "org scope no label filtering",
name:   "org scope with no runner labels (matches nothing)",
scope: v1alpha1.RunnerGroupScopeOrg,
org: "testorg",
labels: []string{},
labels: []string{}, // No runner capabilities: only jobs with no label requirements can match.
mockResponse: ActionWorkflowJobsResponse{
TotalCount: 3,
Jobs: []ActionWorkflowJob{
{ID: 1, Status: "queued", Labels: []string{"linux", "x64"}},
{ID: 2, Status: "queued", Labels: []string{"windows"}},
{ID: 3, Status: "queued", Labels: []string{"macos"}},
{ID: 1, Status: "queued", Labels: []string{"linux"}},
},
},
expectedCount: 3,
expectedError: false,
expectedQueued: 0, // No runner capabilities provided -> no match
expectedError: false,
},
{
name: "global scope with specific labels",
scope: v1alpha1.RunnerGroupScopeGlobal,
labels: []string{"docker"},
labels: []string{"docker", "linux"},
mockResponse: ActionWorkflowJobsResponse{
TotalCount: 2,
Jobs: []ActionWorkflowJob{
{ID: 1, Status: "queued", Labels: []string{"docker", "linux"}},
{ID: 2, Status: "queued", Labels: []string{"linux"}},
{ID: 1, Status: "queued", Labels: []string{"docker", "linux"}}, // Match
{ID: 2, Status: "queued", Labels: []string{"linux"}}, // Match (subset)
},
},
expectedCount: 1,
expectedError: false,
expectedQueued: 2,
expectedError: false,
},
{
name: "user scope",
scope: v1alpha1.RunnerGroupScopeUser,
user: "testuser",
labels: []string{"linux"},
mockResponse: ActionWorkflowJobsResponse{
TotalCount: 1,
Jobs: []ActionWorkflowJob{
{ID: 1, Status: "queued", Labels: []string{"linux"}},
},
},
expectedQueued: 1,
expectedError: false,
},
}
@@ -90,6 +108,23 @@ func TestHTTPClient_GetQueuedRuns(t *testing.T) {
t.Run(tt.name, func(t *testing.T) {
// Create mock server
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
// Handle User Repos call for User Scope
if tt.scope == v1alpha1.RunnerGroupScopeUser && strings.Contains(r.URL.Path, "/repos") && !strings.Contains(r.URL.Path, "/actions/jobs") {
repos := []Repository{
{
Name: "testrepo",
Owner: struct {
Login string `json:"login"`
}{Login: tt.user},
FullName: tt.user + "/testrepo",
},
}
_ = json.NewEncoder(w).Encode(repos)
return
}
// Verify correct endpoint is called
expectedPath := ""
switch tt.scope {
@@ -99,35 +134,37 @@ func TestHTTPClient_GetQueuedRuns(t *testing.T) {
expectedPath = "/api/v1/orgs/testorg/actions/jobs"
case v1alpha1.RunnerGroupScopeGlobal:
expectedPath = "/api/v1/admin/actions/jobs"
case v1alpha1.RunnerGroupScopeUser:
expectedPath = "/api/v1/repos/" + tt.user + "/testrepo/actions/jobs"
}
if !strings.HasPrefix(r.URL.Path, expectedPath) {
t.Errorf("Expected path to start with %s, got %s", expectedPath, r.URL.Path)
}
// Verify query parameters
if r.URL.Query().Get("status") != "queued" {
t.Errorf("Expected status=queued, got %s", r.URL.Query().Get("status"))
}
// Verify authorization header
authHeader := r.Header.Get("Authorization")
if !strings.HasPrefix(authHeader, "token ") {
t.Errorf("Expected Authorization header to start with 'token ', got %s", authHeader)
}
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(tt.mockResponse)
// Only return jobs for 'queued' status to simplify counting
if r.URL.Query().Get("status") == "queued" {
_ = json.NewEncoder(w).Encode(tt.mockResponse)
} else {
_ = json.NewEncoder(w).Encode(ActionWorkflowJobsResponse{TotalCount: 0, Jobs: []ActionWorkflowJob{}})
}
}))
defer server.Close()
client := NewHTTPClient()
count, err := client.GetQueuedRuns(
stats, err := client.GetRunnerStats(
context.Background(),
server.URL,
"test-token",
tt.scope,
tt.org,
tt.user,
tt.repo,
tt.labels,
)
@@ -138,8 +175,10 @@ func TestHTTPClient_GetQueuedRuns(t *testing.T) {
if !tt.expectedError && err != nil {
t.Errorf("Expected no error but got: %v", err)
}
if count != tt.expectedCount {
t.Errorf("Expected count %d, got %d", tt.expectedCount, count)
if stats != nil {
if len(stats.QueuedJobs) != tt.expectedQueued {
t.Errorf("Expected %d queued jobs, got %d", tt.expectedQueued, len(stats.QueuedJobs))
}
}
})
}
@@ -149,46 +188,46 @@ func TestJobMatchesLabels(t *testing.T) {
client := &HTTPClient{}
tests := []struct {
name string
jobLabels []string
requiredLabels []string
expected bool
name string
jobLabels []string
supportedLabels []string
expected bool
}{
{
name: "exact match",
jobLabels: []string{"linux", "x64"},
requiredLabels: []string{"linux", "x64"},
expected: true,
name: "exact match",
jobLabels: []string{"linux", "x64"},
supportedLabels: []string{"linux", "x64"},
expected: true,
},
{
name: "subset match",
jobLabels: []string{"linux", "x64", "docker"},
requiredLabels: []string{"linux", "x64"},
expected: true,
name: "subset match (runner has more)",
jobLabels: []string{"linux"},
supportedLabels: []string{"linux", "x64"},
expected: true,
},
{
name: "no match",
jobLabels: []string{"linux", "arm64"},
requiredLabels: []string{"linux", "x64"},
expected: false,
name: "schema match",
jobLabels: []string{"ubuntu-latest"},
supportedLabels: []string{"ubuntu-latest:docker://node:16"},
expected: true,
},
{
name: "empty required labels",
jobLabels: []string{"linux", "x64"},
requiredLabels: []string{},
expected: true,
name: "no match (missing req)",
jobLabels: []string{"linux", "arm64"},
supportedLabels: []string{"linux", "x64"},
expected: false,
},
{
name: "partial match",
jobLabels: []string{"linux"},
requiredLabels: []string{"linux", "x64"},
expected: false,
name: "empty required labels (matches anything)",
jobLabels: []string{},
supportedLabels: []string{"linux"},
expected: true,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
result := client.jobMatchesLabels(tt.jobLabels, tt.requiredLabels)
result := client.jobMatchesLabels(tt.jobLabels, tt.supportedLabels)
if result != tt.expected {
t.Errorf("Expected %v, got %v", tt.expected, result)
}
@@ -207,42 +246,32 @@ func TestFilterQueuedJobs(t *testing.T) {
}
tests := []struct {
name string
requiredLabels []string
expectedCount int
name string
supportedLabels []string
expectedIDs []int64
}{
{
name: "filter by linux",
requiredLabels: []string{"linux"},
expectedCount: 3,
name: "runner supports linux, x64",
supportedLabels: []string{"linux", "x64"},
expectedIDs: []int64{1},
},
{
name: "filter by linux and x64",
requiredLabels: []string{"linux", "x64"},
expectedCount: 2,
name: "runner supports linux, x64, docker",
supportedLabels: []string{"linux", "x64", "docker"},
expectedIDs: []int64{1, 4},
},
{
name: "filter by docker",
requiredLabels: []string{"docker"},
expectedCount: 1,
},
{
name: "no labels - return all",
requiredLabels: []string{},
expectedCount: 4,
},
{
name: "no matches",
requiredLabels: []string{"macos"},
expectedCount: 0,
name: "runner supports everything",
supportedLabels: []string{"linux", "x64", "arm64", "windows", "docker"},
expectedIDs: []int64{1, 2, 3, 4},
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
count := client.filterQueuedJobs(jobs, tt.requiredLabels)
if count != tt.expectedCount {
t.Errorf("Expected %d, got %d", tt.expectedCount, count)
matched := client.filterQueuedJobs(jobs, tt.supportedLabels)
if len(matched) != len(tt.expectedIDs) {
t.Errorf("Expected %d matched jobs, got %d", len(tt.expectedIDs), len(matched))
}
})
}

View File

@@ -10,6 +10,8 @@ The Gitea Runner Operator is a Kubernetes controller designed to manage ephemera
- **RunnerGroup CR**: The custom resource instance defining a runner pool.
- **Ephemeral Runner**: A runner that executes exactly one job and then terminates.
- **Gitea Instance**: The target Gitea server where CI/CD workflows are triggered.
- **Runner Capabilities**: The set of labels a runner provides (e.g., `ubuntu-latest`).
- **Job Requirements**: The set of labels a job requests (e.g., `ubuntu-latest`).
## 3. Custom Resource Definition (CRD)
@@ -24,16 +26,17 @@ The Gitea Runner Operator is a Kubernetes controller designed to manage ephemera
The `spec` defines the configuration for the runner pool.
| Field | Type | Required | Description |
| :------------------ | :----------------------------- | :---------- | :---------------------------------------------------------------------------------------------------------- |
| `scope` | Enum (`global`, `org`, `repo`) | Yes | The scope of the runner. |
| `org` | String | Conditional | The organization name. Required if `scope` is `org`. |
| `repo` | String | Conditional | The repository name. Required if `scope` is `repo`. |
| `gitea.url` | String | Yes | The base URL of the Gitea instance (e.g., `https://gitea.example.com`). |
| `labels` | []String | No | List of labels for the runner (e.g., `ubuntu-latest`, `app:infra`). Used by Gitea to match jobs to runners. |
| `maxActiveRunners` | Integer | Yes | The maximum number of concurrent runner Jobs allowed for this specific RunnerGroup CR. |
| `registrationToken` | SecretKeySelector | Yes | Reference to a Secret containing the runner registration token. |
| `authToken` | SecretKeySelector | Yes | Reference to a Secret containing an API token to query Gitea for job statuses. |
| Field | Type | Required | Description |
| :------------------ | :------------------------------------- | :---------- | :---------------------------------------------------------------------------------------------------------- |
| `scope` | Enum (`global`, `org`, `user`, `repo`) | Yes | The scope of the runner. |
| `org` | String | Conditional | The organization name. Required if `scope` is `org`. |
| `user` | String | Conditional | The username. Required if `scope` is `user`. |
| `repo` | String | Conditional | The repository name. Required if `scope` is `repo`. |
| `gitea.url` | String | Yes | The base URL of the Gitea instance (e.g., `https://gitea.example.com`). |
| `labels` | []String | No | List of labels for the runner (e.g., `app:infra`). Defaults (e.g. `ubuntu-latest`) are added automatically. |
| `maxActiveRunners` | Integer | Yes | The maximum number of concurrent runner Jobs allowed for this specific RunnerGroup CR. |
| `registrationToken` | SecretKeySelector | Yes | Reference to a Secret containing the runner registration token. |
| `authToken` | SecretKeySelector | Yes | Reference to a Secret containing an API token to query Gitea for job statuses. |
#### 3.2.1 SecretKeySelector
@@ -42,7 +45,7 @@ Standard Kubernetes Secret reference:
- `secretRef.name`: Name of the secret.
- `secretRef.key`: Key within the secret containing the value.
### 3.3 Status Schema (Optional but Recommended)
### 3.3 Status Schema
- `activeRunners`: Integer. Current count of running Jobs managed by this CR.
- `lastCheckTime`: Timestamp. Last time the controller polled Gitea.
@@ -54,37 +57,44 @@ Standard Kubernetes Secret reference:
The controller watches for changes to `RunnerGroup` resources.
1. **Validation**: Ensure `org`, `user`, or `repo` is set as required by `scope`.
2. **Job Cleanup**: (Optional) Check for and remove "stuck" jobs if TTL doesn't cover edge cases, though `ttlSecondsAfterFinished` is primary.
3. **Metric Collection**: Update status with current running job count.
4. **Polling**: The controller must implement a polling mechanism (loop) independent of the standard Reconcile trigger, or requeue the Reconcile event periodically (e.g., every 10-30 seconds).
2. **Job List**: List child Jobs to determine `activeRunners` count.
3. **Status Update**: Update CR status with current metrics.
4. **Capacity Check**: If `activeRunners >= maxActiveRunners`, stop scaling up.
5. **Polling**: Fetch job statistics from Gitea.
### 4.2 Polling & Scaling Logic
### 4.2 Polling & Scaling Strategy
On every poll interval for a specific `RunnerGroup` CR:
The operator polls Gitea and deduplicates its spawn decisions to bridge the delay between a runner Pod starting in Kubernetes and the corresponding job leaving Gitea's queue.
1. **Check Capacity**:
- Query Kubernetes for active `Jobs` owned by this `RunnerGroup` CR.
- If `count(active_jobs) >= maxActiveRunners`, stop. Do not spawn new runners.
#### 4.2.1 Fetching Stats (`GetRunnerStats`)
2. **Fetch Queued Jobs**:
- Call Gitea API using `authToken`.
- Endpoint depends on scope:
- **Global**: Recursively fetch all workflow runs:
1. Fetch all organizations in the Gitea instance
2. For each organization, fetch all repositories under that org
3. For each repository, query `/repos/{owner}/{repo}/actions/runs?status=queued`
4. Additionally, fetch all user-owned repositories and query their workflow runs
- **Org**: Fetch all workflow runs in repos under the organization:
1. Fetch all repositories under the specified organization
2. For each repository, query `/repos/{owner}/{repo}/actions/runs?status=queued`
- **Repo**: Directly query `/repos/{owner}/{repo}/actions/runs?status=queued`
- Filter the returned runs:
- Must match the `labels` defined in the `RunnerGroup` CR.
The controller queries Gitea for:
3. **Spawn Runner**:
- If a queued job is found and capacity allows, create a Kubernetes `Job`.
- **One Job per Queued Workflow**: Ideally, the logic should map 1 queued run -> 1 Runner Job.
- **Concurrency Control**: Ensure we don't spawn more jobs than `maxActiveRunners - currentActiveRunners`.
1. **Queued Jobs**: Jobs with status `queued`, `waiting`, or `pending`.
- **Label Filtering**: Jobs are filtered client-side. A job is considered a match if the RunnerGroup's capabilities (Spec labels + Default labels) are a superset of the Job's required labels.
2. **Running Jobs**: Jobs with status `running` that belong to this specific runner group (filtered by runner name prefix).
#### 4.2.2 Deduplication Cache (`SpawnedJobsCache`)
To prevent "double scheduling" (where multiple reconciliation loops spawn multiple runners for the same queued job before the first runner can pick it up), the controller maintains an in-memory cache:
- **Key**: Gitea Job ID.
- **Value**: Timestamp when the runner was spawned.
- **TTL**: 5 minutes.
#### 4.2.3 Scaling Algorithm
1. **Identify Candidates**: Iterate through the list of Queued Jobs from Gitea.
2. **Check Cache**:
- If Job ID is in cache and TTL has not expired: **Skip** (Runner already spawned).
- If Job ID is in cache and TTL expired: **Retry** (Runner likely failed to start).
- If Job ID is not in cache: **Candidate for spawning**.
3. **Calculate Slots**: `availableSlots = maxActiveRunners - activeRunners`.
4. **Spawn**: For each candidate, if `availableSlots > 0`:
- Create Kubernetes Job.
- Add Job ID to `SpawnedJobsCache`.
- Decrement `availableSlots`.
5. **Cleanup**: Remove Job IDs from the cache if they are no longer present in the Queued Jobs list returned by Gitea (implies they are now Running, Completed, or Cancelled).
## 5. Kubernetes Resource Generation
@@ -94,40 +104,44 @@ The controller creates a `batch/v1 Job`.
**Metadata:**
- `name`: `{runnergroup-cr-name}-{random-suffix}`
- `name`: `{runnergroup-name}-{random-suffix}`
- `namespace`: Same as `RunnerGroup` CR.
- `labels`:
- `app`: `{runnergroup-cr-name}`
- `gitea.bpg.pw/runnergroup-name`: `{runnergroup-name}`
- `gitea.bpg.pw/managed-by`: `gitea-runner-operator`
- `gitea.bpg.pw/runnergroup-name`: `{runnergroup-cr-name}`
- `ownerReferences`: Pointing to the `RunnerGroup` CR.
**Spec:**
- `ttlSecondsAfterFinished`: 600 (Clean up finished jobs).
- `ttlSecondsAfterFinished`: 600 (Auto-cleanup).
- `template`:
- `spec`:
- `restartPolicy`: `OnFailure`
- `containers`:
- **Name**: `runner`
- **Image**: `gitea/act_runner:nightly-dind-rootless` (Default, potentially configurable in CR later).
- **SecurityContext**: `privileged: true` (Required for DIND).
- **Image**: `gitea/act_runner:nightly-dind-rootless`
- **Env**:
- `GITEA_INSTANCE_URL`: From `spec.gitea.url`.
- `GITEA_RUNNER_REGISTRATION_TOKEN`: From `spec.registrationToken`.
- `GITEA_RUNNER_REGISTRATION_TOKEN`: From Secret.
- `GITEA_RUNNER_EPHEMERAL`: `"true"`.
- `GITEA_RUNNER_LABELS`: Comma-separated list from `spec.labels`.
- `DOCKER_HOST`: `tcp://localhost:2376`
- **VolumeMounts**:
- Mount docker socket or storage if necessary. The README example uses a PVC `act-runner-vol` mounted to `/data`. _Note: Using a shared PVC for ephemeral runners might cause race conditions. EmptyDir is preferred for truly ephemeral runners unless caching is strictly required and managed._
- `GITEA_RUNNER_NAME`: `{job-name}` (Matches Pod name for easier debugging).
- `GITEA_RUNNER_LABELS`: Comma-separated list of **Effective Labels**.
- **Effective Labels** = `spec.labels` + Default Gitea Labels (e.g., `ubuntu-latest:docker://node:16-bullseye`, `ubuntu-22.04:...`, etc.) unless explicitly overridden.
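
One possible reading of the "Effective Labels" rule is sketched below in Go. It assumes that "explicitly overridden" means a `spec.labels` entry with the same label name (the part before the first `:`) replaces the default with that name; the source does not spell this out. The function names and the contents of `defaultLabels` are illustrative, and only the single default that the text writes out in full is included.

```go
package main

import "strings"

// defaultLabels is a placeholder list. Only this entry is spelled out in the
// design text; the real default set is assumed to be longer.
var defaultLabels = []string{"ubuntu-latest:docker://node:16-bullseye"}

// EffectiveLabels returns spec labels followed by any default whose label
// name is not already provided by the spec (assumed override semantics).
func EffectiveLabels(specLabels []string) []string {
	seen := make(map[string]bool)
	var out []string
	add := func(label string) {
		name, _, _ := strings.Cut(label, ":")
		if name == "" || seen[name] {
			return
		}
		seen[name] = true
		out = append(out, label)
	}
	for _, l := range specLabels {
		add(l)
	}
	for _, l := range defaultLabels {
		add(l)
	}
	return out
}

// LabelsEnv renders the value for GITEA_RUNNER_LABELS.
func LabelsEnv(specLabels []string) string {
	return strings.Join(EffectiveLabels(specLabels), ",")
}
```

With these assumptions, a spec containing `ubuntu-latest:host` would suppress the default `ubuntu-latest:docker://...` entry, while a spec with unrelated labels would get the defaults appended.
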
## 6. Gitea API Interaction
- **Authentication**: Bearer token provided in `authToken`.
- **Client**: HTTP Client with timeout.
- **Endpoints Used**:
- `/api/v1/repos/{owner}/{repo}/actions/jobs` (Repo scope)
- `/api/v1/orgs/{org}/actions/jobs` (Org scope)
- `/api/v1/users/{user}/repos` + `/api/v1/repos/{owner}/{repo}/actions/jobs` (User scope)
- `/api/v1/admin/actions/jobs` (Global scope)
- **Label Matching**:
- The controller implements logic to check: `Job.Labels ⊆ Runner.EffectiveLabels`.
- Supports both exact matches (`linux`) and name-only matches against runner labels that carry a schema suffix (`ubuntu-latest` matches `ubuntu-latest:docker://...`).
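
The subset check under Label Matching could look roughly like this in Go. It is a sketch under the assumption that a runner label's name is everything before its first `:`, which is consistent with the `ubuntu-latest` example; `labelName` and `CanRun` are hypothetical names, not the controller's API.

```go
package main

import "strings"

// labelName strips an optional schema suffix from a runner label,
// e.g. "ubuntu-latest:docker://node:16-bullseye" becomes "ubuntu-latest".
func labelName(label string) string {
	name, _, _ := strings.Cut(strings.TrimSpace(label), ":")
	return name
}

// CanRun reports whether every label requested by the job is offered by the
// runner, i.e. jobLabels is a subset of the runner's effective label names.
func CanRun(jobLabels, effectiveLabels []string) bool {
	offered := make(map[string]struct{}, len(effectiveLabels))
	for _, l := range effectiveLabels {
		offered[labelName(l)] = struct{}{}
	}
	for _, want := range jobLabels {
		if _, ok := offered[want]; !ok {
			return false
		}
	}
	return true
}
```

A job with no labels is treated as runnable by any runner here, since the empty set is a subset of every set; whether the real controller should behave that way is a design decision the source does not address.
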
## 7. Security Considerations
- **Token Handling**: Registration and Auth tokens are read from Kubernetes Secrets and injected as Environment Variables. They are not stored in plain text in the CR.
- **Privileged Mode**: The default `act_runner` image (dind) requires privileged mode. The Operator creates Jobs with this permission.
- **Namespace Isolation**: The Operator should respect RBAC and only operate within allowed namespaces.
- **Token Handling**: Tokens are injected via `valueFrom: secretKeyRef` env vars.
- **Privileged Mode**: `act_runner` dind mode requires privileged security context.
- **Namespace Isolation**: Controller operates within the namespace of the RunnerGroup.