initial commit for working reconciliation logic, no automated test only manually tested for now
7.6 KiB
Gitea Runner Operator Specification
1. Overview
The Gitea Runner Operator is a Kubernetes controller designed to manage ephemeral Gitea Act runners. It automates the provisioning of runner pods based on the demand of queued jobs in a Gitea instance. By defining RunnerGroup resources, users can configure pools of runners with specific scopes (global, organization, or repository) and labels.
2. Terminology
- CRD: Custom Resource Definition.
- RunnerGroup CR: The custom resource instance defining a runner pool.
- Ephemeral Runner: A runner that executes exactly one job and then terminates.
- Gitea Instance: The target Gitea server where CI/CD workflows are triggered.
- Runner Capabilities: The set of labels a runner provides (e.g.,
ubuntu-latest). - Job Requirements: The set of labels a job requests (e.g.,
ubuntu-latest).
3. Custom Resource Definition (CRD)
3.1 Metadata
- Group:
gitea.bpg.pw - Version:
v1alpha1 - Kind:
RunnerGroup - Scope: Namespaced
3.2 Spec Schema
The spec defines the configuration for the runner pool.
| Field | Type | Required | Description |
|---|---|---|---|
scope |
Enum (global, org, user, repo) |
Yes | The scope of the runner. |
org |
String | Conditional | The organization name. Required if scope is org. |
user |
String | Conditional | The username. Required if scope is user. |
repo |
String | Conditional | The repository name. Required if scope is repo. |
gitea.url |
String | Yes | The base URL of the Gitea instance (e.g., https://gitea.example.com). |
labels |
[]String | No | List of labels for the runner (e.g., app:infra). Defaults (e.g. ubuntu-latest) are added automatically. |
maxActiveRunners |
Integer | Yes | The maximum number of concurrent runner Jobs allowed for this specific RunnerGroup CR. |
registrationToken |
SecretKeySelector | Yes | Reference to a Secret containing the runner registration token. |
authToken |
SecretKeySelector | Yes | Reference to a Secret containing an API token to query Gitea for job statuses. |
3.2.1 SecretKeySelector
Standard Kubernetes Secret reference:
secretRef.name: Name of the secret.secretRef.key: Key within the secret containing the value.
3.3 Status Schema
activeRunners: Integer. Current count of running Jobs managed by this CR.lastCheckTime: Timestamp. Last time the controller polled Gitea.
4. Controller Logic
4.1 Reconciliation Loop
The controller watches for changes to RunnerGroup resources.
- Validation: Ensure
orgorrepoare present based onscope. - Job List: List child Jobs to determine
activeRunnerscount. - Status Update: Update CR status with current metrics.
- Capacity Check: If
activeRunners >= maxActiveRunners, stop scaling up. - Polling: Fetch job statistics from Gitea.
4.2 Polling & Scaling Strategy
The operator uses a robust polling strategy to handle the disconnect between Kubernetes Pod startup time and Gitea's job queue state.
4.2.1 Fetching Stats (GetRunnerStats)
The controller queries Gitea for:
- Queued Jobs: Jobs with status
queued,waiting, orpending.- Label Filtering: Jobs are filtered client-side. A job is considered a match if the RunnerGroup's capabilities (Spec labels + Default labels) are a superset of the Job's required labels.
- Running Jobs: Jobs with status
runningthat belong to this specific runner group (filtered by runner name prefix).
4.2.2 Deduplication Cache (SpawnedJobsCache)
To prevent "double scheduling" (where multiple reconciliation loops spawn multiple runners for the same queued job before the first runner can pick it up), the controller maintains an in-memory cache:
- Key: Gitea Job ID.
- Value: Timestamp when the runner was spawned.
- TTL: 5 minutes.
4.2.3 Scaling Algorithm
- Identify Candidates: Iterate through the list of Queued Jobs from Gitea.
- Check Cache:
- If Job ID is in cache and TTL has not expired: Skip (Runner already spawned).
- If Job ID is in cache and TTL expired: Retry (Runner likely failed to start).
- If Job ID is not in cache: Candidate for spawning.
- Calculate Slots:
availableSlots = maxActiveRunners - activeRunners. - Spawn: For each candidate, if
availableSlots > 0:- Create Kubernetes Job.
- Add Job ID to
SpawnedJobsCache. - Decrement
availableSlots.
- Cleanup: Remove Job IDs from the cache if they are no longer present in the Queued Jobs list returned by Gitea (implies they are now Running, Completed, or Cancelled).
5. Kubernetes Resource Generation
5.1 Job Specification
The controller creates a batch/v1 Job.
Metadata:
name:{runnergroup-name}-{random-suffix}namespace: Same asRunnerGroupCR.labels:gitea.bpg.pw/runnergroup-name:{runnergroup-name}gitea.bpg.pw/managed-by:gitea-runner-operator
ownerReferences: Pointing to theRunnerGroupCR.
Spec:
ttlSecondsAfterFinished: 600 (Auto-cleanup).template:spec:restartPolicy:OnFailurecontainers:- Name:
runner - Image:
gitea/act_runner:nightly-dind-rootless - Env:
GITEA_INSTANCE_URL: Fromspec.gitea.url.GITEA_RUNNER_REGISTRATION_TOKEN: From Secret.GITEA_RUNNER_EPHEMERAL:"true".GITEA_RUNNER_NAME:{job-name}(Matches Pod name for easier debugging).GITEA_RUNNER_LABELS: Comma-separated list of Effective Labels.- Effective Labels =
spec.labels+ Default Gitea Labels (e.g.,ubuntu-latest:docker://node:16-bullseye,ubuntu-22.04:..., etc.) unless explicitly overridden.
- Effective Labels =
- Name:
6. Gitea API Interaction
- Authentication: Bearer token provided in
authToken. - Endpoints Used:
/api/v1/repos/{owner}/{repo}/actions/jobs(Repo scope)/api/v1/orgs/{org}/actions/jobs(Org scope)/api/v1/users/{user}/repos+/api/v1/repos/{owner}/{repo}/actions/jobs(User scope)/api/v1/admin/actions/jobs(Global scope)
- Label Matching:
- The controller implements logic to check:
Job.Labels ⊆ Runner.EffectiveLabels. - Supports both exact matches (
linux) and schema matches (ubuntu-latestmatchesubuntu-latest:docker://...).
- The controller implements logic to check:
7. Security Considerations
- Token Handling: Tokens are injected via
valueFrom: secretKeyRefenv vars. - Privileged Mode:
act_runnerdind mode requires privileged security context. - Namespace Isolation: Controller operates within the namespace of the RunnerGroup.