mirror of
https://github.com/bapung/gitea-runner-operator.git
synced 2026-06-21 23:48:43 +00:00
Gitea Runner Operator Specification
1. Overview
The Gitea Runner Operator is a Kubernetes controller designed to manage ephemeral Gitea Act runners. It automates the provisioning of runner pods based on the demand of queued jobs in a Gitea instance. By defining RunnerGroup resources, users can configure pools of runners with specific scopes (global, organization, or repository) and labels.
2. Terminology
- CRD: Custom Resource Definition.
- RunnerGroup CR: The custom resource instance defining a runner pool.
- Ephemeral Runner: A runner that executes exactly one job and then terminates.
- Gitea Instance: The target Gitea server where CI/CD workflows are triggered.
3. Custom Resource Definition (CRD)
3.1 Metadata
- Group: `gitea.bpg.pw`
- Version: `v1alpha1`
- Kind: `RunnerGroup`
- Scope: Namespaced
3.2 Spec Schema
The spec defines the configuration for the runner pool.
| Field | Type | Required | Description |
|---|---|---|---|
| `scope` | Enum (`global`, `org`, `repo`) | Yes | The scope of the runner. |
| `org` | String | Conditional | The organization name. Required if `scope` is `org`. |
| `repo` | String | Conditional | The repository name. Required if `scope` is `repo`. |
| `gitea.url` | String | Yes | The base URL of the Gitea instance (e.g., `https://gitea.example.com`). |
| `labels` | []String | No | List of labels for the runner (e.g., `ubuntu-latest`, `app:infra`). Used by Gitea to match jobs to runners. |
| `maxActiveRunners` | Integer | Yes | The maximum number of concurrent runner Jobs allowed for this specific `RunnerGroup` CR. |
| `registrationToken` | SecretKeySelector | Yes | Reference to a Secret containing the runner registration token. |
| `authToken` | SecretKeySelector | Yes | Reference to a Secret containing an API token to query Gitea for job statuses. |
3.2.1 SecretKeySelector
Standard Kubernetes Secret reference:
- `secretRef.name`: Name of the secret.
- `secretRef.key`: Key within the secret containing the value.
3.3 Status Schema (Optional but Recommended)
- `activeRunners`: Integer. Current count of running Jobs managed by this CR.
- `lastCheckTime`: Timestamp. Last time the controller polled Gitea.
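The schema in sections 3.2 and 3.3 could be expressed as Go types along the following lines. This is a minimal sketch, not the operator's actual API package: the type names, the nesting of `gitea.url` into a `GiteaConfig` struct, the string representation of `lastCheckTime`, and the example values (`acme`, `runner-secrets`) are all assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SecretRef names one key inside a Kubernetes Secret.
type SecretRef struct {
	Name string `json:"name"`
	Key  string `json:"key"`
}

// SecretKeySelector mirrors section 3.2.1 (secretRef.name, secretRef.key).
type SecretKeySelector struct {
	SecretRef SecretRef `json:"secretRef"`
}

// GiteaConfig holds the gitea.url field (the nesting is an assumption).
type GiteaConfig struct {
	URL string `json:"url"`
}

// RunnerGroupSpec mirrors the field table in section 3.2.
type RunnerGroupSpec struct {
	Scope             string            `json:"scope"` // "global", "org" or "repo"
	Org               string            `json:"org,omitempty"`
	Repo              string            `json:"repo,omitempty"`
	Gitea             GiteaConfig       `json:"gitea"`
	Labels            []string          `json:"labels,omitempty"`
	MaxActiveRunners  int               `json:"maxActiveRunners"`
	RegistrationToken SecretKeySelector `json:"registrationToken"`
	AuthToken         SecretKeySelector `json:"authToken"`
}

// RunnerGroupStatus mirrors section 3.3.
type RunnerGroupStatus struct {
	ActiveRunners int    `json:"activeRunners"`
	LastCheckTime string `json:"lastCheckTime,omitempty"` // timestamp as a string (assumed)
}

// exampleSpec returns a hypothetical org-scoped RunnerGroup spec.
func exampleSpec() RunnerGroupSpec {
	return RunnerGroupSpec{
		Scope:            "org",
		Org:              "acme",
		Gitea:            GiteaConfig{URL: "https://gitea.example.com"},
		Labels:           []string{"ubuntu-latest", "app:infra"},
		MaxActiveRunners: 3,
		RegistrationToken: SecretKeySelector{
			SecretRef: SecretRef{Name: "runner-secrets", Key: "registration-token"},
		},
		AuthToken: SecretKeySelector{
			SecretRef: SecretRef{Name: "runner-secrets", Key: "api-token"},
		},
	}
}

func main() {
	// Print the spec as JSON to show the field layout a CR would carry.
	out, err := json.MarshalIndent(exampleSpec(), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Running the program prints the spec as JSON, which maps one-to-one onto the `spec` block of a `RunnerGroup` manifest.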
4. Controller Logic
4.1 Reconciliation Loop
The controller watches for changes to RunnerGroup resources.
- Validation: Ensure `org` or `repo` are present based on `scope`.
- Job Cleanup: (Optional) Check for and remove "stuck" jobs if TTL doesn't cover edge cases, though `ttlSecondsAfterFinished` is primary.
- Metric Collection: Update status with current running job count.
- Polling: The controller must implement a polling mechanism (loop) independent of the standard Reconcile trigger, or requeue the Reconcile event periodically (e.g., every 10-30 seconds).
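The validation step above can be sketched as a small pure function. The function name `validateScope` and the error messages are hypothetical; only the rule itself (`org` required for scope `org`, `repo` required for scope `repo`) comes from the spec table.

```go
package main

import (
	"errors"
	"fmt"
)

// validateScope checks the conditional fields from section 3.2:
// org is required for scope "org", repo is required for scope "repo".
func validateScope(scope, org, repo string) error {
	switch scope {
	case "global":
		return nil
	case "org":
		if org == "" {
			return errors.New("org is required when scope is org")
		}
		return nil
	case "repo":
		if repo == "" {
			return errors.New("repo is required when scope is repo")
		}
		return nil
	default:
		return fmt.Errorf("unknown scope %q (want global, org or repo)", scope)
	}
}

func main() {
	// Each case is {scope, org, repo}.
	cases := [][3]string{
		{"global", "", ""},
		{"org", "acme", ""},
		{"org", "", ""},
		{"repo", "", ""},
		{"cluster", "", ""},
	}
	for _, c := range cases {
		fmt.Printf("scope=%q org=%q repo=%q -> %v\n", c[0], c[1], c[2], validateScope(c[0], c[1], c[2]))
	}
}
```

Keeping the check free of Kubernetes types makes it easy to call from both the reconcile loop and, if one is added later, an admission webhook.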
4.2 Polling & Scaling Logic
On every poll interval for a specific RunnerGroup CR:
- Check Capacity:
  - Query Kubernetes for active `Jobs` owned by this `RunnerGroup` CR.
  - If `count(active_jobs) >= maxActiveRunners`, stop. Do not spawn new runners.
- Fetch Queued Jobs:
  - Call Gitea API using `authToken`.
  - Endpoint depends on scope:
    - Global: Recursively fetch all workflow runs:
      - Fetch all organizations in the Gitea instance
      - For each organization, fetch all repositories under that org
      - For each repository, query `/repos/{owner}/{repo}/actions/runs?status=queued`
      - Additionally, fetch all user-owned repositories and query their workflow runs
    - Org: Fetch all workflow runs in repos under the organization:
      - Fetch all repositories under the specified organization
      - For each repository, query `/repos/{owner}/{repo}/actions/runs?status=queued`
    - Repo: Directly query `/repos/{owner}/{repo}/actions/runs?status=queued`
  - Filter the returned runs:
    - Must match the `labels` defined in the `RunnerGroup` CR.
- Spawn Runner:
  - If a queued job is found and capacity allows, create a Kubernetes `Job`.
  - One Job per Queued Workflow: Ideally, the logic should map 1 queued run -> 1 Runner Job.
  - Concurrency Control: Ensure we don't spawn more jobs than `maxActiveRunners - currentActiveRunners`.
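The capacity check, label filter, and concurrency cap described above can be combined into one decision function. The sketch below is illustrative only. The `QueuedRun` type and its `RunsOn` field are hypothetical (the spec does not say how the labels requested by a run are obtained), and "match" is assumed to mean that every label a run requests is offered by the `RunnerGroup`.

```go
package main

import "fmt"

// QueuedRun is a hypothetical, simplified view of a queued workflow run.
type QueuedRun struct {
	ID     int64
	RunsOn []string // labels the run asks for (assumed to be known to the controller)
}

// matchesLabels reports whether every requested label is offered by the group.
// A run that requests no labels matches any group.
func matchesLabels(requested, offered []string) bool {
	set := make(map[string]struct{}, len(offered))
	for _, l := range offered {
		set[l] = struct{}{}
	}
	for _, l := range requested {
		if _, ok := set[l]; !ok {
			return false
		}
	}
	return true
}

// runsToSpawn picks the queued runs that should each get one runner Job.
// It never returns more than maxActive - active runs.
func runsToSpawn(queued []QueuedRun, groupLabels []string, active, maxActive int) []QueuedRun {
	capacity := maxActive - active
	if capacity <= 0 {
		return nil // at or over the limit: do not spawn new runners
	}
	var picked []QueuedRun
	for _, r := range queued {
		if len(picked) == capacity {
			break
		}
		if matchesLabels(r.RunsOn, groupLabels) {
			picked = append(picked, r)
		}
	}
	return picked
}

func main() {
	queued := []QueuedRun{
		{ID: 1, RunsOn: []string{"ubuntu-latest"}},
		{ID: 2, RunsOn: []string{"windows"}},
		{ID: 3, RunsOn: []string{"ubuntu-latest"}},
	}
	labels := []string{"ubuntu-latest", "app:infra"}
	for _, r := range runsToSpawn(queued, labels, 1, 3) {
		fmt.Println("spawn runner Job for queued run", r.ID)
	}
}
```

Returning the selected runs, rather than only a count, keeps the 1 queued run -> 1 runner Job mapping explicit for the caller that creates the Jobs.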
5. Kubernetes Resource Generation
5.1 Job Specification
The controller creates a batch/v1 Job.
Metadata:
- `name`: `{runnergroup-cr-name}-{random-suffix}`
- `namespace`: Same as `RunnerGroup` CR.
- `labels`:
  - `app`: `{runnergroup-cr-name}`
  - `gitea.bpg.pw/managed-by`: `gitea-runner-operator`
  - `gitea.bpg.pw/runnergroup-name`: `{runnergroup-cr-name}`
- `ownerReferences`: Pointing to the `RunnerGroup` CR.

Spec:
- `ttlSecondsAfterFinished`: 600 (Clean up finished jobs).
- `template`:
  - `spec`:
    - `restartPolicy`: `OnFailure`
    - `containers`:
      - Name: `runner`
      - Image: `gitea/act_runner:nightly-dind-rootless` (Default, potentially configurable in CR later).
      - SecurityContext: `privileged: true` (Required for DIND).
      - Env:
        - `GITEA_INSTANCE_URL`: From `spec.gitea.url`.
        - `GITEA_RUNNER_REGISTRATION_TOKEN`: From `spec.registrationToken`.
        - `GITEA_RUNNER_EPHEMERAL`: `"true"`.
        - `GITEA_RUNNER_LABELS`: Comma-separated list from `spec.labels`.
        - `DOCKER_HOST`: `tcp://localhost:2376`
      - VolumeMounts:
        - Mount docker socket or storage if necessary. The README example uses a PVC `act-runner-vol` mounted to `/data`. Note: Using a shared PVC for ephemeral runners might cause race conditions. EmptyDir is preferred for truly ephemeral runners unless caching is strictly required and managed.
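To make the Job layout above concrete, the following sketch assembles the manifest as plain Go maps and prints it as JSON. It is an illustration under stated assumptions, not the operator's code: a real controller would more likely use the typed `batch/v1` structs from the Kubernetes client libraries, and the helper names (`randomSuffix`, `runnerEnv`, `buildJob`), the suffix length, and the example names are hypothetical. `ownerReferences` and volume mounts are omitted because they depend on the live CR and on the storage choice.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"strings"
)

// randomSuffix returns n random bytes, hex-encoded, for the Job name.
func randomSuffix(n int) (string, error) {
	b := make([]byte, n)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

// runnerEnv builds the container env list from section 5.1. The registration
// token is referenced through secretKeyRef rather than copied into the Job.
func runnerEnv(giteaURL string, labels []string, secretName, secretKey string) []map[string]any {
	return []map[string]any{
		{"name": "GITEA_INSTANCE_URL", "value": giteaURL},
		{"name": "GITEA_RUNNER_REGISTRATION_TOKEN", "valueFrom": map[string]any{
			"secretKeyRef": map[string]any{"name": secretName, "key": secretKey},
		}},
		{"name": "GITEA_RUNNER_EPHEMERAL", "value": "true"},
		{"name": "GITEA_RUNNER_LABELS", "value": strings.Join(labels, ",")},
		{"name": "DOCKER_HOST", "value": "tcp://localhost:2376"},
	}
}

// buildJob assembles a batch/v1 Job manifest for one ephemeral runner.
func buildJob(group, namespace, suffix string, env []map[string]any) map[string]any {
	return map[string]any{
		"apiVersion": "batch/v1",
		"kind":       "Job",
		"metadata": map[string]any{
			"name":      group + "-" + suffix,
			"namespace": namespace,
			"labels": map[string]string{
				"app":                           group,
				"gitea.bpg.pw/managed-by":       "gitea-runner-operator",
				"gitea.bpg.pw/runnergroup-name": group,
			},
		},
		"spec": map[string]any{
			"ttlSecondsAfterFinished": 600,
			"template": map[string]any{"spec": map[string]any{
				"restartPolicy": "OnFailure",
				"containers": []map[string]any{{
					"name":            "runner",
					"image":           "gitea/act_runner:nightly-dind-rootless",
					"securityContext": map[string]any{"privileged": true},
					"env":             env,
				}},
			}},
		},
	}
}

func main() {
	suffix, err := randomSuffix(3)
	if err != nil {
		panic(err)
	}
	env := runnerEnv("https://gitea.example.com",
		[]string{"ubuntu-latest", "app:infra"}, "runner-secrets", "registration-token")
	out, err := json.MarshalIndent(buildJob("ci-runners", "gitea-runners", suffix, env), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Referencing the Secret through `secretKeyRef` means the token value never appears in the Job object itself, which is in line with the token handling described in the security section.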
6. Gitea API Interaction
- Authentication: Bearer token provided in `authToken`.
- Client: HTTP Client with timeout.
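A minimal sketch of such a client, using only the Go standard library, might look as follows. Several details are assumptions and should be checked against the target Gitea version: the `/api/v1` path prefix in front of the endpoint from section 4.2, the `workflow_runs` field in the response body, the 10-second timeout, and the helper names. The stub server in `main` stands in for a real Gitea instance so the example runs offline.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
	"strings"
	"time"
)

// queuedRunsURL builds the per-repository endpoint from section 4.2.
// The /api/v1 prefix is an assumption about the Gitea API base path.
func queuedRunsURL(baseURL, owner, repo string) string {
	return fmt.Sprintf("%s/api/v1/repos/%s/%s/actions/runs?status=queued",
		strings.TrimRight(baseURL, "/"), url.PathEscape(owner), url.PathEscape(repo))
}

// newQueuedRunsRequest attaches the API token as a Bearer token.
func newQueuedRunsRequest(baseURL, owner, repo, token string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, queuedRunsURL(baseURL, owner, repo), nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	req.Header.Set("Accept", "application/json")
	return req, nil
}

// runList is an assumed response shape, reduced to what the sketch needs.
type runList struct {
	WorkflowRuns []struct {
		ID int64 `json:"id"`
	} `json:"workflow_runs"`
}

// countQueuedRuns sends the request and counts the runs in the response.
func countQueuedRuns(c *http.Client, req *http.Request) (int, error) {
	resp, err := c.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return 0, fmt.Errorf("unexpected status %s", resp.Status)
	}
	var list runList
	if err := json.NewDecoder(resp.Body).Decode(&list); err != nil {
		return 0, err
	}
	return len(list.WorkflowRuns), nil
}

func main() {
	// Stub server standing in for Gitea; it only checks the Authorization header.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Authorization") != "Bearer example-token" {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		fmt.Fprint(w, `{"workflow_runs":[{"id":1},{"id":2}]}`)
	}))
	defer srv.Close()

	client := &http.Client{Timeout: 10 * time.Second} // bound every call
	req, err := newQueuedRunsRequest(srv.URL, "acme", "infra", "example-token")
	if err != nil {
		panic(err)
	}
	n, err := countQueuedRuns(client, req)
	if err != nil {
		panic(err)
	}
	fmt.Println("queued runs:", n)
}
```

Setting `Timeout` on the client bounds the whole request, including reading the body, so a slow Gitea instance cannot stall a poll cycle indefinitely.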
7. Security Considerations
- Token Handling: Registration and Auth tokens are read from Kubernetes Secrets and injected as Environment Variables. They are not stored in plain text in the CR.
- Privileged Mode: The default `act_runner` image (dind) requires privileged mode. The Operator creates Jobs with this permission.
- Namespace Isolation: The Operator should respect RBAC and only operate within allowed namespaces.