# Gitea Runner Operator Implementation Guide

This document outlines the technical implementation details for the Gitea Runner Operator. It is intended for developers building the operator using Go and the Operator SDK.

## 1. Development Environment & Tools

- **Language**: Go (1.21+)
- **Framework**: Operator SDK (v1.33+) with Controller Runtime
- **Kubernetes API**: v1.28+
- **Container Runtime**: Docker or Podman

## 2. Project Initialization

Initialize the project using the Operator SDK:

```bash
operator-sdk init --domain bpg.pw --repo github.com/bapung/gitea-runner-operator
operator-sdk create api --group gitea --version v1alpha1 --kind RunnerGroup --resource --controller
```

## 3. API Definition (`api/v1alpha1/runnergroup_types.go`)

Define the `RunnerGroup` Custom Resource Definition (CRD) in Go structs.

### 3.1 RunnerGroupSpec

```go
import (
	corev1 "k8s.io/api/core/v1"
)

// RunnerGroupScope is the Gitea level at which runners are registered.
type RunnerGroupScope string

const (
	RunnerGroupScopeGlobal RunnerGroupScope = "global"
	RunnerGroupScopeOrg    RunnerGroupScope = "org"
	RunnerGroupScopeUser   RunnerGroupScope = "user"
	RunnerGroupScopeRepo   RunnerGroupScope = "repo"
)

type RunnerGroupSpec struct {
	// Scope defines the scope of the runner (global, org, user, repo)
	// +kubebuilder:validation:Enum=global;org;user;repo
	Scope RunnerGroupScope `json:"scope"`

	// Org is required if scope is 'org'
	// +optional
	Org string `json:"org,omitempty"`

	// User is required if scope is 'user'
	// +optional
	User string `json:"user,omitempty"`

	// Repo is required if scope is 'repo'
	// +optional
	Repo string `json:"repo,omitempty"`

	// GiteaURL is the base URL of the Gitea instance
	GiteaURL string `json:"giteaURL"`

	// Labels to assign to the runner.
	// Defaults (e.g. ubuntu-latest) are merged automatically by the controller.
	// +optional
	Labels []string `json:"labels,omitempty"`

	// MaxActiveRunners is the maximum number of concurrent jobs
	// +kubebuilder:validation:Minimum=1
	MaxActiveRunners int `json:"maxActiveRunners"`

	// RegistrationTokenRef references the secret containing the runner registration token
	RegistrationTokenRef corev1.SecretKeySelector `json:"registrationToken"`

	// AuthTokenRef references the secret containing the Gitea API token for polling
	AuthTokenRef corev1.SecretKeySelector `json:"authToken"`
}
```
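
The field comments state that `Org`, `User`, and `Repo` are each required for the matching scope, but the kubebuilder markers shown only enforce the enum. One way to check the conditional requirement in Go is sketched below. This is a minimal illustration, not the operator's actual code: `scopeFields` and `validateScope` are hypothetical names, and `scopeFields` is a trimmed-down stand-in for `RunnerGroupSpec` so the example compiles without Kubernetes dependencies.

```go
package main

import (
	"errors"
	"fmt"
)

type RunnerGroupScope string

const (
	RunnerGroupScopeGlobal RunnerGroupScope = "global"
	RunnerGroupScopeOrg    RunnerGroupScope = "org"
	RunnerGroupScopeUser   RunnerGroupScope = "user"
	RunnerGroupScopeRepo   RunnerGroupScope = "repo"
)

// scopeFields is a hypothetical stand-in for the scope-related
// fields of RunnerGroupSpec.
type scopeFields struct {
	Scope RunnerGroupScope
	Org   string
	User  string
	Repo  string
}

// validateScope (hypothetical helper) returns an error when the field
// required by the chosen scope is empty or the scope is unknown.
func validateScope(s scopeFields) error {
	switch s.Scope {
	case RunnerGroupScopeGlobal:
		return nil
	case RunnerGroupScopeOrg:
		if s.Org == "" {
			return errors.New("org must be set when scope is 'org'")
		}
	case RunnerGroupScopeUser:
		if s.User == "" {
			return errors.New("user must be set when scope is 'user'")
		}
	case RunnerGroupScopeRepo:
		if s.Repo == "" {
			return errors.New("repo must be set when scope is 'repo'")
		}
	default:
		return fmt.Errorf("unknown scope %q", s.Scope)
	}
	return nil
}

func main() {
	// Missing Org: reports an error.
	fmt.Println(validateScope(scopeFields{Scope: RunnerGroupScopeOrg}))
	// Repo scope with Repo set: no error.
	fmt.Println(validateScope(scopeFields{Scope: RunnerGroupScopeRepo, Repo: "example-repo"}))
}
```

A check like this could run at the start of `Reconcile` or in an admission webhook; which of the two fits better depends on how early invalid resources should be rejected.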

### 3.2 RunnerGroupStatus

```go
type RunnerGroupStatus struct {
	// ActiveRunners is the current number of running jobs
	ActiveRunners int `json:"activeRunners"`

	// LastCheckTime is the timestamp of the last poll to Gitea
	LastCheckTime *metav1.Time `json:"lastCheckTime,omitempty"`
}
```

## 4. Controller Implementation (`internal/controller/runnergroup_controller.go`)

The controller handles the reconciliation loop and manages the lifecycle of ephemeral runners.

### 4.1 Struct Definition

The reconciler includes a thread-safe map to cache spawned jobs and prevent duplicate scheduling.

```go
type RunnerGroupReconciler struct {
	client.Client
	Scheme           *runtime.Scheme
	GiteaClient      gitea.Client
	SpawnedJobsCache sync.Map // Maps int64 job ID -> time.Time spawn time
}
```

### 4.2 Reconcile Logic

The `Reconcile` function follows this flow:

1. **Fetch RunnerGroup**: Get the `RunnerGroup` CR instance.
2. **List Jobs**: List all `batchv1.Job` resources owned by this CR to calculate `activeRunners`.
3. **Update Status**: Update `status.activeRunners`.
4. **Capacity Check**: Stop scaling if `activeRunners >= spec.maxActiveRunners`.
5. **Label Calculation**: Call `getEffectiveLabels` to merge `spec.labels` with hardcoded Gitea defaults (e.g., `ubuntu-latest:docker://node:16-bullseye`).
6. **Poll Gitea**:
   - Retrieve the auth token.
   - Call `GiteaClient.GetRunnerStats` with the effective labels.
   - This returns a list of `QueuedJobs`.
7. **Scale Up & Deduplication**:
   - Iterate through `stats.QueuedJobs`.
   - **Check Cache**: If the Job ID exists in `SpawnedJobsCache`:
     - If the entry is still within its TTL (less than 5 minutes old): **Skip** (already handled).
     - If the TTL has expired: **Retry** (assume the previous runner failed).
   - If the Job ID is not in the cache, or its entry has expired:
     - Check `availableSlots`.
     - Retrieve the registration token (if not yet fetched).
     - **Spawn Job**: Create a `batchv1.Job`.
     - **Update Cache**: Store the Job ID in `SpawnedJobsCache`.
     - Decrement `availableSlots`.
8. **Cache Cleanup**: Remove IDs from `SpawnedJobsCache` if they are not present in the latest `QueuedJobs` list from Gitea.
9. **Requeue**: Return `ctrl.Result{RequeueAfter: 10 * time.Second}`.
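
Steps 7 and 8 can be sketched in isolation as follows. This is a simplified illustration under stated assumptions, not the controller itself: queued jobs are reduced to their `int64` IDs, the `batchv1.Job` creation is replaced by a comment, and `shouldSpawn`, `pruneCache`, `scaleUp`, and `runDemo` are hypothetical names. The caller is assumed to pass in `availableSlots` already computed.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// spawnTTL mirrors the 5-minute TTL described in step 7.
const spawnTTL = 5 * time.Minute

// shouldSpawn reports whether a runner should be spawned for jobID:
// true when the ID is not cached or its cache entry is older than the TTL.
func shouldSpawn(cache *sync.Map, jobID int64, now time.Time) bool {
	if v, ok := cache.Load(jobID); ok {
		if spawnedAt, ok := v.(time.Time); ok && now.Sub(spawnedAt) < spawnTTL {
			return false // handled recently, skip
		}
	}
	return true
}

// pruneCache removes cached IDs that are absent from the latest queue.
func pruneCache(cache *sync.Map, queued []int64) {
	stillQueued := make(map[int64]struct{}, len(queued))
	for _, id := range queued {
		stillQueued[id] = struct{}{}
	}
	cache.Range(func(key, _ any) bool {
		if id, ok := key.(int64); ok {
			if _, found := stillQueued[id]; !found {
				cache.Delete(key)
			}
		}
		return true
	})
}

// scaleUp returns the job IDs for which a runner would be spawned.
func scaleUp(cache *sync.Map, queued []int64, availableSlots int, now time.Time) []int64 {
	var spawned []int64
	for _, id := range queued {
		if availableSlots <= 0 {
			break
		}
		if !shouldSpawn(cache, id, now) {
			continue
		}
		// The real controller would create the batchv1.Job here.
		cache.Store(id, now)
		spawned = append(spawned, id)
		availableSlots--
	}
	pruneCache(cache, queued)
	return spawned
}

// runDemo replays three reconcile passes against one cache.
func runDemo() [][]int64 {
	var cache sync.Map
	t0 := time.Unix(0, 0)
	return [][]int64{
		scaleUp(&cache, []int64{1, 2, 3}, 2, t0),                  // two slots: spawns 1 and 2
		scaleUp(&cache, []int64{1, 2, 3}, 2, t0.Add(time.Minute)), // 1 and 2 cached: spawns 3
		scaleUp(&cache, []int64{1}, 2, t0.Add(6*time.Minute)),     // TTL expired: retries 1
	}
}

func main() {
	for _, pass := range runDemo() {
		fmt.Println(pass)
	}
}
```

Because the cache lives in the reconciler's memory, it is lost when the operator restarts; the TTL then only bounds how long a failed spawn blocks a retry within one process lifetime.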

### 4.3 Helper Functions

#### getEffectiveLabels

Merges user-defined labels with Gitea defaults. If a user defines `ubuntu-latest`, it overrides the default `ubuntu-latest:docker://...`.
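
A possible shape for this merge is sketched below. It assumes that a label's name is the part before the first `:` and that a user label overrides a default with the same name; the `defaultLabels` list contains only the one default mentioned in this guide, and `labelName` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// defaultLabels is illustrative; only this entry is taken from the guide.
var defaultLabels = []string{"ubuntu-latest:docker://node:16-bullseye"}

// labelName returns the part of a label before the first ':'.
func labelName(label string) string {
	name, _, _ := strings.Cut(label, ":")
	return name
}

// getEffectiveLabels keeps user labels first (dropping repeated names),
// then appends every default whose name the user did not define.
func getEffectiveLabels(userLabels []string) []string {
	seen := make(map[string]struct{}, len(userLabels))
	effective := make([]string, 0, len(userLabels)+len(defaultLabels))
	for _, l := range userLabels {
		name := labelName(l)
		if _, dup := seen[name]; dup {
			continue
		}
		seen[name] = struct{}{}
		effective = append(effective, l)
	}
	for _, d := range defaultLabels {
		if _, overridden := seen[labelName(d)]; overridden {
			continue
		}
		effective = append(effective, d)
	}
	return effective
}

func main() {
	// No override: the default is appended.
	fmt.Println(getEffectiveLabels([]string{"linux"}))
	// User-defined ubuntu-latest replaces the default.
	fmt.Println(getEffectiveLabels([]string{"ubuntu-latest:host"}))
}
```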

#### constructJobForRunnerGroup

Creates the Job object with:

- **Name**: `{runnergroup-name}-{random-suffix}`
- **Env**:
  - `GITEA_RUNNER_NAME`: Set to the Job name.
  - `GITEA_RUNNER_LABELS`: Comma-separated effective labels.
  - Standard runner envs (`GITEA_INSTANCE_URL`, etc).
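
The naming and environment rules above might look roughly like this. The sketch uses only the standard library, so `envVar` stands in for `corev1.EnvVar`, and the six-character hex suffix is an assumption (the guide does not specify the suffix format). `randomSuffix`, `runnerJobName`, and `runnerEnv` are hypothetical names.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"strings"
)

// envVar is a stand-in for corev1.EnvVar.
type envVar struct {
	Name  string
	Value string
}

// randomSuffix returns six lowercase hex characters (assumed format).
func randomSuffix() (string, error) {
	b := make([]byte, 3)
	if _, err := rand.Read(b); err != nil {
		return "", fmt.Errorf("generate suffix: %w", err)
	}
	return hex.EncodeToString(b), nil
}

// runnerJobName builds {runnergroup-name}-{random-suffix}.
func runnerJobName(groupName string) (string, error) {
	suffix, err := randomSuffix()
	if err != nil {
		return "", err
	}
	return groupName + "-" + suffix, nil
}

// runnerEnv builds the environment variables listed in this section.
func runnerEnv(jobName, giteaURL string, labels []string) []envVar {
	return []envVar{
		{Name: "GITEA_RUNNER_NAME", Value: jobName},
		{Name: "GITEA_RUNNER_LABELS", Value: strings.Join(labels, ",")},
		{Name: "GITEA_INSTANCE_URL", Value: giteaURL},
	}
}

func main() {
	name, err := runnerJobName("ci-runners")
	if err != nil {
		panic(err)
	}
	labels := []string{"linux", "ubuntu-latest:docker://node:16-bullseye"}
	for _, e := range runnerEnv(name, "https://gitea.example.com", labels) {
		fmt.Printf("%s=%s\n", e.Name, e.Value)
	}
}
```

In the operator itself, the registration token would normally be injected from the referenced Secret rather than passed as a literal value.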

## 5. Gitea Client (`internal/gitea/client.go`)

A specialized client to interact with Gitea's Actions API.

### 5.1 Interface

```go
type RunnerStats struct {
	QueuedJobs []ActionWorkflowJob
	Running    int
}

type Client interface {
	GetRunnerStats(ctx context.Context, giteaURL, authToken string, scope RunnerGroupScope, org, repo string, labels []string) (*RunnerStats, error)
}
```

### 5.2 Logic

1. **Endpoints**:
   - Repo/Org/Global: Uses `/actions/jobs` endpoints.
   - User: Fetches repos via `/users/{user}/repos`, then queries `/actions/jobs` for each repo.
2. **Fetching**:
   - Fetches jobs with `status=queued`, `waiting`, `pending`.
   - Handles pagination (fetches all pages).
3. **Filtering**:
   - Iterates through fetched jobs.
   - **Matches Labels**: Checks if the job's required labels are a subset of the runner's supported labels (effective labels).
     - Supports exact match (`linux` == `linux`)
     - Supports schema match (`ubuntu-latest` matches `ubuntu-latest:docker://...`)
   - Returns only matching jobs in `QueuedJobs`.
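
The label filter in step 3 can be illustrated with a small subset check. This is a sketch under assumptions: a runner label's name is taken to be the part before the first `:`, comparison is case-sensitive, and a job with no required labels matches any runner. `labelName` and `jobMatches` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// labelName returns the part of a label before the first ':'.
func labelName(label string) string {
	name, _, _ := strings.Cut(label, ":")
	return name
}

// jobMatches reports whether every label the job requires is offered by
// the runner. Runner labels are compared by name, so a required
// "ubuntu-latest" matches a runner label "ubuntu-latest:docker://...",
// and a plain "linux" matches "linux" exactly.
func jobMatches(required, runnerLabels []string) bool {
	offered := make(map[string]struct{}, len(runnerLabels))
	for _, l := range runnerLabels {
		offered[labelName(l)] = struct{}{}
	}
	for _, r := range required {
		if _, ok := offered[r]; !ok {
			return false
		}
	}
	return true
}

func main() {
	runner := []string{"linux", "ubuntu-latest:docker://node:16-bullseye"}
	fmt.Println(jobMatches([]string{"linux"}, runner))                  // exact match
	fmt.Println(jobMatches([]string{"ubuntu-latest", "linux"}, runner)) // schema match plus exact match
	fmt.Println(jobMatches([]string{"windows"}, runner))                // not offered by the runner
}
```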

## 6. Testing Strategy

1. **Unit Tests (`internal/gitea/client_test.go`)**:
   - Mock Gitea API server.
   - Verify `GetRunnerStats` correctly parses JSON and handles pagination.
   - Verify label matching logic (subset, schema matching).
2. **Controller Tests**:
   - Verify `SpawnedJobsCache` prevents double scheduling.
   - Verify TTL logic allows retries for stuck jobs.
   - Verify `getEffectiveLabels` merging logic.
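
As an illustration of the first unit-test item, the sketch below exercises a paginated fetch against an in-process mock. Several details are assumptions made for the example and are not confirmed by this guide: the `{"jobs": [...]}` response shape, the `page` query parameter, stopping at the first empty page, and the placeholder base URL. The mock is wired in through a custom `http.RoundTripper` so no socket is opened; `httptest.NewServer` would serve the same handler over a real listener. All type and function names here are hypothetical.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strconv"
)

type job struct {
	ID int64 `json:"id"`
}

// jobsPage is the assumed response shape for one page of jobs.
type jobsPage struct {
	Jobs []job `json:"jobs"`
}

// fetchAllJobs requests page=1,2,... until an empty page is returned.
func fetchAllJobs(ctx context.Context, hc *http.Client, baseURL string) ([]job, error) {
	var all []job
	for page := 1; ; page++ {
		url := fmt.Sprintf("%s/actions/jobs?status=queued&page=%d", baseURL, page)
		req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
		if err != nil {
			return nil, err
		}
		resp, err := hc.Do(req)
		if err != nil {
			return nil, err
		}
		var p jobsPage
		err = json.NewDecoder(resp.Body).Decode(&p)
		resp.Body.Close()
		if err != nil {
			return nil, err
		}
		if resp.StatusCode != http.StatusOK {
			return nil, fmt.Errorf("unexpected status %d", resp.StatusCode)
		}
		if len(p.Jobs) == 0 {
			return all, nil
		}
		all = append(all, p.Jobs...)
	}
}

// mockGitea serves the given job IDs two per page.
func mockGitea(ids []int64) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		page, _ := strconv.Atoi(r.URL.Query().Get("page"))
		if page < 1 {
			page = 1
		}
		const perPage = 2
		start := (page - 1) * perPage
		var p jobsPage
		for i := start; i < start+perPage && i < len(ids); i++ {
			p.Jobs = append(p.Jobs, job{ID: ids[i]})
		}
		_ = json.NewEncoder(w).Encode(p)
	})
}

// handlerTransport answers requests by calling the handler directly.
type handlerTransport struct{ h http.Handler }

func (t handlerTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	rec := httptest.NewRecorder()
	t.h.ServeHTTP(rec, req)
	return rec.Result(), nil
}

// fetchFromMock runs fetchAllJobs against the in-process mock.
func fetchFromMock(ids []int64) ([]job, error) {
	hc := &http.Client{Transport: handlerTransport{h: mockGitea(ids)}}
	return fetchAllJobs(context.Background(), hc, "http://gitea.example/api/v1")
}

func main() {
	jobs, err := fetchFromMock([]int64{1, 2, 3, 4, 5})
	if err != nil {
		panic(err)
	}
	fmt.Println(len(jobs), "jobs fetched across pages")
}
```

In a real `client_test.go`, the assertions would target `GetRunnerStats` itself rather than a standalone fetch loop.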