⚙️ Pipeline Stages

The Energy Pipeline is organized into three types of stages, each responsible for a part of the commit processing workflow.

Each stage reads from and writes to a shared context dictionary, which carries state and data from one stage to the next.


🧱 Stage Categories

| Stage Type      | Runs When?                              | Purpose                                       |
|-----------------|-----------------------------------------|-----------------------------------------------|
| Pre-Stages      | Once per batch (before any commits)     | Verify system setup before running any commit |
| Pre-Test Stages | Once per unique commit (in parallel)    | Prepare the environment and build the commit  |
| Batch Stages    | For every run of every commit (N times) | Run test and measure energy consumption       |


📦 Shared Context

Each stage receives a context object:

context: dict[str, Any]

Common fields in context:

  • commit: git.Commit object (or commit hash string in subprocesses)

  • repo_path: path to local Git repo

  • build_failed: bool flag set if a build fails

  • abort_pipeline: bool flag to stop processing
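The fields above can be written down as a typed dictionary. This is a minimal sketch: `PipelineContext` and `make_context` are hypothetical names for illustration, not part of the pipeline, which only requires a plain `dict[str, Any]`.

```python
from typing import Any, TypedDict


class PipelineContext(TypedDict, total=False):
    """Hypothetical typed view of the shared context (illustration only)."""
    commit: Any           # git.Commit, or a hash string in subprocesses
    repo_path: str        # path to the local Git repo
    build_failed: bool    # set when a build fails
    abort_pipeline: bool  # set to stop processing


def make_context(commit: Any, repo_path: str) -> PipelineContext:
    """Build a fresh context with both flags cleared."""
    return {
        "commit": commit,
        "repo_path": repo_path,
        "build_failed": False,
        "abort_pipeline": False,
    }


ctx = make_context("abc1234", "/tmp/repo")
ctx["build_failed"] = True  # a stage records a failed build
```

Starting with both flags explicitly set to `False` lets later stages read them without `.get()` defaults.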


🧩 Stage Interface

All stages implement the same interface via an abstract base class:

from abc import ABC, abstractmethod
from typing import Any


class PipelineStage(ABC):
    @abstractmethod
    def run(self, context: dict[str, Any]) -> None:
        ...

Pre-Stages (run once per batch)

These safety checks run before any commit is processed.

✅ VerifyPerfStage

  • Verifies perf access permissions.

  • Logs system state and capabilities.

  • Aborts the pipeline if setup is incorrect.
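One way such a check could look is sketched below. It reads the Linux `perf_event_paranoid` setting and sets `abort_pipeline` when the level is too restrictive. The function name `check_perf_access` and the `max_level` threshold are assumptions for illustration; the checks the real stage performs are not described here.

```python
from pathlib import Path
from typing import Any

# Real Linux sysctl file; the threshold used below is an assumption for this sketch.
PARANOID_PATH = Path("/proc/sys/kernel/perf_event_paranoid")


def check_perf_access(context: dict[str, Any],
                      path: Path = PARANOID_PATH,
                      max_level: int = 0) -> None:
    """Set abort_pipeline when the paranoid level is too high or cannot be read."""
    try:
        level = int(path.read_text().strip())
    except (OSError, ValueError):
        # Missing file or unexpected content: treat the setup as incorrect.
        context["abort_pipeline"] = True
        return
    if level > max_level:
        context["abort_pipeline"] = True
```

Passing `path` as a parameter keeps the check testable on machines without perf.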


πŸ—οΈ Pre-Test Stages (run once per commit, in parallel)ΒΆ

These prepare each commit for measurement. They run concurrently to speed up processing.

CopyDirectoryStage

  • Copies the repository to a fresh directory for this commit.

  • Keeps this commit's working tree isolated from other commits and batches.
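The copy step can be sketched with the standard library. The helper name `copy_repo_for_commit`, the `work_root` parameter, and the `commit_dir` context key are hypothetical; the sketch also assumes `str(context["commit"])` yields the commit hash.

```python
import shutil
from pathlib import Path
from typing import Any


def copy_repo_for_commit(context: dict[str, Any], work_root: Path) -> Path:
    """Copy context['repo_path'] into work_root/<commit>, replacing any stale copy."""
    target = work_root / str(context["commit"])
    if target.exists():
        shutil.rmtree(target)  # start from a clean directory
    shutil.copytree(context["repo_path"], target, symlinks=True)
    context["commit_dir"] = str(target)  # hypothetical key, not from the docs
    return target
```

Removing an existing target first means a rerun never builds on leftovers from an earlier attempt.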

🏷 SetDirectoryStage

  • Sets the working directory in context to the correct copied repo path.

🌲 CheckoutStage

  • Checks out the specified commit in the local copy of the repo.

☕ JavaSetupStage

  • Detects Java version from Maven pom.xml.

  • Sets environment variables to match that version.
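Version detection could work roughly as follows. This sketch looks up a few commonly used Maven properties in the `<properties>` block. The function name and the property list are assumptions, and the real stage may read the version from elsewhere in the POM (for example, the compiler plugin configuration). Mapping the detected version to environment variables depends on where JDKs are installed, so it is left out.

```python
import xml.etree.ElementTree as ET
from typing import Optional

MAVEN_NS = {"m": "http://maven.apache.org/POM/4.0.0"}
# Properties commonly used to declare the Java level; the real stage may check others.
VERSION_PROPERTIES = ("maven.compiler.release", "maven.compiler.source", "java.version")


def detect_java_version(pom_xml: str) -> Optional[str]:
    """Return the first Java version property found in the POM text, or None."""
    root = ET.fromstring(pom_xml)
    props = root.find("m:properties", MAVEN_NS)
    if props is None:
        return None
    for name in VERSION_PROPERTIES:
        node = props.find(f"m:{name}", MAVEN_NS)
        if node is not None and node.text:
            return node.text.strip()
    return None
```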

🔨 BuildStage

  • Builds the project using the test command.

  • Marks the commit as build_failed in context if it fails.
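The failure handling might be sketched as below. `run_build` and its parameters are hypothetical; the command itself comes from the pipeline configuration and is not shown here.

```python
import subprocess
from typing import Any, Sequence


def run_build(context: dict[str, Any], command: Sequence[str], cwd: str) -> None:
    """Run the build command in cwd and record a failure in the context."""
    try:
        result = subprocess.run(command, cwd=cwd, capture_output=True, text=True)
    except OSError:
        # Command not found or not executable: treat it as a failed build.
        context["build_failed"] = True
        return
    if result.returncode != 0:
        context["build_failed"] = True
```

The sketch only sets the flag on failure and leaves the context untouched on success, so a later stage cannot accidentally clear an earlier failure.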


Batch Stages (run for each execution of each commit)

These stages run num_runs * num_repeats times per commit (the "N times" in the stage table and the flow summary).

🌑️ TemperatureCheckStage¢

  • Monitors CPU temperature.

  • Waits, or aborts, if the CPU is too hot (heat adds noise to energy readings).
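A wait-then-abort loop could look like this sketch. The sysfs path is the standard Linux thermal zone file, but the zone index varies by machine. The function names, the 60 °C limit, and the retry settings are all assumptions for illustration.

```python
import time
from pathlib import Path
from typing import Any, Callable

# Standard Linux thermal sysfs file (millidegrees Celsius); the zone index varies by machine.
THERMAL_PATH = Path("/sys/class/thermal/thermal_zone0/temp")


def read_cpu_temp_c(path: Path = THERMAL_PATH) -> float:
    """Read the temperature in degrees Celsius."""
    return int(path.read_text().strip()) / 1000.0


def wait_until_cool(context: dict[str, Any],
                    read_temp: Callable[[], float] = read_cpu_temp_c,
                    max_temp_c: float = 60.0,
                    retries: int = 10,
                    delay_s: float = 5.0) -> None:
    """Poll until the CPU is at or below max_temp_c; abort after `retries` attempts."""
    for _ in range(retries):
        if read_temp() <= max_temp_c:
            return
        time.sleep(delay_s)
    context["abort_pipeline"] = True  # still too hot after all retries
```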

SetDirectoryStage (again)

  • Sets the working directory again as a safeguard for parallel runs.

☕ JavaSetupStage (again)

  • Re-applies the Java settings to ensure a consistent environment.

⚡ MeasureEnergyStage

  • Runs the test command.

  • Collects energy metrics from RAPL (e.g., energy-pkg, energy-core, energy-gpu).

  • Saves them to a result file with the commit hash.
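Collecting the metrics might involve parsing machine-readable `perf stat` output. The sketch below assumes the test command was wrapped in `perf stat -x,` with RAPL events and that each output row starts with value, unit, and event name. The function names and the result-line layout are hypothetical, and the format the real stage writes is not shown here.

```python
import csv
import io


def parse_perf_energy(csv_text: str) -> dict[str, float]:
    """Map event name to Joules from `perf stat -x,` style output (assumed field order)."""
    energy: dict[str, float] = {}
    for row in csv.reader(io.StringIO(csv_text)):
        # Expected fields: value, unit, event, ...; skip comments and malformed rows.
        if len(row) < 3 or row[1] != "Joules":
            continue
        try:
            energy[row[2]] = float(row[0])
        except ValueError:
            continue  # e.g. "<not supported>" in the value column
    return energy


def result_line(commit_hash: str, energy: dict[str, float]) -> str:
    """Format one result line: commit hash, then the readings sorted by event name."""
    values = ",".join(f"{energy[name]:.2f}" for name in sorted(energy))
    return f"{commit_hash},{values}"
```

Skipping rows whose value is not numeric keeps one unsupported counter (such as a GPU domain missing on the machine) from discarding the whole measurement.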

🧹 PostTestStage

  • Cleans up temporary files or resets settings if needed.


🧠 Execution Flow Summary

[Measure Command]
 └── Batches Commits (X batches of Y commits)
     └── For Each Batch:
         ├── Run Pre-Stages
         ├── For Each Unique Commit in Parallel:
         │   └── Pre-Test Stages (checkout, build, etc.)
         └── For Each Commit N times:
             └── Batch Stages (test, measure, etc.)
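The flow for a single batch can be sketched in code. This is an illustration under stated assumptions, not the pipeline's implementation: stages are modeled as plain callables, a thread pool stands in for whatever parallelism the pipeline uses, and `run_batch` is a hypothetical name.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable

Stage = Callable[[dict[str, Any]], None]  # stand-in for PipelineStage.run


def run_batch(commits: list[str], pre: list[Stage], pre_test: list[Stage],
              batch: list[Stage], n: int) -> list[dict[str, Any]]:
    """Pre-stages once, pre-test stages per unique commit, batch stages n times each."""
    shared: dict[str, Any] = {}
    for stage in pre:
        stage(shared)
    if shared.get("abort_pipeline"):
        return []

    def prepare(commit: str) -> dict[str, Any]:
        ctx: dict[str, Any] = {"commit": commit}
        for stage in pre_test:
            stage(ctx)
        return ctx

    unique = list(dict.fromkeys(commits))  # de-duplicate while keeping order
    with ThreadPoolExecutor() as pool:
        prepared = list(pool.map(prepare, unique))

    for ctx in prepared:
        if ctx.get("build_failed"):
            continue  # skip commits that did not build
        for _ in range(n):
            for stage in batch:
                stage(ctx)
    return prepared
```

The batch stages run sequentially here on purpose: measuring one commit at a time avoids commits competing for the CPU during energy readings.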

⚠️ Stage Aborts

Any stage can stop the pipeline by setting:

context["abort_pipeline"] = True

Or a stage can mark the commit to be skipped because its build failed:

context["build_failed"] = True
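A runner could honor both flags by checking them after every stage. The helper below is a hypothetical sketch: `run_stages` is not a pipeline function, and stages are modeled as plain callables.

```python
from typing import Any, Callable, Iterable

Stage = Callable[[dict[str, Any]], None]  # stand-in for PipelineStage.run


def run_stages(stages: Iterable[Stage], context: dict[str, Any]) -> bool:
    """Run stages in order; return False as soon as a flag asks to stop."""
    for stage in stages:
        stage(context)
        if context.get("abort_pipeline") or context.get("build_failed"):
            return False  # remaining stages are not run
    return True
```

The caller can then inspect the context to tell a full abort from a skipped commit.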

✅ Adding New Stages

  1. Create a new class implementing PipelineStage.

  2. Add it to one of these lists in main.py:

    pre_stages = [...]
    pre_test_stages = [...]
    batch_stages = [...]
    
  3. The pipeline will pick it up automatically.


💡 Example: Writing a Custom Stage

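from typing import Any

# PipelineStage is the abstract base class shown under "Stage Interface".
class LogCommitStage(PipelineStage):
    def run(self, context: dict[str, Any]) -> None:
        commit = context.get("commit")  # None if no commit has been set yet
        print(f"Processing commit: {commit}")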

Then add it to batch_stages:

batch_stages = [
    TemperatureCheckStage(),
    LogCommitStage(),
    MeasureEnergyStage(),
]