energytrackr.plot.builtin_data_transforms package

Submodules

energytrackr.plot.builtin_data_transforms.commit_details module

Fetch commit metadata for each valid commit.

class energytrackr.plot.builtin_data_transforms.commit_details.CommitDetails[source]

Bases: Transform

Fetch commit metadata (date, summary, files_modified, link) for each valid commit.

Stores a dict in ctx.artefacts['commit_details'] keyed by full hash.

apply(ctx: Context) -> None[source]

Fetch commit metadata for each valid commit.

Parameters:

ctx (Context) – The plotting context containing artefacts and statistics.
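
To illustrate the artefact described above, the following sketch shows one possible shape for the dictionary. The field names (date, summary, files_modified, link) come from the class description; the hash, the values, the URL, the value types, and the describe helper are hypothetical placeholders, not the library's actual output.

```python
# Hypothetical shape of ctx.artefacts["commit_details"]; every value is a placeholder.
commit_details = {
    "0123456789abcdef0123456789abcdef01234567": {
        "date": "2025-01-15",
        "summary": "Refactor measurement loop",
        "files_modified": ["src/measure.py", "tests/test_measure.py"],
        "link": "https://example.com/repo/commit/0123456789abcdef0123456789abcdef01234567",
    },
}


def describe(full_hash: str) -> str:
    """Format a one-line description of a commit, looked up by its full hash."""
    details = commit_details[full_hash]
    return f"{full_hash[:7]} ({details['date']}): {details['summary']}"
```

Keying by the full hash keeps lookups unambiguous even when short hashes are used for display.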

energytrackr.plot.builtin_data_transforms.commit_stats module

Commit statistics transform for energy data analysis.

class energytrackr.plot.builtin_data_transforms.commit_stats.CommitStats(**params: dict[str, Any])[source]

Bases: Transform, Configurable[CommitStatsConfig]

Compute commit statistics for a given column in the DataFrame.

Groups the DataFrame by commit, computes median, std, count, filters out commits with too few measurements, and writes:

  • ctx.stats["valid_commits"] -> list of commit hashes

  • ctx.stats["short_hashes"] -> list of 7-char hashes

  • ctx.stats["x_indices"] -> [0, 1, 2, …]

  • ctx.stats["medians"] -> list of median values

  • ctx.stats["y_errors"] -> list of std (NaN->0)

  • ctx.stats["df_median"] -> the merged DataFrame
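
The grouping and filtering described above could be sketched with pandas roughly as follows. This is not the library's implementation: the function name commit_stats, the "commit" column name, the use of >= for the measurement threshold, and the contents of df_median (here simply the filtered summary table) are all assumptions made for illustration.

```python
import pandas as pd


def commit_stats(df: pd.DataFrame, column: str, min_measurements: int) -> dict:
    """Per-commit summary statistics, shaped like the ctx.stats keys listed above."""
    # Median, standard deviation, and count per commit, in order of first appearance.
    grouped = (
        df.groupby("commit", sort=False)[column]  # "commit" column name is assumed
        .agg(["median", "std", "count"])
        .reset_index()
    )
    # Drop commits with too few measurements (threshold semantics assumed).
    valid = grouped[grouped["count"] >= min_measurements].reset_index(drop=True)
    return {
        "valid_commits": valid["commit"].tolist(),
        "short_hashes": [h[:7] for h in valid["commit"]],
        "x_indices": list(range(len(valid))),
        "medians": valid["median"].tolist(),
        # The std of a single measurement is NaN; replace it with 0 for error bars.
        "y_errors": valid["std"].fillna(0).tolist(),
        "df_median": valid,
    }
```

A commit with a single measurement has an undefined sample standard deviation, which is why the NaN->0 replacement matters when min_measurements is 1.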

apply(ctx: Context) -> None[source]

Apply the CommitStats transform to the context.

Computes statistics for the specified column in the DataFrame and updates the context with the results.

Parameters:

ctx (Context) – The context containing the DataFrame and other artefacts.

Raises:
class energytrackr.plot.builtin_data_transforms.commit_stats.CommitStatsConfig(column: str | None = None, min_measurements: int | None = None)[source]

Bases: object

Configuration for commit statistics transform.

column: str | None = None
min_measurements: int | None = None

energytrackr.plot.builtin_data_transforms.compute_distribution module

Compute the distribution of a given column for each commit.

class energytrackr.plot.builtin_data_transforms.compute_distribution.ComputeDistribution(**params: dict[str, Any])[source]

Bases: Transform, Configurable[ComputeDistributionConfig]

Compute the distribution of a given column for each commit.

Reads ctx.stats['valid_commits'] and ctx.artefacts['df'], then stores distributions and normality_flags in ctx.artefacts.

apply(ctx: Context) -> None[source]

Computes and stores the value distributions and normality flags for each commit.

For each commit in the provided context, extracts the values of the specified column, computes their distribution, and tests for normality using the Shapiro-Wilk test if the number of values meets the minimum threshold. The results are stored in the context’s artefacts under “distributions” and “normality_flags”.

Parameters:

ctx (Context) – The context object containing artefacts, statistics, and configuration.

Side Effects:
Updates ctx.artefacts with:
  • “distributions”: List of numpy arrays, each containing the values for a commit.

  • “normality_flags”: List of booleans indicating if the distribution is normal (True) or not (False) for each commit.
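
A minimal sketch of the procedure described above, using scipy.stats.shapiro. The function name distributions_and_flags and its list-of-lists input are hypothetical. How the library flags commits with fewer than min_values_for_normality values is not stated here, so this sketch assumes they are flagged False; it also assumes a distribution counts as normal when the p-value is strictly greater than normality_p.

```python
import numpy as np
from scipy.stats import shapiro


def distributions_and_flags(values_by_commit, min_values_for_normality=3, normality_p=0.05):
    """Return (distributions, normality_flags) for a list of per-commit value lists."""
    distributions = []
    normality_flags = []
    for values in values_by_commit:
        arr = np.asarray(values, dtype=float)
        distributions.append(arr)
        if len(arr) >= min_values_for_normality:
            # Shapiro-Wilk: a small p-value rejects the hypothesis of normality.
            _, p_value = shapiro(arr)
            normality_flags.append(bool(p_value > normality_p))
        else:
            # Assumption: too few values to test, so do not claim normality.
            normality_flags.append(False)
    return distributions, normality_flags
```

The flags are typically useful for choosing between parametric and non-parametric comparisons downstream.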

class energytrackr.plot.builtin_data_transforms.compute_distribution.ComputeDistributionConfig(column: str | None = None, min_values_for_normality: int = 3, normality_p: float = 0.05)[source]

Bases: object

Configuration for computing distributions of energy measurements.

column: str | None = None
min_values_for_normality: int = 3
normality_p: float = 0.05

energytrackr.plot.builtin_data_transforms.detect_changes module

Detect changes in distributions of energy data over time.

class energytrackr.plot.builtin_data_transforms.detect_changes.ChangeEvent(index: int, direction: str, p_value: float, effect_size: EffectSize, change_magnitude: ChangeMagnitude, context_tags: list[str] | None, level: int)[source]

Bases: object

Represents a change event with associated metadata.

change_magnitude: ChangeMagnitude
context_tags: list[str] | None
direction: str
effect_size: EffectSize
index: int
level: int
p_value: float
class energytrackr.plot.builtin_data_transforms.detect_changes.ChangeMagnitude(pct_change: float, pct_change_level: str, abs_diff: float, practical_level: str)[source]

Bases: object

Represents the magnitude of change between two values, including both percentage and absolute differences.

pct_change

The percentage change between two values.

Type:

float

pct_change_level

A qualitative description of the percentage change (e.g., ‘low’, ‘moderate’, ‘high’).

Type:

str

abs_diff

The absolute difference between two values.

Type:

float

practical_level

A qualitative assessment of the practical significance of the change.

Type:

str

abs_diff: float
pct_change: float
pct_change_level: str
practical_level: str
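
As an illustration of the two numeric fields, the sketch below derives a percentage change and an absolute difference from a baseline median and a test median. The Magnitude class and magnitude function are hypothetical stand-ins, and the exact formulas (percentage relative to the baseline, unsigned absolute difference) are assumptions rather than the library's confirmed definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Magnitude:
    """Hypothetical stand-in holding only the numeric fields of ChangeMagnitude."""

    pct_change: float
    abs_diff: float


def magnitude(baseline_median: float, test_median: float) -> Magnitude:
    """Compute the change of test_median relative to baseline_median."""
    if baseline_median == 0:
        raise ValueError("percentage change is undefined for a zero baseline")
    diff = test_median - baseline_median
    # Assumed conventions: signed percentage of the baseline, unsigned difference.
    return Magnitude(pct_change=diff / baseline_median * 100.0, abs_diff=abs(diff))
```

For example, moving from a baseline median of 200 to a test median of 230 gives a change of about 15 percent and an absolute difference of 30.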
class energytrackr.plot.builtin_data_transforms.detect_changes.DetectChanges(**params: dict[str, Any])[source]

Bases: Transform, Configurable[DetectChangesConfig]

Detect changes in distributions of energy data over time.

apply(ctx: Context) -> None[source]

Apply the DetectChanges transform to the context, using a helper to process each pair.

Parameters:

ctx (Context) – The context containing the DataFrame and other artefacts.

classify_effect_size(d: float) -> str[source]

Classify the effect size based on Cohen’s d value.

Parameters:

d (float) – The Cohen’s d effect size.

Returns:

The category of the effect size.

Return type:

str
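
A classification of this kind might look like the sketch below. The cut-offs 0.2, 0.5, and 0.8 are Cohen's conventional benchmarks, not values confirmed for this library (which accepts a thresholds option in DetectChangesConfig); the function name classify_d and the 'negligible' label are also assumptions.

```python
def classify_d(d: float) -> str:
    """Map a Cohen's d value to a qualitative label using conventional cut-offs."""
    size = abs(d)  # the sign only encodes direction, not strength
    if size < 0.2:
        return "negligible"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"
```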

classify_pct_change(pct: float) -> str[source]

Classify the percent change based on predefined thresholds.

Parameters:

pct (float) – The percent change value.

Returns:

The category of the percent change.

Return type:

str

classify_practical(abs_diff: float, baseline_median: float) -> str[source]

Classify the practical significance based on absolute difference and baseline median.

Parameters:
  • abs_diff (float) – The absolute difference between test and baseline medians.

  • baseline_median (float) – The median of the baseline sample.

Returns:

The category of practical significance.

Return type:

str

detect_level_5(ctx: Context, commit: str) -> bool[source]

Detect level 5 changes based on context tags.

Parameters:
  • ctx (Context) – The context containing the DataFrame and other artefacts.

  • commit (str) – The commit hash to check for level 5 changes.

Returns:

True if level 5 changes are detected, False otherwise.

Return type:

bool

class energytrackr.plot.builtin_data_transforms.detect_changes.DetectChangesConfig(column: str | None = None, thresholds: dict | None = None, tags: list[str] | None = None)[source]

Bases: object

Configuration for detecting changes in distributions of energy measurements.

column: str | None = None
tags: list[str] | None = None
thresholds: dict | None = None
class energytrackr.plot.builtin_data_transforms.detect_changes.EffectSize(cohen_d: float, category: str)[source]

Bases: object

Represents the effect size of a statistical comparison.

cohen_d

The calculated Cohen’s d value indicating the standardized difference between two means.

Type:

float

category

A qualitative description of the effect size (e.g., ‘small’, ‘medium’, ‘large’).

Type:

str

category: str
cohen_d: float
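
For reference, Cohen's d with a pooled standard deviation can be computed as in the sketch below. This is the textbook formula using only the standard library; whether the library pools variances in exactly this way is an assumption, and the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohen_d(baseline, test):
    """Standardized mean difference (test minus baseline) with pooled variance.

    Both samples need at least two values, because statistics.variance
    computes the sample variance.
    """
    n1, n2 = len(baseline), len(test)
    pooled_var = ((n1 - 1) * variance(baseline) + (n2 - 1) * variance(test)) / (n1 + n2 - 2)
    if pooled_var == 0:
        raise ValueError("Cohen's d is undefined when both samples have zero variance")
    return (mean(test) - mean(baseline)) / sqrt(pooled_var)
```

A positive result means the test sample has the higher mean; the magnitude expresses the difference in units of the pooled standard deviation.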

energytrackr.plot.builtin_data_transforms.filter_outliers module

Data transform to filter out outlier commits.

class energytrackr.plot.builtin_data_transforms.filter_outliers.FilterOutliers(**params: dict[str, Any])[source]

Bases: Transform, Configurable[OutlierFilterConfig]

Filter out transient outlier commits from the DataFrame.

property agg: str

Get the aggregation function used for energy measurements.

apply(ctx: Context) -> None[source]

Filters out transient outlier commits from the DataFrame stored in the context artefacts.

Parameters:

ctx (Context) – The context containing the DataFrame and artefacts.

property commit_col: str

Get the name of the commit column in the DataFrame.

property energy_col: str

Get the name of the energy column in the DataFrame.

property max_run_length: int

Get the maximum run length for transient outlier detection.

property multiplier: float

Get the multiplier for the interquartile range (IQR) in outlier detection.

property window: int

Get the rolling window size for outlier detection.

class energytrackr.plot.builtin_data_transforms.filter_outliers.OutlierFilterConfig(window: int = 20, multiplier: float = 1.5, max_run_length: int = 2, agg: str = 'median', commit_col: str = 'commit_hash', energy_col: str = 'energy_median', min_energy_threshold: float | None = None)[source]

Bases: object

Configuration for filtering energy outliers.

agg: str = 'median'
commit_col: str = 'commit_hash'
energy_col: str = 'energy_median'
max_run_length: int = 2
min_energy_threshold: float | None = None
multiplier: float = 1.5
window: int = 20
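
The window, multiplier, and max_run_length options suggest a rolling IQR rule combined with a run-length check. The sketch below is one plausible reading, not the library's algorithm: a value is flagged when it falls outside Q1 - multiplier*IQR or Q3 + multiplier*IQR of the last window accepted values, and a run of flagged values is treated as transient only if it is at most max_run_length long. The function name is hypothetical, and agg, commit_col, energy_col, and min_energy_threshold are not modelled.

```python
from statistics import quantiles


def transient_outlier_mask(values, window=20, multiplier=1.5, max_run_length=2):
    """Return a list of booleans, True where a value is a short-lived outlier."""
    accepted = []  # values currently trusted as the baseline
    run = []       # indices of the current run of flagged values
    transient = [False] * len(values)
    for i, value in enumerate(values):
        history = accepted[-window:]
        is_outlier = False
        if len(history) >= 2:  # quantiles() needs at least two points
            q1, _, q3 = quantiles(history, n=4)
            spread = multiplier * (q3 - q1)
            is_outlier = value < q1 - spread or value > q3 + spread
        if is_outlier:
            run.append(i)
            if len(run) > max_run_length:
                # Sustained shift: keep these values and adopt them as the new baseline.
                accepted = [values[k] for k in run]
                run = []
        else:
            # The run ended while still short, so its members were transient.
            for k in run:
                transient[k] = True
            run = []
            accepted.append(value)
    # A run still open at the end is left unmarked: its final length is unknown.
    return transient
```

With this rule, a single spike between stable measurements is marked for removal, while a lasting change in level is kept so that later change detection can still see it.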

energytrackr.plot.builtin_data_transforms.load_csv module

Load a CSV file into a DataFrame.

class energytrackr.plot.builtin_data_transforms.load_csv.LoadCSV(**params: dict[str, Any])[source]

Bases: Transform, Configurable[LoadCSVConfig]

Loads the CSV at ctx.input_path into ctx.artefacts['df'].

  • If csv_columns is passed via params or constructor, uses that list.

  • Otherwise falls back to settings.energytrackr.data.csv_columns.

apply(ctx: Context) -> None[source]

Load the CSV file into a DataFrame and store it in ctx.artefacts['df'].

The CSV file is read from ctx.input_path and the DataFrame is created with the configured column names. If no column names are provided, the transform falls back to settings.energytrackr.data.csv_columns.

Parameters:

ctx (Context) – The context object containing the input path and artefacts.
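
The column-name fallback described above can be sketched with pandas as follows. The function name load_csv, the DEFAULT_COLUMNS list (standing in for settings.energytrackr.data.csv_columns), and the assumption that the CSV has no header row are all hypothetical.

```python
import io

import pandas as pd

# Hypothetical stand-in for settings.energytrackr.data.csv_columns.
DEFAULT_COLUMNS = ["commit", "energy-pkg", "energy-core", "energy-gpu"]


def load_csv(source, csv_columns=None) -> pd.DataFrame:
    """Read a header-less CSV, naming columns from csv_columns or the default list."""
    names = csv_columns if csv_columns else DEFAULT_COLUMNS
    # header=None tells pandas the first row is data, not column names (assumed layout).
    return pd.read_csv(source, header=None, names=names)


# Usage: an in-memory file stands in for the path held in ctx.input_path.
df = load_csv(io.StringIO("abc123,1.5,0.5,0.1\n"))
```

Passing an explicit list, for example load_csv(path, ["commit", "energy"]), overrides the default in the same way the csv_columns parameter is described to.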

class energytrackr.plot.builtin_data_transforms.load_csv.LoadCSVConfig(csv_columns: list[str] = <factory>)[source]

Bases: object

Configuration for loading a CSV file into a DataFrame.

csv_columns: list[str]

Module contents

Builtin data transforms for the energytrackr package.