energytrackr.plot.builtin_data_transforms package
Submodules
energytrackr.plot.builtin_data_transforms.commit_details module
Fetch commit metadata for each valid commit.
energytrackr.plot.builtin_data_transforms.commit_stats module
Commit statistics transform for energy data analysis.
- class energytrackr.plot.builtin_data_transforms.commit_stats.CommitStats(**params: dict[str, Any])
Bases: Transform, Configurable[CommitStatsConfig]
Compute commit statistics for a given column in the DataFrame.
Groups the DataFrame by commit, computes the median, standard deviation, and count, filters out commits with too few measurements, and writes:
- ctx.stats["valid_commits"] -> list of commit hashes
- ctx.stats["short_hashes"] -> list of 7-character hashes
- ctx.stats["x_indices"] -> [0, 1, 2, …]
- ctx.stats["medians"] -> list of median values
- ctx.stats["y_errors"] -> list of standard deviations (NaN -> 0)
- ctx.stats["df_median"] -> the merged DataFrame
- apply(ctx: Context) → None
Apply the CommitStats transform to the context.
Computes statistics for the specified column in the DataFrame and updates the context with the results.
- Parameters:
ctx (Context) – The context containing the DataFrame and other artefacts.
- Raises:
CommitStatsMissingOrEmptyDataFrameError – If the DataFrame is missing or empty, or if the specified column is not found.
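The grouping and filtering that CommitStats describes can be sketched with the standard library alone. This is an illustration under stated assumptions, not the package's implementation: the real transform works on a DataFrame, and the function name `commit_stats`, the `min_measurements` parameter, and its default of 2 are hypothetical.

```python
from collections import defaultdict
from statistics import median, stdev


def commit_stats(rows, min_measurements=2):
    """Group (commit, value) pairs by commit and summarise each group.

    min_measurements is a hypothetical threshold, not the library's setting.
    """
    groups = defaultdict(list)
    for commit, value in rows:
        groups[commit].append(value)  # insertion order keeps first-seen commit order

    stats = {"valid_commits": [], "short_hashes": [], "medians": [], "y_errors": []}
    for commit, values in groups.items():
        if len(values) < min_measurements:
            continue  # too few measurements: drop the commit
        stats["valid_commits"].append(commit)
        stats["short_hashes"].append(commit[:7])
        stats["medians"].append(median(values))
        # A single value has no sample std; report 0, mirroring the NaN -> 0 rule.
        stats["y_errors"].append(stdev(values) if len(values) > 1 else 0.0)
    stats["x_indices"] = list(range(len(stats["valid_commits"])))
    return stats


rows = [
    ("a1b2c3d4e5", 10.0), ("a1b2c3d4e5", 12.0), ("a1b2c3d4e5", 14.0),
    ("ffff000011", 9.0),  # only one measurement, so this commit is filtered out
    ("0123456789", 20.0), ("0123456789", 22.0),
]
result = commit_stats(rows)
```

Here `result["valid_commits"]` holds the two commits that have enough measurements, and `result["x_indices"]` numbers them from 0 in order.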
energytrackr.plot.builtin_data_transforms.compute_distribution module
Compute the distribution of a given column for each commit.
- class energytrackr.plot.builtin_data_transforms.compute_distribution.ComputeDistribution(**params: dict[str, Any])
Bases: Transform, Configurable[ComputeDistributionConfig]
Compute the distribution of a given column for each commit.
From ctx.stats['valid_commits'] and ctx.artefacts['df'], build distributions and normality_flags in ctx.artefacts.
- apply(ctx: Context) → None
Computes and stores the value distributions and normality flags for each commit.
For each commit in the provided context, extracts the values of the specified column, computes their distribution, and tests for normality using the Shapiro-Wilk test if the number of values meets the minimum threshold. The results are stored in the context’s artefacts under “distributions” and “normality_flags”.
- Parameters:
ctx (Context) – The context object containing artefacts, statistics, and configuration.
- Side Effects:
- Updates ctx.artefacts with:
“distributions”: List of numpy arrays, each containing the values for a commit.
“normality_flags”: List of booleans indicating if the distribution is normal (True) or not (False) for each commit.
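The threshold-gated normality check described above might look like the following sketch. The defaults 3 and 0.05 come from ComputeDistributionConfig; everything else is an assumption made for illustration: the function name, the injected `p_value_of` callable (which keeps the sketch free of third-party imports), the `>=` comparison, and the choice to flag under-sampled commits as normal.

```python
def compute_distributions(values_by_commit, valid_commits, p_value_of,
                          min_values_for_normality=3, normality_p=0.05):
    """Return (distributions, normality_flags), both aligned with valid_commits."""
    distributions, normality_flags = [], []
    for commit in valid_commits:
        values = list(values_by_commit.get(commit, []))
        distributions.append(values)
        if len(values) >= min_values_for_normality:
            # Treated as normal when the test does not reject normality.
            normality_flags.append(p_value_of(values) >= normality_p)
        else:
            # Assumption: too few values to test, so default the flag to True.
            normality_flags.append(True)
    return distributions, normality_flags


def fake_p_value(values):
    """Stand-in for a real test such as scipy.stats.shapiro(values).pvalue."""
    return 0.01 if max(values) - min(values) > 5 else 0.5


data = {"aaa": [1.0, 1.1, 0.9], "bbb": [1.0, 9.0, 1.2], "ccc": [2.0]}
dists, flags = compute_distributions(data, ["aaa", "bbb", "ccc"], fake_p_value)
```

With SciPy installed, passing `lambda v: scipy.stats.shapiro(v).pvalue` as `p_value_of` would run an actual Shapiro-Wilk test in place of the stand-in.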
- class energytrackr.plot.builtin_data_transforms.compute_distribution.ComputeDistributionConfig(column: str | None = None, min_values_for_normality: int = 3, normality_p: float = 0.05)
Bases: object
Configuration for computing distributions of energy measurements.
- min_values_for_normality: int = 3
- normality_p: float = 0.05
energytrackr.plot.builtin_data_transforms.detect_changes module
Detect changes in distributions of energy data over time.
- class energytrackr.plot.builtin_data_transforms.detect_changes.ChangeEvent(index: int, direction: str, p_value: float, effect_size: EffectSize, change_magnitude: ChangeMagnitude, context_tags: list[str] | None, level: int)
Bases: object
Represents a change event with associated metadata.
- change_magnitude: ChangeMagnitude
- context_tags: list[str] | None
- direction: str
- effect_size: EffectSize
- index: int
- level: int
- p_value: float
- class energytrackr.plot.builtin_data_transforms.detect_changes.ChangeMagnitude(pct_change: float, pct_change_level: str, abs_diff: float, practical_level: str)
Bases: object
Represents the magnitude of change between two values, including both percentage and absolute differences.
- pct_change
The percentage change between two values.
- Type:
float
- pct_change_level
A qualitative description of the percentage change (e.g., ‘low’, ‘moderate’, ‘high’).
- Type:
str
- abs_diff
The absolute difference between two values.
- Type:
float
- practical_level
A qualitative assessment of the practical significance of the change.
- Type:
str
- abs_diff: float
- pct_change: float
- pct_change_level: str
- practical_level: str
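As a rough illustration of how the four ChangeMagnitude fields relate, the sketch below derives them from a baseline and a test median. The `Magnitude` class is a hypothetical stand-in for ChangeMagnitude. The cutoffs, the "negligible"/"notable" labels, and the use of an unsigned absolute difference are all assumptions; the library's actual thresholds are not documented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Magnitude:  # hypothetical stand-in for ChangeMagnitude
    pct_change: float
    pct_change_level: str
    abs_diff: float
    practical_level: str


def describe_change(baseline, test, pct_cutoffs=(5.0, 20.0), practical_cutoff=1.0):
    """Build a Magnitude from two medians; every cutoff here is illustrative."""
    if baseline == 0:
        raise ValueError("percentage change is undefined for a zero baseline")
    pct = (test - baseline) / baseline * 100.0
    abs_diff = abs(test - baseline)  # assumption: unsigned difference
    low, high = pct_cutoffs
    if abs(pct) < low:
        pct_level = "low"
    elif abs(pct) < high:
        pct_level = "moderate"
    else:
        pct_level = "high"
    practical = "negligible" if abs_diff < practical_cutoff else "notable"
    return Magnitude(pct, pct_level, abs_diff, practical)


m = describe_change(baseline=50.0, test=56.0)
```

Keeping the percentage and absolute views side by side matters because a large relative change on a tiny baseline can still be practically irrelevant.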
- class energytrackr.plot.builtin_data_transforms.detect_changes.DetectChanges(**params: dict[str, Any])
Bases: Transform, Configurable[DetectChangesConfig]
Detect changes in distributions of energy data over time.
- apply(ctx: Context) → None
Apply the transform, using a helper to process each pair.
- Parameters:
ctx (Context) – The context containing the DataFrame and other artefacts.
- classify_effect_size(d: float) → str
Classify the effect size based on Cohen’s d value.
- Parameters:
d (float) – The Cohen’s d effect size.
- Returns:
The category of the effect size.
- Return type:
str
- classify_pct_change(pct: float) → str
Classify the percent change based on predefined thresholds.
- Parameters:
pct (float) – The percent change value.
- Returns:
The category of the percent change.
- Return type:
str
- classify_practical(abs_diff: float, baseline_median: float) → str
Classify the practical significance based on absolute difference and baseline median.
- Parameters:
abs_diff (float) – The absolute difference between test and baseline medians.
baseline_median (float) – The median of the baseline sample.
- Returns:
The category of practical significance.
- Return type:
str
- detect_level_5(ctx: Context, commit: str) → bool
Detect level 5 changes based on context tags.
- Parameters:
ctx (Context) – The context containing the DataFrame and other artefacts.
commit (str) – The commit hash to check for level 5 changes.
- Returns:
True if level 5 changes are detected, False otherwise.
- Return type:
bool
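For readers unfamiliar with Cohen's d, the following sketch shows one common way to compute it (pooled sample standard deviation) and to bucket it into categories. The cutoffs 0.2, 0.5, and 0.8 are Cohen's conventional benchmarks, used here as an assumption; DetectChanges takes its thresholds from configuration, and its exact formula and labels may differ.

```python
from math import copysign, inf, sqrt
from statistics import mean, variance


def cohen_d(baseline, test):
    """Cohen's d using a pooled sample variance; each sample needs >= 2 values."""
    n1, n2 = len(baseline), len(test)
    diff = mean(test) - mean(baseline)
    pooled_var = ((n1 - 1) * variance(baseline) + (n2 - 1) * variance(test)) / (n1 + n2 - 2)
    if pooled_var == 0:
        # No spread at all: identical samples give 0, otherwise d is unbounded.
        return 0.0 if diff == 0 else copysign(inf, diff)
    return diff / sqrt(pooled_var)


def classify_effect_size(d, cutoffs=(0.2, 0.5, 0.8)):
    """Map |d| to a label; the cutoffs are conventional, not the library's."""
    size = abs(d)
    small, medium, large = cutoffs
    if size < small:
        return "negligible"
    if size < medium:
        return "small"
    if size < large:
        return "medium"
    return "large"


baseline = [10.0, 11.0, 12.0, 13.0, 14.0]
test = [12.0, 13.0, 14.0, 15.0, 16.0]
d = cohen_d(baseline, test)
category = classify_effect_size(d)
direction = "increase" if d > 0 else "decrease"
```

Because d is standardized by the spread of the samples, it complements the raw percentage change: the same 2-unit shift counts as a large effect on tight measurements and a small one on noisy measurements.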
- class energytrackr.plot.builtin_data_transforms.detect_changes.DetectChangesConfig(column: str | None = None, thresholds: dict | None = None, tags: list[str] | None = None)
Bases: object
Configuration for detecting changes in distributions of energy measurements.
- column: str | None = None
- tags: list[str] | None = None
- thresholds: dict | None = None
- class energytrackr.plot.builtin_data_transforms.detect_changes.EffectSize(cohen_d: float, category: str)
Bases: object
Represents the effect size of a statistical comparison.
- cohen_d
The calculated Cohen’s d value indicating the standardized difference between two means.
- Type:
float
- category
A qualitative description of the effect size (e.g., ‘small’, ‘medium’, ‘large’).
- Type:
str
- category: str
- cohen_d: float
energytrackr.plot.builtin_data_transforms.filter_outliers module
Data transform to filter out outlier commits.
- class energytrackr.plot.builtin_data_transforms.filter_outliers.FilterOutliers(**params: dict[str, Any])
Bases: Transform, Configurable[OutlierFilterConfig]
Filter out transient outlier commits from the DataFrame.
- property agg: str
Get the aggregation function used for energy measurements.
- apply(ctx: Context) → None
Filters out transient outlier commits from the DataFrame stored in the context artefacts.
- Parameters:
ctx (Context) – The context containing the DataFrame and artefacts.
- property commit_col: str
Get the name of the commit column in the DataFrame.
- property energy_col: str
Get the name of the energy column in the DataFrame.
- property max_run_length: int
Get the maximum run length for transient outlier detection.
- property multiplier: float
Get the multiplier for the interquartile range (IQR) in outlier detection.
- property window: int
Get the rolling window size for outlier detection.
- class energytrackr.plot.builtin_data_transforms.filter_outliers.OutlierFilterConfig(window: int = 20, multiplier: float = 1.5, max_run_length: int = 2, agg: str = 'median', commit_col: str = 'commit_hash', energy_col: str = 'energy_median', min_energy_threshold: float | None = None)
Bases: object
Configuration for filtering energy outliers.
- agg: str = 'median'
- commit_col: str = 'commit_hash'
- energy_col: str = 'energy_median'
- max_run_length: int = 2
- multiplier: float = 1.5
- window: int = 20
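One plausible reading of how window, multiplier, and max_run_length interact is sketched below. It works on a plain list of per-commit energy values. Each point is flagged when it falls outside an IQR fence computed over its neighbourhood, and flagged runs no longer than max_run_length are then dropped as transient. The centred window, the helper names, and the decision to keep longer runs as sustained shifts are assumptions, not the documented behaviour of FilterOutliers.

```python
from statistics import quantiles


def flag_outliers(values, window=20, multiplier=1.5):
    """Flag points outside the IQR fence of their centred neighbourhood.

    Assumes window >= 2 and at least two values overall.
    """
    half = window // 2
    flags = []
    for i, value in enumerate(values):
        neighbours = values[max(0, i - half): i + half + 1]
        q1, _, q3 = quantiles(neighbours, n=4, method="inclusive")
        iqr = q3 - q1
        flags.append(value < q1 - multiplier * iqr or value > q3 + multiplier * iqr)
    return flags


def drop_transient_runs(values, flags, max_run_length=2):
    """Remove flagged runs of at most max_run_length points; keep longer runs."""
    kept, i = [], 0
    while i < len(values):
        if not flags[i]:
            kept.append(values[i])
            i += 1
            continue
        j = i
        while j < len(values) and flags[j]:
            j += 1  # extend to the end of the flagged run
        if j - i > max_run_length:
            kept.extend(values[i:j])  # sustained shift, not a transient spike
        i = j
    return kept


energy = [10.0, 10.2, 9.9, 10.1, 50.0, 10.0, 10.3, 9.8, 10.1, 10.0]
flags = flag_outliers(energy, window=6)
cleaned = drop_transient_runs(energy, flags)
```

The run-length rule is what separates a one-off measurement glitch from a genuine regression: a spike that disappears within a commit or two is discarded, while a level change that persists is left in place for later change detection.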
energytrackr.plot.builtin_data_transforms.load_csv module
Load a CSV file into a DataFrame.
- class energytrackr.plot.builtin_data_transforms.load_csv.LoadCSV(**params: dict[str, Any])
Bases: Transform, Configurable[LoadCSVConfig]
Loads the CSV at ctx.input_path into ctx.artefacts['df'].
If csv_columns is passed via params or constructor, uses that list.
Otherwise falls back to settings.energytrackr.data.csv_columns.
- apply(ctx: Context) → None
Load the CSV file into a DataFrame and store it in ctx.artefacts['df'].
The file is read from ctx.input_path and the DataFrame is created with the configured column names. If none are provided, the names default to settings.energytrackr.data.csv_columns. The resulting DataFrame is available in ctx.artefacts['df'] for later transforms.
- Parameters:
ctx (Context) – The context object containing the input path and artefacts.
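The column-name fallback can be illustrated with the standard csv module. This sketch assumes a header-less CSV and returns a list of dicts, not a DataFrame. `DEFAULT_COLUMNS` is a hypothetical placeholder for whatever settings.energytrackr.data.csv_columns contains, and `load_rows` is not part of the package.

```python
import csv
import io

# Hypothetical default; the real list lives in settings.energytrackr.data.csv_columns.
DEFAULT_COLUMNS = ["commit", "energy-pkg", "energy-core", "energy-gpu"]


def load_rows(stream, csv_columns=None):
    """Read a header-less CSV, naming fields from csv_columns or the default."""
    columns = csv_columns or DEFAULT_COLUMNS
    reader = csv.DictReader(stream, fieldnames=columns)
    return list(reader)


# Explicit column names take precedence over the default list.
custom = load_rows(io.StringIO("abc1234,2.0\n"), csv_columns=["hash", "joules"])

# Without csv_columns, the default names are applied.
default = load_rows(io.StringIO("abc1234,1.5,0.7,0.1\n"))
```

Values arrive as strings in this sketch; a DataFrame-based loader would typically infer numeric types for the energy columns.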
Module contents
Builtin page sections for the energytrackr package.