AnalystPath

Branch Energy Spend vs Chain Average

Pandas · Hard · Senior level · ~10 min

Problem

Two DataFrames track store utility costs: `energy_bill(bill_id, store_id, cost, billed_on)` with one bill per store per billing date, and `store(store_id, region_id)` mapping each store to a region.

For each billing month and region, compare the region's average bill `cost` that month against the whole chain's average `cost` that month and label it:

- `higher` — the region average is above the chain average,
- `lower` — below,
- `same` — equal.

Return one row per month/region with columns `bill_month` (formatted `YYYY-MM`), `region_id`, and `verdict`, ordered by `bill_month` descending then `region_id` ascending.

Input data

Example rows — the live problem includes the full dataset.

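energy_bill

| bill_id | store_id | cost | billed_on |
| --- | --- | --- | --- |
| 1 | 1 | 9000 | 2024-03-31 |
| 2 | 2 | 6000 | 2024-03-31 |
| 3 | 3 | 10000 | 2024-03-31 |
| 4 | 1 | 7000 | 2024-02-29 |
| 5 | 2 | 6000 | 2024-02-29 |

store

| store_id | region_id |
| --- | --- |
| 1 | 1 |
| 2 | 2 |
| 3 | 2 |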

Expected output

Your answer should return 4 rows with the columns bill_month, region_id, verdict.

Starter code (Pandas (Python))

```python
import pandas as pd

def region_vs_chain_spend(energy_bill: pd.DataFrame, store: pd.DataFrame) -> pd.DataFrame:
    # Your code here
    return energy_bill
```
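
One possible way to fill in the starter function is sketched below. It is an illustration under stated assumptions, not necessarily the site's reference solution. The assumptions: both averages are taken over individual bills (not over per-store averages), bills whose `store_id` has no match in `store` are dropped from both averages, and `same` means exact equality of the two means. The intermediate names `region_avg` and `chain_avg` are made up for this sketch.

```python
import numpy as np
import pandas as pd


def region_vs_chain_spend(energy_bill: pd.DataFrame, store: pd.DataFrame) -> pd.DataFrame:
    # Attach each bill's region. Assumption: an inner join, so bills from
    # stores missing in `store` are excluded from every average.
    bills = energy_bill.merge(store, on="store_id", how="inner")

    # Build the YYYY-MM label; to_datetime also copes with string dates.
    bills["bill_month"] = pd.to_datetime(bills["billed_on"]).dt.strftime("%Y-%m")

    # Chain average: mean cost over all bills in the month.
    chain_avg = (
        bills.groupby("bill_month", as_index=False)["cost"]
        .mean()
        .rename(columns={"cost": "chain_avg"})
    )

    # Region average: mean cost over the region's bills in the month.
    region_avg = (
        bills.groupby(["bill_month", "region_id"], as_index=False)["cost"]
        .mean()
        .rename(columns={"cost": "region_avg"})
    )

    out = region_avg.merge(chain_avg, on="bill_month")

    # Label each month/region pair; anything neither above nor below is "same".
    out["verdict"] = np.select(
        [out["region_avg"] > out["chain_avg"], out["region_avg"] < out["chain_avg"]],
        ["higher", "lower"],
        default="same",
    )

    # Newest month first, then region ascending.
    out = out.sort_values(["bill_month", "region_id"], ascending=[False, True])
    return out[["bill_month", "region_id", "verdict"]].reset_index(drop=True)


# The example rows from the problem statement (not the full dataset).
energy_bill = pd.DataFrame({
    "bill_id": [1, 2, 3, 4, 5],
    "store_id": [1, 2, 3, 1, 2],
    "cost": [9000, 6000, 10000, 7000, 6000],
    "billed_on": ["2024-03-31", "2024-03-31", "2024-03-31", "2024-02-29", "2024-02-29"],
})
store = pd.DataFrame({"store_id": [1, 2, 3], "region_id": [1, 2, 2]})

result = region_vs_chain_spend(energy_bill, store)
print(result)
```

On the example rows alone, the March chain average is (9000 + 6000 + 10000) / 3 ≈ 8333.33, so region 1 (9000) is `higher` and region 2 (8000) is `lower`. The February chain average is 6500, so region 1 (7000) is `higher` and region 2 (6000) is `lower`. If the grader tolerates floating-point noise, the exact comparison could be swapped for a tolerance check such as `np.isclose`; whether that is needed is not stated in the problem.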
