AnalystPath

Solar Panels Generating More Per Sun-Hour

Pandas · Medium · Mid level · ~10 min

Problem

You are given two DataFrames. `panels` has columns `panel_id` and `site_label`. `generation_logs` has columns `log_id`, `panel_id`, `logged_on`, `energy_kwh`, and `sun_hours`; the yield for a log is `energy_kwh` divided by `sun_hours`.

Find every panel whose average yield improved from the first half of the year (January through June) to the second half (July through December). Only include panels that have at least one log in each half. Report the gain as the second-half average yield minus the first-half average yield, rounded to two decimals.

Return a DataFrame with columns `site_label` and `yield_gain`, ordered by `yield_gain` descending, then `site_label` ascending.

Input data

Example rows — the live problem includes the full dataset.

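panels

| panel_id | site_label |
|----------|------------|
| 1        | Rooftop A  |
| 2        | Rooftop B  |
| 3        | Carport C  |
| 4        | Field D    |
| 5        | Field E    |

generation_logs

| log_id | panel_id | logged_on  | energy_kwh | sun_hours |
|--------|----------|------------|------------|-----------|
| 1      | 1        | 2024-02-10 | 40         | 8         |
| 2      | 1        | 2024-03-10 | 45         | 9         |
| 3      | 1        | 2024-08-10 | 70         | 7         |
| 4      | 1        | 2024-09-10 | 60         | 6         |
| 5      | 2        | 2024-01-15 | 30         | 6         |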
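
To make the yield arithmetic concrete, here is a small sketch that rebuilds only the five example log rows and averages the per-log yield for each half of the year. The `yield` and `half` column names and the `H1`/`H2` labels are illustrative choices, not part of the problem statement.

```python
import pandas as pd

# Only the five example rows from the problem page, not the full dataset.
generation_logs = pd.DataFrame({
    "log_id": [1, 2, 3, 4, 5],
    "panel_id": [1, 1, 1, 1, 2],
    "logged_on": ["2024-02-10", "2024-03-10", "2024-08-10",
                  "2024-09-10", "2024-01-15"],
    "energy_kwh": [40, 45, 70, 60, 30],
    "sun_hours": [8, 9, 7, 6, 6],
})

logs = generation_logs.copy()
# Per-log yield as defined in the problem: energy divided by sun hours.
logs["yield"] = logs["energy_kwh"] / logs["sun_hours"]

# Months 1-6 are the first half (H1), months 7-12 the second half (H2).
month = pd.to_datetime(logs["logged_on"]).dt.month
logs["half"] = month.le(6).map({True: "H1", False: "H2"})

# Average yield per panel and half; a half with no logs shows up as NaN.
half_avg = logs.groupby(["panel_id", "half"])["yield"].mean().unstack()
print(half_avg)
```

In these rows, panel 1 averages 5.0 in the first half and 10.0 in the second, a gain of 5.00. Panel 2 has no second-half log among the example rows, so on this sample alone it would be excluded.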

Expected output

Your answer should return 3 rows with the columns `site_label` and `yield_gain`.

Starter code (Pandas (Python))

import pandas as pd

def solar_panels_generating_more_per_sun_hour(panels, generation_logs) -> pd.DataFrame:
    # Your code here
    return panels
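
One possible way to fill in the starter function is sketched below. It is not the site's official solution. It assumes that `sun_hours` is always positive, that `logged_on` can be parsed by `pd.to_datetime`, and that "improved" means the unrounded second-half average is strictly greater than the first-half average. The intermediate names `first`, `second`, and `yield` are illustrative.

```python
import pandas as pd

def solar_panels_generating_more_per_sun_hour(panels, generation_logs) -> pd.DataFrame:
    logs = generation_logs.copy()
    # Per-log yield (assumes sun_hours is never zero).
    logs["yield"] = logs["energy_kwh"] / logs["sun_hours"]

    # Label each log by half of the year: Jan-Jun vs. Jul-Dec.
    month = pd.to_datetime(logs["logged_on"]).dt.month
    logs["half"] = month.le(6).map({True: "first", False: "second"})

    # One row per panel, one column per half; a missing half becomes NaN,
    # so dropna() keeps only panels with at least one log in each half.
    avg = (
        logs.groupby(["panel_id", "half"])["yield"].mean()
        .unstack()
        .reindex(columns=["first", "second"])
        .rename_axis(columns=None)
        .dropna()
    )
    avg["yield_gain"] = (avg["second"] - avg["first"]).round(2)

    # Keep only panels whose average yield actually went up.
    improved = avg.loc[avg["second"] > avg["first"], ["yield_gain"]].reset_index()

    # Attach the site label and apply the required ordering.
    result = improved.merge(panels, on="panel_id", how="inner")
    return (
        result[["site_label", "yield_gain"]]
        .sort_values(["yield_gain", "site_label"], ascending=[False, True])
        .reset_index(drop=True)
    )
```

The `reindex(columns=[...])` step is there so the function still works if the input happens to contain no logs at all for one half; without it, the missing column would raise a `KeyError`.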
