Average Listening Sessions per Listener
Problem
You are given a DataFrame `plays` loaded from `plays.csv` with the columns `listener_id`, `session_ref`, `play_day` and `track_genre`. There is no single-column key; each row is one track play inside a listening session.
For the 30-day window ending **2024-09-30 inclusive** (from 2024-09-01 to 2024-09-30), return the average number of distinct sessions per distinct listener, rounded to 2 decimal places. If there are no rows in the window the answer should be 0.00. Name the single output column `avg_sessions_per_listener`.
Example `plays`:
```text
listener_id session_ref play_day track_genre
1 100 2024-09-01 jazz
1 100 2024-09-01 jazz
1 101 2024-09-02 rock
2 200 2024-09-10 pop
```
Expected result is `1.5`: there are 3 distinct sessions (100, 101, 200) across 2 distinct listeners, so 3 / 2 = 1.5.
Input data
Example rows — the live problem includes the full dataset.
| listener_id | session_ref | play_day | track_genre |
|---|---|---|---|
| 1 | 100 | 2024-09-01 | jazz |
| 1 | 100 | 2024-09-01 | jazz |
| 1 | 101 | 2024-09-02 | rock |
| 2 | 200 | 2024-09-10 | pop |
Expected output
Your answer should return 1 row with the columns avg_sessions_per_listener.
Starter code (Pandas (Python))
import pandas as pd
def avg_sessions(plays) -> pd.DataFrame:
# Your code here
return playsSolve this Pandas question free
Write Pandas (Python) and run it instantly in your browser — even on your phone. No signup needed to try.
Solution & explanation
Create a free account to unlock the optimal solution, a step-by-step explanation, and the hidden test cases that grade your answer.
Sign up free to unlock