Listener Engagement Profile on a Music Service
Problem
You are given `plays` (from `plays.csv`: `play_id`, `listener_id`, `track_id`, `played_on` date, `seconds_listened` float) and `tracks` (from `tracks.csv`: `track_id`, `genre`, `duration` float). Inner-join plays to tracks on `track_id`, then for each listener compute: `total_seconds` = round(sum of seconds_listened, 2); `play_count` = number of plays; `unique_genres` = count of distinct genres; `avg_seconds` = round(mean seconds_listened, 2); `top_genre` = the genre the listener played most often, breaking ties by the most recent `played_on`; and `engagement_score` = round(play_count*10 + total_seconds/100, 2). Return columns `listener_id`, `total_seconds`, `play_count`, `unique_genres`, `avg_seconds`, `top_genre`, `engagement_score`, sorted by `engagement_score` descending then `listener_id` ascending.
Input data
Example rows — the live problem includes the full dataset.
| track_id | genre | duration |
|---|---|---|
| 1 | Jazz | 240.00 |
| 2 | Jazz | 200.00 |
| 3 | Techno | 300.00 |
| 4 | Folk | 180.00 |
| play_id | listener_id | track_id | played_on | seconds_listened |
|---|---|---|---|---|
| 1 | 500 | 1 | 2024-01-05 | 120.00 |
| 2 | 500 | 2 | 2024-01-08 | 200.00 |
| 3 | 500 | 3 | 2024-02-01 | 300.00 |
| 4 | 600 | 3 | 2024-01-10 | 100.00 |
| 5 | 600 | 4 | 2024-03-15 | 150.00 |
Expected output
Your answer should return 2 rows with the columns listener_id, total_seconds, play_count, unique_genres, avg_seconds, top_genre, engagement_score.
Starter code (Pandas (Python))
import pandas as pd
def listener_profile(plays, tracks) -> pd.DataFrame:
# Your code here
return playsSolve this Pandas question free
Write Pandas (Python) and run it instantly in your browser — even on your phone. No signup needed to try.
Solution & explanation
Create a free account to unlock the optimal solution, a step-by-step explanation, and the hidden test cases that grade your answer.
Sign up free to unlock