Next-Day Workout Retention
Problem
A fitness app records each workout session a member logs in a DataFrame `workout_log` (`workout_log.csv`).
```text
+-------------+--------+
| Column | Type |
+-------------+--------+
| member_id | int |
| device_id | int |
| log_date | object |
| minutes | int |
+-------------+--------+
```
There may be several rows per member. Each row records one day on which the member logged a workout. A member's signup date is the earliest `log_date` they have.
Write a function that reports, for each signup date, how many members signed up on that date (`signups`) and the fraction of those members who logged another workout on the very next day (`next_day_rate`), rounded to 2 decimal places. Return columns `signup_date`, `signups`, and `next_day_rate` in any order.
Input data
Example rows — the live problem includes the full dataset.
| member_id | device_id | log_date | minutes |
|---|---|---|---|
| 1 | 2 | 2022-03-01 | 45 |
| 1 | 2 | 2022-03-02 | 30 |
| 2 | 3 | 2022-04-10 | 20 |
| 3 | 1 | 2022-03-01 | 0 |
| 3 | 4 | 2022-06-15 | 50 |
Expected output
Your answer should return 3 rows with the columns signup_date, signups, next_day_rate.
Starter code (Pandas (Python))
import pandas as pd
def next_day_retention(workout_log: pd.DataFrame) -> pd.DataFrame:
# Your code here
return workout_logSolve this Pandas question free
Write Pandas (Python) and run it instantly in your browser — even on your phone. No signup needed to try.
Solution & explanation
Create a free account to unlock the optimal solution, a step-by-step explanation, and the hidden test cases that grade your answer.
Sign up free to unlock