AnalystPath

Daily Distinct Gym Visitors

Pandas · Easy · Junior level · ~10 min

Problem

You are given a DataFrame `checkins` loaded from `checkins.csv` with the columns `member_id`, `visit_id`, `visit_day` and `zone`. There is no single-column key: a member can pass through the turnstile many times in one day, and each row is one entry.

A member is *active* on a day if they have at least one check-in that day. For each day in the 30-day window ending **2024-05-31 inclusive** (from 2024-05-02 to 2024-05-31), return the day and the number of distinct active members. Name the columns `day` and `active_members`, and only include days that actually have at least one check-in.
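As a quick sanity check on the window arithmetic, the bounds can be counted with a date range. Both endpoints are inclusive, so 2024-05-02 through 2024-05-31 should span exactly 30 calendar days. The variable names here are illustrative only.

```python
import pandas as pd

# Both endpoints are part of the window.
window_end = pd.Timestamp("2024-05-31")
window_start = window_end - pd.Timedelta(days=29)  # 30 days including the end day

# date_range includes both start and end by default.
window = pd.date_range(start=window_start, end=window_end, freq="D")

print(window_start.date())  # 2024-05-02
print(len(window))          # 30
```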

Example `checkins`:

```text
member_id visit_id visit_day zone
1 10 2024-05-02 weights
1 11 2024-05-02 cardio
2 12 2024-05-02 pool
3 13 2024-05-31 weights
```

Expected result:

```text
day active_members
2024-05-02 2
2024-05-31 1
```

On 2024-05-02 members 1 and 2 were active (member 1's two check-ins count once). On 2024-05-31 only member 3 was active.
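The following is a minimal sketch of one possible approach, not necessarily the site's reference solution: parse `visit_day`, keep only rows inside the window, then count unique `member_id` values per day. It assumes `visit_day` may arrive as strings (as it would from `checkins.csv` without date parsing) and hard-codes the window bounds from the problem statement. The example frame is built inline from the rows above.

```python
import pandas as pd

def daily_active_members(checkins: pd.DataFrame) -> pd.DataFrame:
    # Window bounds from the problem statement, inclusive on both ends.
    start = pd.Timestamp("2024-05-02")
    end = pd.Timestamp("2024-05-31")

    # Assumption: visit_day may be strings or timestamps; normalize to midnight
    # so that any time-of-day component does not split one day into many groups.
    days = pd.to_datetime(checkins["visit_day"]).dt.normalize()

    # Keep only check-ins inside the window (between is inclusive by default).
    windowed = checkins.assign(day=days)[days.between(start, end)]

    # nunique counts each member once per day, however many entries they made.
    # Days without check-ins produce no group, so they are absent from the result.
    return (
        windowed.groupby("day")["member_id"]
        .nunique()
        .reset_index(name="active_members")
        .sort_values("day", ignore_index=True)
    )

# The example rows from the problem statement.
example = pd.DataFrame({
    "member_id": [1, 1, 2, 3],
    "visit_id": [10, 11, 12, 13],
    "visit_day": ["2024-05-02", "2024-05-02", "2024-05-02", "2024-05-31"],
    "zone": ["weights", "cardio", "pool", "weights"],
})

print(daily_active_members(example))
```

In this sketch the `day` column comes back as `datetime64` values rather than strings. Whether the grader expects dates or strings is not stated in the problem, so a final formatting step may be needed.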

Input data

Example rows — the live problem includes the full dataset.

checkins
member_id  visit_id  visit_day   zone
1          10        2024-05-02  weights
1          11        2024-05-02  cardio
2          12        2024-05-02  pool
3          13        2024-05-31  weights

Expected output

Your answer should return 2 rows with the columns day, active_members.

Starter code (Pandas (Python))

import pandas as pd

def daily_active_members(checkins) -> pd.DataFrame:
    # Your code here
    return checkins


