AnalystPath

Most Recent 2022 Backup

PandasEasyJunior level~10 min

Problem

A cloud storage app logs every time a device uploads a backup. You are given a DataFrame `backups` loaded from `backups.csv`:

```text
+-----------+----------+
| Column | Type |
+-----------+----------+
| device_id | int |
| run_at | datetime |
+-----------+----------+
(device_id, run_at) is the primary key. Each row records one backup run by a device at a given moment.
```

For every device that ran at least one backup during the year **2022**, report the timestamp of its most recent 2022 backup. Devices with no 2022 backup must not appear.

Return a DataFrame with columns `device_id` and `latest_run`, in any order.

Input data

Example rows — the live problem includes the full dataset.

backups
device_idrun_at
62022-06-30 15:06:07
62023-04-21 14:06:06
62021-03-07 00:18:15
82022-02-01 05:10:53
82022-12-30 00:46:50

Expected output

Your answer should return 3 rows with the columns device_id, latest_run.

Starter code (Pandas (Python))

import pandas as pd

def most_recent_2022_backup(backups) -> pd.DataFrame:
    # Your code here
    return backups

Solve this Pandas question free

Write Pandas (Python) and run it instantly in your browser — even on your phone. No signup needed to try.

Solution & explanation

Create a free account to unlock the optimal solution, a step-by-step explanation, and the hidden test cases that grade your answer.

Sign up free to unlock

Related Pandas questions