Most Recent 2022 Backup
Problem
A cloud storage app logs every time a device uploads a backup. You are given a DataFrame `backups` loaded from `backups.csv`:
```text
+-----------+----------+
| Column | Type |
+-----------+----------+
| device_id | int |
| run_at | datetime |
+-----------+----------+
(device_id, run_at) is the primary key. Each row records one backup run by a device at a given moment.
```
For every device that ran at least one backup during the year **2022**, report the timestamp of its most recent 2022 backup. Devices with no 2022 backup must not appear.
Return a DataFrame with columns `device_id` and `latest_run`, in any order.
Input data
Example rows — the live problem includes the full dataset.
| device_id | run_at |
|---|---|
| 6 | 2022-06-30 15:06:07 |
| 6 | 2023-04-21 14:06:06 |
| 6 | 2021-03-07 00:18:15 |
| 8 | 2022-02-01 05:10:53 |
| 8 | 2022-12-30 00:46:50 |
Expected output
Your answer should return 3 rows with the columns device_id, latest_run.
Starter code (Pandas (Python))
import pandas as pd
def most_recent_2022_backup(backups) -> pd.DataFrame:
# Your code here
return backupsSolve this Pandas question free
Write Pandas (Python) and run it instantly in your browser — even on your phone. No signup needed to try.
Solution & explanation
Create a free account to unlock the optimal solution, a step-by-step explanation, and the hidden test cases that grade your answer.
Sign up free to unlock