Render Node Utilisation
Problem
A render farm logs every job in a DataFrame `render_jobs` with columns `job_id`, `node_id`, `begin_at`, and `finish_at` (the last two are datetime strings). Jobs on the same node can overlap in time.
For each node, report two figures:
- `busy_hours`: the total number of whole hours the node was busy with at least one job. Overlapping jobs are merged so shared time is counted only once, and the merged total is then floored to whole hours.
- `peak_parallel_jobs`: the largest number of jobs that ran simultaneously at any instant.
Return columns `node_id`, `busy_hours`, `peak_parallel_jobs`, ordered by `node_id`.
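One possible approach, sketched under stated assumptions rather than as the reference solution: merge each node's jobs in start order to get busy time, then sweep over start and finish events to get the peak. The sketch assumes every job has `finish_at` later than `begin_at`, and that a job finishing at the exact instant another begins does not count as running simultaneously with it. The problem statement does not settle that tie-break, so treat it as an assumption.

```python
import pandas as pd


def render_node_utilisation(render_jobs: pd.DataFrame) -> pd.DataFrame:
    jobs = render_jobs.copy()
    # The timestamps arrive as strings, so parse them first.
    jobs["begin_at"] = pd.to_datetime(jobs["begin_at"])
    jobs["finish_at"] = pd.to_datetime(jobs["finish_at"])

    rows = []
    for node_id, grp in jobs.groupby("node_id", sort=True):
        grp = grp.sort_values("begin_at")

        # busy_hours: walk jobs in start order, merging overlapping spans.
        busy = pd.Timedelta(0)
        cur_start = cur_end = None
        for begin, finish in zip(grp["begin_at"], grp["finish_at"]):
            if cur_end is None or begin > cur_end:
                # Gap before this job: close the previous merged span.
                if cur_end is not None:
                    busy += cur_end - cur_start
                cur_start, cur_end = begin, finish
            else:
                cur_end = max(cur_end, finish)
        if cur_end is not None:
            busy += cur_end - cur_start

        # peak_parallel_jobs: +1 at each start, -1 at each finish.
        events = [(t, 1) for t in grp["begin_at"]]
        events += [(t, -1) for t in grp["finish_at"]]
        # Tuples sort by time, then delta, so a finish (-1) is processed
        # before a start (+1) at the same instant (assumed tie-break).
        events.sort()
        running = peak = 0
        for _, delta in events:
            running += delta
            peak = max(peak, running)

        rows.append(
            {
                "node_id": node_id,
                # Floor the merged total to whole hours.
                "busy_hours": int(busy.total_seconds() // 3600),
                "peak_parallel_jobs": peak,
            }
        )

    return pd.DataFrame(
        rows, columns=["node_id", "busy_hours", "peak_parallel_jobs"]
    )
```

Flooring happens once, on the merged total per node, rather than per job or per merged span, which matches the definition of `busy_hours` above.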
Input data
Example rows only; the live problem includes the full dataset.
| job_id | node_id | begin_at | finish_at |
|---|---|---|---|
| 1 | 501 | 2024-03-10 08:00:00 | 2024-03-10 09:00:00 |
| 2 | 501 | 2024-03-10 08:30:00 | 2024-03-10 10:30:00 |
| 3 | 501 | 2024-03-10 11:00:00 | 2024-03-10 12:00:00 |
| 7 | 501 | 2024-03-10 13:00:00 | 2024-03-10 15:30:00 |
| 4 | 502 | 2024-03-10 09:00:00 | 2024-03-10 10:00:00 |
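To make the merging rule concrete, the example rows above can be loaded and collapsed into busy blocks. This is an illustrative sketch that covers only the five rows shown, not the full dataset; the names `sample`, `prev_finish`, `block`, and `merged` are hypothetical helpers introduced here.

```python
import pandas as pd

# Only the example rows from the table above.
sample = pd.DataFrame(
    {
        "job_id": [1, 2, 3, 7, 4],
        "node_id": [501, 501, 501, 501, 502],
        "begin_at": [
            "2024-03-10 08:00:00",
            "2024-03-10 08:30:00",
            "2024-03-10 11:00:00",
            "2024-03-10 13:00:00",
            "2024-03-10 09:00:00",
        ],
        "finish_at": [
            "2024-03-10 09:00:00",
            "2024-03-10 10:30:00",
            "2024-03-10 12:00:00",
            "2024-03-10 15:30:00",
            "2024-03-10 10:00:00",
        ],
    }
)
sample["begin_at"] = pd.to_datetime(sample["begin_at"])
sample["finish_at"] = pd.to_datetime(sample["finish_at"])
sample = sample.sort_values(["node_id", "begin_at"]).reset_index(drop=True)

# Latest finish among the earlier jobs on the same node (NaT for the first job).
prev_finish = sample.groupby("node_id")["finish_at"].transform(
    lambda s: s.cummax().shift()
)
# A new busy block starts at a node's first job, or after a gap.
new_block = prev_finish.isna() | (sample["begin_at"] > prev_finish)
sample["block"] = new_block.cumsum()

# Collapse each block to a single merged interval.
merged = sample.groupby(["node_id", "block"]).agg(
    block_start=("begin_at", "min"),
    block_end=("finish_at", "max"),
)
merged["duration"] = merged["block_end"] - merged["block_start"]

# Sum the merged durations per node, then floor to whole hours.
total = merged["duration"].groupby(level="node_id").sum()
busy_hours = (total.dt.total_seconds() // 3600).astype(int)
print(busy_hours)
```

For these rows, node 501 collapses into three blocks (08:00 to 10:30, 11:00 to 12:00, and 13:00 to 15:30), which total 2.5 + 1 + 2.5 = 6 hours. Node 502 has a single one-hour block. Jobs 1 and 2 overlap between 08:30 and 09:00, so node 501 peaks at two parallel jobs within this sample.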
Expected output
For the full dataset, your answer should return 3 rows with the columns `node_id`, `busy_hours`, `peak_parallel_jobs`.
Starter code (Pandas (Python))
import pandas as pd

def render_node_utilisation(render_jobs) -> pd.DataFrame:
    # Your code here
    return render_jobs