AnalystPath

Folder Tree Depth, Descendants and Rolled-Up Size

PandasHardSenior level~10 min

Problem

You are given a DataFrame `folders` (loaded from `folders.csv`) with columns `folder_id`, `folder_name`, `parent_id`, `own_bytes`, and `owner`. `parent_id` references the `folder_id` of the containing folder; the single root folder has `parent_id` empty (NaN). `own_bytes` is the size of files stored directly in that folder, not counting subfolders.

For every folder, report:

- `depth`: the root folder is at depth 1, its direct children at depth 2, and so on.
- `descendant_count`: how many folders sit strictly below it (its entire subtree, excluding itself).
- `rolled_bytes`: the total `own_bytes` of the folder plus every folder in its subtree.

Return `folder_id`, `folder_name`, `depth`, `descendant_count`, and `rolled_bytes`, ordered by `depth` ascending, then `rolled_bytes` descending, then `folder_name` ascending.

Input data

Example rows — the live problem includes the full dataset.

folders
folder_idfolder_nameparent_idown_bytesowner
1root12000ops
2media110000design
3code110000eng
4photos27500design
5videos27500design

Expected output

Your answer should return 10 rows with the columns folder_id, folder_name, depth, descendant_count, rolled_bytes.

Starter code (Pandas (Python))

import pandas as pd

def folder_tree_stats(folders) -> pd.DataFrame:
    # Your code here
    return folders

Solve this Pandas question free

Write Pandas (Python) and run it instantly in your browser — even on your phone. No signup needed to try.

Solution & explanation

Create a free account to unlock the optimal solution, a step-by-step explanation, and the hidden test cases that grade your answer.

Sign up free to unlock

Related Pandas questions