AnalystPath

Busiest Subway Station

Pandas · Medium · Mid level · ~10 min

Problem

A transit network records daily train trips in a DataFrame `train_trips` (loaded from a CSV) with columns `(origin_station, destination_station, trip_total)`. The pair `(origin_station, destination_station)` is unique. Each row means `trip_total` trains ran that day from the origin to the destination.

A station's activity is the total number of trains that either started at it or ended at it. Return `station_id` for the station(s) with the highest activity. If several stations tie for the maximum, return every one of them. Row order does not matter.

Input data

Example rows — the live problem includes the full dataset.

train_trips
origin_station  destination_station  trip_total
10              20                   4
20              10                   5
20              40                   5
40              30                   1
30              20                   2
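
To make the activity definition concrete, the sketch below totals activity for the example rows only. It assumes each row reads as a two-digit origin, a two-digit destination, and a one-digit `trip_total` (for example 10, 20, 4). The names `long_form` and `activity` are illustrative, and the full dataset in the live problem may give different totals.

```python
import pandas as pd

# Example rows, read as (origin_station, destination_station, trip_total).
# This reading of the rows is an assumption.
train_trips = pd.DataFrame({
    "origin_station": [10, 20, 20, 40, 30],
    "destination_station": [20, 10, 40, 30, 20],
    "trip_total": [4, 5, 5, 1, 2],
})

# Stack origins and destinations into a single station_id column, so each
# row counts once for the station it leaves and once for the one it reaches.
long_form = train_trips.melt(
    id_vars="trip_total",
    value_vars=["origin_station", "destination_station"],
    value_name="station_id",
)

# Sum trip_total per station to get its activity.
activity = long_form.groupby("station_id")["trip_total"].sum()
print(activity)
# Totals for these rows: 10 -> 9, 20 -> 16, 30 -> 3, 40 -> 6
```

Under that reading, station 20 has the highest activity on the example rows: 10 trains depart from it and 6 arrive at it.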

Expected output

Your answer should return 1 row with the column `station_id`.

Starter code (Pandas (Python))

import pandas as pd

def busiest_subway_station(train_trips) -> pd.DataFrame:
    # Your code here
    return train_trips
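
One possible way to fill in the starter function is sketched below. It is not the site's reference solution. It assumes station ids in the two columns share a dtype and that `trip_total` is numeric; the intermediate names (`departures`, `arrivals`, `activity`, `busiest`) are illustrative.

```python
import pandas as pd

def busiest_subway_station(train_trips: pd.DataFrame) -> pd.DataFrame:
    # Trains leaving each origin and trains reaching each destination,
    # each indexed by station id.
    departures = train_trips.groupby("origin_station")["trip_total"].sum()
    arrivals = train_trips.groupby("destination_station")["trip_total"].sum()

    # Align on station id; a station that appears on only one side
    # contributes 0 from the other side.
    activity = departures.add(arrivals, fill_value=0)

    # Keep every station that reaches the maximum, so ties are all returned.
    busiest = activity[activity == activity.max()]
    return pd.DataFrame({"station_id": busiest.index.to_numpy()})
```

Comparing against `activity.max()` rather than calling `idxmax()` is a deliberate choice here: `idxmax()` returns only the first maximum, which would drop tied stations.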
