AnalystPath

Most-Mentioned Features in Reviews

Pandas · Hard · Senior level · ~10 min

Problem

You are given a DataFrame `reviews` (loaded from `reviews.csv`) with columns `reviewer_id`, `review_id`, `posted_on`, and `body`. Each `body` may mention zero or more product features. A mention is an `@` followed by one or more letters, digits, or underscores (for example `@battery` or `@screen_size`). The same feature can appear several times in one `body`, and every occurrence counts.

Considering only reviews whose `posted_on` falls in June 2025 (2025-06-01 to 2025-06-30 inclusive), return the three most-mentioned features. Output columns: `feature` (the token, including its leading `@`) and `mentions` (the total number of mentions). Sort by `mentions` descending, then by `feature` descending, and keep only the top three rows.
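The token rule can be made concrete with a small regular-expression sketch in plain Python. The pattern and the helper name `count_mentions` are illustrative assumptions, not part of the problem's starter code. An explicit ASCII character class is used here because `\w` in Python 3 also matches non-ASCII letters, which the problem statement does not mention.

```python
import re
from collections import Counter

# Assumed pattern: "@" followed by one or more letters, digits, or underscores.
FEATURE_TOKEN = re.compile(r"@[A-Za-z0-9_]+")

def count_mentions(body: str) -> Counter:
    """Count every feature mention in one review body (repeats included)."""
    return Counter(FEATURE_TOKEN.findall(body))

# Hypothetical review text, not taken from the dataset.
sample = "The @battery is great, @battery life beats the @screen_size tradeoff."
print(count_mentions(sample))
```

Note that trailing punctuation (such as the comma after the first `@battery`) is not part of the token, and a bare `@` with nothing after it does not count as a mention.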

Input data

Example rows — the live problem includes the full dataset.

reviews
reviewer_id  review_id  posted_on   body
201          1          2025-06-01  Love the @battery and the @screen
202          2          2025-06-03  The @battery lasts forever @charging
203          3          2025-06-04  Crisp @display and great @camera
204          4          2025-06-04  Solid @audio plus nice @design
205          5          2025-06-05  Best @battery I have owned @value

Expected output

Your answer should return 3 rows with the columns feature, mentions.

Starter code (Pandas (Python))

import pandas as pd

def most_mentioned_features(reviews) -> pd.DataFrame:
    # Your code here
    return reviews
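
One possible way to fill in the starter function is sketched below. It is not the site's official solution, and it rests on a few assumptions: `posted_on` can be parsed by `pd.to_datetime`, unparseable dates should be skipped, and a half-open upper bound (`< 2025-07-01`) is acceptable so that any time-of-day value on June 30 still counts. The demo frame at the end reuses only the `posted_on` and `body` values from the example rows.

```python
import pandas as pd

def most_mentioned_features(reviews: pd.DataFrame) -> pd.DataFrame:
    # Parse dates defensively; values read from CSV are usually strings.
    posted = pd.to_datetime(reviews["posted_on"], errors="coerce")
    in_june = (posted >= pd.Timestamp("2025-06-01")) & (
        posted < pd.Timestamp("2025-07-01")
    )

    # findall yields one list of tokens per review; explode turns each
    # mention into its own row, so repeated mentions are all counted.
    tokens = (
        reviews.loc[in_june, "body"]
        .fillna("")
        .astype(str)
        .str.findall(r"@[A-Za-z0-9_]+")
        .explode()
        .dropna()  # reviews with no mentions explode to NaN
    )

    counts = (
        tokens.to_frame("feature")
        .groupby("feature")
        .size()
        .reset_index(name="mentions")
    )

    # Ties on mentions are broken by feature name, descending.
    return (
        counts.sort_values(["mentions", "feature"], ascending=[False, False])
        .head(3)
        .reset_index(drop=True)
    )

# Demo on the example rows only (other columns omitted for brevity).
sample = pd.DataFrame({
    "posted_on": ["2025-06-01", "2025-06-03", "2025-06-04",
                  "2025-06-04", "2025-06-05"],
    "body": [
        "Love the @battery and the @screen",
        "The @battery lasts forever @charging",
        "Crisp @display and great @camera",
        "Solid @audio plus nice @design",
        "Best @battery I have owned @value",
    ],
})
print(most_mentioned_features(sample))
```

On these five example rows the sketch returns `@battery` with 3 mentions, followed by `@value` and `@screen` with 1 mention each. The full dataset will generally give a different answer.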
