Most-Mentioned Features in Reviews
Problem
You are given a DataFrame `reviews` (CSV `reviews.csv`) with columns `reviewer_id`, `review_id`, `posted_on`, and `body`. Each `body` may mention zero or more product features, each written as an `@` followed by letters, digits, and underscores only (for example `@battery` or `@screen_size`). The same feature can appear several times in one body, and every mention counts. Considering only reviews whose `posted_on` falls in June 2025 (2025-06-01 to 2025-06-30 inclusive), return the three most-mentioned features. Output columns: `feature` (the token including its leading `@`) and `mentions` (the total number of times it is mentioned). Order the rows by `mentions` descending, then by `feature` descending, and keep only the top three.
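The token rule above can be sketched as a regular expression. This is a minimal illustration and not the site's reference solution: the names `FEATURE_PATTERN` and `extract_features` are hypothetical, and the sketch assumes an `@` counts as the start of a feature wherever it appears, even in the middle of a word, since the problem statement does not say otherwise.

```python
import re

# "@" followed by one or more letters, digits, or underscores.
# An explicit ASCII character class is used instead of \w, which in
# Python 3 would also match non-ASCII letters.
FEATURE_PATTERN = re.compile(r"@[A-Za-z0-9_]+")


def extract_features(body: str) -> list:
    """Return every feature mention in body, keeping repeats."""
    return FEATURE_PATTERN.findall(body)


sample = "The @battery lasts forever @charging, truly great @battery!"
features = extract_features(sample)
print(features)  # ['@battery', '@charging', '@battery']
```

Because `findall` returns every non-overlapping match, a feature repeated within one body shows up once per mention, which is what the counting rule requires. Trailing punctuation such as `,` or `!` is excluded because it falls outside the character class.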
Input data
Example rows only; the live problem includes the full dataset.
| reviewer_id | review_id | posted_on | body |
|---|---|---|---|
| 201 | 1 | 2025-06-01 | Love the @battery and the @screen |
| 202 | 2 | 2025-06-03 | The @battery lasts forever @charging |
| 203 | 3 | 2025-06-04 | Crisp @display and great @camera |
| 204 | 4 | 2025-06-04 | Solid @audio plus nice @design |
| 205 | 5 | 2025-06-05 | Best @battery I have owned @value |
Expected output
Your answer should return 3 rows with the columns `feature` and `mentions`.
Starter code (Pandas (Python))
import pandas as pd

def most_mentioned_features(reviews) -> pd.DataFrame:
    # Your code here
    return reviews
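One possible way to fill in the starter function is sketched below. It is not the site's reference solution and has not been run against the hidden test cases. It assumes `posted_on` can be parsed by `pd.to_datetime` and that missing bodies should count as containing no features. The `sample` frame reuses the five example rows and adds one hypothetical July row (review 6, not part of the source data) purely to show the date filter at work.

```python
import pandas as pd


def most_mentioned_features(reviews: pd.DataFrame) -> pd.DataFrame:
    # Compare dates as timestamps rather than strings. The half-open upper
    # bound also keeps any June 30 value that carries a time of day.
    posted = pd.to_datetime(reviews["posted_on"])
    in_june = (posted >= pd.Timestamp("2025-06-01")) & (posted < pd.Timestamp("2025-07-01"))

    # One list of tokens per review, then one row per mention.
    mentions = (
        reviews.loc[in_june, "body"]
        .fillna("")                      # assumption: a missing body has no features
        .str.findall(r"@[A-Za-z0-9_]+")
        .explode()
        .dropna()                        # reviews with no mentions explode to NaN
    )

    counts = (
        mentions.value_counts()
        .rename_axis("feature")
        .reset_index(name="mentions")
    )

    # Ties on mentions are broken by feature in descending order.
    return (
        counts.sort_values(["mentions", "feature"], ascending=[False, False])
        .head(3)
        .reset_index(drop=True)
    )


sample = pd.DataFrame(
    {
        "reviewer_id": [201, 202, 203, 204, 205, 206],
        "review_id": [1, 2, 3, 4, 5, 6],
        "posted_on": [
            "2025-06-01", "2025-06-03", "2025-06-04",
            "2025-06-04", "2025-06-05", "2025-07-01",
        ],
        "body": [
            "Love the @battery and the @screen",
            "The @battery lasts forever @charging",
            "Crisp @display and great @camera",
            "Solid @audio plus nice @design",
            "Best @battery I have owned @value",
            "Late praise for the @value @value @value",  # hypothetical row, outside June
        ],
    }
)

result = most_mentioned_features(sample)
print(result)
```

On these rows only, `@battery` is mentioned three times and every other June feature once, so the descending tie-break on `feature` puts `@value` and `@screen` in second and third place. The full dataset in the live problem may produce a different top three.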