Auto-Tagging Support Tickets
Problem
You are given two DataFrames. `category_terms` has columns `category_id` and `term` (the pair is unique). `tickets` has columns `ticket_id` and `message` (`ticket_id` is unique).
Tag each ticket by category. A ticket belongs to a category if at least one of that category's terms appears in the message as a whole word, matched case-insensitively (so `'vpn'` matches `'VPN'` but not `'envpn'`). For each ticket return `ticket_id` and `category`, where `category` is the matching category ids sorted ascending and then joined by commas. If no term matches, `category` is the literal text `'Unclassified'`. Return rows ordered by `ticket_id`.
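The whole-word rule can be sketched with Python's `re` module. This is a minimal sketch, assuming "whole word" means bounded by regex word boundaries (`\b`); the helper name `contains_term` is hypothetical and not part of the problem's starter code.

```python
import re

def contains_term(message: str, term: str) -> bool:
    """Return True if `term` occurs in `message` as a whole word, ignoring case."""
    # re.escape keeps characters such as '.' or '+' inside a term literal.
    # \b matches at a boundary between a word character and a non-word character.
    pattern = r"\b" + re.escape(term) + r"\b"
    return re.search(pattern, message, flags=re.IGNORECASE) is not None

print(contains_term("Cannot reach the VPN today", "vpn"))  # True
print(contains_term("the envpn file is missing", "vpn"))   # False
```

Because `\b` is defined relative to word characters, a term that starts or ends with punctuation would need a different boundary test; the sketch assumes terms are plain words like those in the example rows.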
Input data
Example rows — the live problem includes the full dataset.
`category_terms`:

| category_id | term |
|---|---|
| 1 | refund |
| 1 | billing |
| 3 | crash |
| 2 | login |

`tickets`:

| ticket_id | message |
|---|---|
| 1 | I need a refund for my order |
| 2 | the billing and refund both wrong |
| 3 | app keeps crash and login fails |
| 4 | crashing happened then a refunded note |
Expected output
Your answer should return 4 rows with the columns `ticket_id` and `category`.
Starter code (Pandas (Python))
import pandas as pd

def auto_tag_tickets(category_terms, tickets) -> pd.DataFrame:
    # Your code here
    return tickets
Solution & explanation
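One possible approach, sketched here as an illustration and not as the site's reference solution: build one case-insensitive, whole-word regex per category, test every message against each pattern, then join the matching ids. The sketch assumes that every `message` is a string, that "whole word" means regex `\b` boundaries, and that ids are sorted by value (numerically for integer ids). The sample frames simply reproduce the example rows above.

```python
import re
import pandas as pd

def auto_tag_tickets(category_terms: pd.DataFrame, tickets: pd.DataFrame) -> pd.DataFrame:
    # Build one compiled whole-word, case-insensitive pattern per category.
    patterns = {}
    for category_id, terms in category_terms.groupby("category_id")["term"]:
        alternation = "|".join(re.escape(t) for t in terms)
        patterns[category_id] = re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)

    def tag(message: str) -> str:
        # Collect every category whose pattern matches, sorted ascending by id.
        hits = sorted(cid for cid, pat in patterns.items() if pat.search(message))
        return ",".join(str(cid) for cid in hits) if hits else "Unclassified"

    result = tickets[["ticket_id"]].copy()
    result["category"] = tickets["message"].map(tag)
    return result.sort_values("ticket_id").reset_index(drop=True)

# Example rows from the problem statement.
category_terms = pd.DataFrame({
    "category_id": [1, 1, 3, 2],
    "term": ["refund", "billing", "crash", "login"],
})
tickets = pd.DataFrame({
    "ticket_id": [1, 2, 3, 4],
    "message": [
        "I need a refund for my order",
        "the billing and refund both wrong",
        "app keeps crash and login fails",
        "crashing happened then a refunded note",
    ],
})

print(auto_tag_tickets(category_terms, tickets))
```

Under the stated rules, the example rows give `1` for tickets 1 and 2 (two terms from the same category still yield a single id), `2,3` for ticket 3, and `Unclassified` for ticket 4, since `crashing` and `refunded` are not whole-word matches for `crash` and `refund`. Grouping terms into one pattern per category keeps the number of regex searches at (tickets × categories) instead of (tickets × terms).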