Debut Harvest of Each Wine
Problem
DataFrame: `vintage` (`vintage.csv`)
```text
+--------------+------+
| Column | Type |
+--------------+------+
| vintage_id | int |
| wine_id | int |
| harvest_year | int |
| bottles | int |
| price | int |
+--------------+------+
vintage_id uniquely identifies each row.
Each row is one bottling of a wine made in a given harvest year.
```
For every wine, find its **debut** harvest year — the earliest `harvest_year` that wine was ever bottled. Return one row per bottling produced in that debut year, with columns `wine_id`, `first_year` (the debut year), `bottles`, and `price`. If two bottlings of the same wine share the debut year, return both.
Return the rows in any order.
**Example**
Input — `vintage`:
```text
vintage_id wine_id harvest_year bottles price
1 100 2008 10 50
2 100 2009 12 50
7 200 2011 15 90
```
Output:
```text
wine_id first_year bottles price
100 2008 10 50
200 2011 15 90
```
Wine `100`'s earliest year is 2008, so only that bottling is kept; wine `200` has a single year, 2011.
Input data
Example rows — the live problem includes the full dataset.
| vintage_id | wine_id | harvest_year | bottles | price |
|---|---|---|---|---|
| 1 | 100 | 2008 | 10 | 50 |
| 2 | 100 | 2009 | 12 | 50 |
| 7 | 200 | 2011 | 15 | 90 |
Expected output
Your answer should return 2 rows with the columns wine_id, first_year, bottles, price.
Starter code (Pandas (Python))
import pandas as pd
def debut_harvest(vintage: pd.DataFrame) -> pd.DataFrame:
# Your code here
return vintageSolve this Pandas question free
Write Pandas (Python) and run it instantly in your browser — even on your phone. No signup needed to try.
Solution & explanation
Create a free account to unlock the optimal solution, a step-by-step explanation, and the hidden test cases that grade your answer.
Sign up free to unlock