AnalystPath

Collaboration Reach of Each Researcher

Pandas · Hard · Senior level · ~10 min

Problem

You are given a DataFrame `coauthored` (from `coauthored.csv`) with columns `author_a` and `author_b`. Each row records that two researchers co-authored a paper together; the relationship is symmetric (if A worked with B, then B worked with A).

For each researcher, compute their collaboration reach: the number of distinct researchers they have co-authored with, expressed as a percentage of the total number of researchers who appear anywhere in the data, rounded to 2 decimal places. Return columns `author_a` (the researcher) and `reach_pct`, ordered by `author_a` ascending.

Input data

Example rows — the live problem includes the full dataset.

coauthored

author_a  author_b
2         1
1         3
4         1
1         5

Expected output

Your answer should return 5 rows with the columns `author_a` and `reach_pct`.
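To make the definition concrete, here is a small hand-check in plain Python. It reads the example rows as the pairs (2, 1), (1, 3), (4, 1), (1, 5) and assumes, purely for illustration, that those four rows are the whole dataset; the live dataset may differ. The names `pairs`, `partners`, and `reach` are illustrative and not part of the problem.

```python
from collections import defaultdict

# Assumed reading of the example rows as (author_a, author_b) pairs.
pairs = [(2, 1), (1, 3), (4, 1), (1, 5)]

# Record each collaboration from both sides, since the relationship is symmetric.
partners = defaultdict(set)
for a, b in pairs:
    partners[a].add(b)
    partners[b].add(a)

# Every researcher appears as a key, so the key count is the denominator.
total = len(partners)

# Distinct co-authors as a percentage of all researchers, rounded to 2 places.
reach = {r: round(100 * len(p) / total, 2) for r, p in sorted(partners.items())}
print(reach)  # {1: 80.0, 2: 20.0, 3: 20.0, 4: 20.0, 5: 20.0}
```

Under this reading, researcher 1 has worked with 4 of the 5 researchers (80%), and each of the others has worked only with researcher 1 (20%). Note that the denominator counts the researcher themselves, which follows the wording "total number of researchers who appear anywhere in the data".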

Starter code (Pandas (Python))

import pandas as pd

def collaboration_reach_of_each_researcher(coauthored) -> pd.DataFrame:
    # Your code here
    return coauthored
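One possible way to fill in the starter code is sketched below. This is not the site's reference solution; it is a minimal approach that assumes the columns hold no nulls and that a row pairing a researcher with themselves should not count as a collaborator. The function name `collaboration_reach_sketch` is hypothetical.

```python
import pandas as pd

def collaboration_reach_sketch(coauthored: pd.DataFrame) -> pd.DataFrame:
    forward = coauthored[["author_a", "author_b"]]
    # Swap the two columns so every pair is also seen from the other side.
    reverse = forward.rename(columns={"author_a": "author_b", "author_b": "author_a"})
    pairs = pd.concat([forward, reverse], ignore_index=True)
    # Assumption: a self-pair (A, A) does not count as a collaboration.
    pairs = pairs[pairs["author_a"] != pairs["author_b"]]

    # Everyone who appears in either column forms the denominator.
    everyone = pd.concat([coauthored["author_a"], coauthored["author_b"]]).unique()
    total = len(everyone)

    # Distinct co-authors per researcher; researchers with none get 0.
    counts = (
        pairs.groupby("author_a")["author_b"]
        .nunique()
        .reindex(sorted(everyone), fill_value=0)
        .rename_axis("author_a")
    )
    reach = counts.mul(100).div(total).round(2).rename("reach_pct")
    return reach.reset_index()
```

Mirroring the pairs before grouping is what handles the symmetry: without it, a researcher who only ever appears in `author_b` would be missing from the result. Reindexing over the sorted list of all researchers also yields the required ascending order by `author_a`.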
