GroupBy - Splitting Data into Groups
GroupBy answers questions like "What's the average salary by department?" or "What are the total sales per region?"
It follows a split-apply-combine pattern: split data into groups, apply a function, combine results.
df.groupby('department')['salary'].mean()
This groups rows by department, then calculates the mean salary for each group.
The result is a Series with department names as the index and mean salaries as values.
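To make the shape of that result concrete, here is a minimal sketch on a small made-up DataFrame. The department names and salary figures are invented for illustration only; the groupby call itself is the one shown above.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only
df = pd.DataFrame({
    "department": ["Engineering", "Engineering", "Sales", "Sales", "HR"],
    "salary": [100000, 120000, 60000, 80000, 70000],
})

# Split rows by department, apply mean() to each group's salaries,
# and combine the per-group results into a single Series
mean_salary = df.groupby("department")["salary"].mean()

# The index holds the department names (sorted by default),
# and the values are the mean salaries
print(mean_salary)
print(mean_salary["Engineering"])  # 110000.0
```

Note that the group keys come back sorted alphabetically by default, so the row order of the result does not have to match the order in which departments first appear in df.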
You can also group by multiple columns by passing a list of column names:
df.groupby(['department', 'year'])['sales'].sum()
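Grouping by two columns produces a Series whose index has two levels, one per grouping column. The sketch below uses invented sample data (the departments, years, and sales figures are assumptions for illustration) to show how to look up one group and how reset_index() turns the index levels back into ordinary columns.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only
df = pd.DataFrame({
    "department": ["Sales", "Sales", "Sales", "Marketing"],
    "year": [2023, 2023, 2024, 2024],
    "sales": [100, 150, 200, 50],
})

# One group per (department, year) combination that appears in the data
total_sales = df.groupby(["department", "year"])["sales"].sum()

# The result has a two-level index, so a single group is addressed with a tuple
print(total_sales.loc[("Sales", 2023)])  # 250

# reset_index() converts the index levels back into regular columns,
# giving a flat DataFrame with columns department, year, sales
flat = total_sales.reset_index()
print(flat)
```

Whether to keep the two-level index or flatten it is mostly a matter of what comes next: the indexed form is convenient for lookups, while the flat form is often easier to export or merge with other tables.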
GroupBy is one of pandas' most powerful features for data analysis: the same pattern works with other aggregations such as count(), min(), max(), and median().
I dedicate multiple lessons to GroupBy in The Ultimate Pandas Bootcamp.