
GroupBy - Splitting Data into Groups

GroupBy answers questions like "What's the average salary by department?" or "Total sales per region?"

It follows a split-apply-combine pattern: split data into groups, apply a function, combine results.
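To make the three steps concrete, here is a minimal sketch that performs them by hand. The DataFrame and its values are made up for illustration, and this is not how pandas implements GroupBy internally; it only mirrors the idea.

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration.
df = pd.DataFrame({
    "department": ["Sales", "Sales", "Engineering", "Engineering"],
    "salary": [50000, 60000, 80000, 100000],
})

# Split: iterating over a GroupBy object yields (group name, sub-DataFrame) pairs.
groups = {name: group for name, group in df.groupby("department")}

# Apply: compute the mean salary inside each piece.
means = {name: group["salary"].mean() for name, group in groups.items()}

# Combine: collect the per-group results into a single Series.
manual = pd.Series(means, name="salary")
manual.index.name = "department"

print(manual)
```

In practice you rarely write these steps out; a single chained GroupBy call does all three at once.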

df.groupby('department')['salary'].mean()

This groups rows by department, then calculates mean salary for each group.

The result is a Series with department names as the index and mean salaries as values.
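As a quick check of that shape, the sketch below builds a small DataFrame (the data is hypothetical) and inspects what comes back. It also shows reset_index(), one common way to turn the result back into a regular DataFrame.

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration.
df = pd.DataFrame({
    "department": ["Sales", "Sales", "Engineering", "Engineering"],
    "salary": [50000, 60000, 80000, 100000],
})

mean_salary = df.groupby("department")["salary"].mean()

# The result is a Series whose index holds the department names.
print(type(mean_salary).__name__)
print(list(mean_salary.index))

# Look up a single group's value by its index label.
print(mean_salary.loc["Sales"])

# Move the department names from the index back into a column.
as_frame = mean_salary.reset_index()
print(as_frame)
```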

You can group by multiple columns by passing a list of column names:

df.groupby(['department', 'year'])['sales'].sum()
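When you group by more than one column, the result is indexed by every grouping column at once (a MultiIndex). The sketch below uses made-up sales figures to show what that looks like and how to look up or flatten the result.

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration.
df = pd.DataFrame({
    "department": ["Sales", "Sales", "Sales", "Engineering"],
    "year": [2023, 2023, 2024, 2024],
    "sales": [100, 150, 200, 50],
})

totals = df.groupby(["department", "year"])["sales"].sum()

# One row per (department, year) combination that appears in the data.
print(totals)

# Look up one combination with a tuple of index labels.
print(totals.loc[("Sales", 2023)])

# Flatten the two index levels back into ordinary columns.
flat = totals.reset_index()
print(flat)
```

Only combinations that actually occur in the data appear in the result, so there is no row for Engineering in 2023 here.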

GroupBy is one of pandas' most powerful features for data analysis, because a single line can summarize an entire dataset by category.

I dedicate multiple lessons to GroupBy in The Ultimate Pandas Bootcamp.