Filling Missing Data

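Instead of dropping rows that contain missing values, you can fill the gaps with substitutes. Each call below returns a new Series and leaves df unchanged, so assign the result back to the column if you want to keep it.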

Fill with a constant:

df['status'].fillna('unknown')

Fill with the column mean:

df['age'].fillna(df['age'].mean())
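As a minimal sketch of the constant and mean fills on a small frame: the sample values here are made up for illustration, and only the column names come from the snippets above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the text.
df = pd.DataFrame({
    "status": ["active", None, "inactive"],
    "age": [20.0, np.nan, 40.0],
})

# fillna returns a new Series, so assign it back to keep the result.
df["status"] = df["status"].fillna("unknown")

# mean() skips NaN by default, so the fill value is (20 + 40) / 2 = 30.
df["age"] = df["age"].fillna(df["age"].mean())

print(df)
```

After this runs, neither column contains missing values.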

Fill with the previous value (forward fill):

df['price'].ffill()  # fillna(method='ffill') is deprecated in recent pandas

Fill with the next value (backward fill):

df['price'].bfill()  # fillna(method='bfill') is deprecated in recent pandas
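The two directional fills can be compared side by side. This is an illustrative sketch on invented prices. It also shows one edge case: forward fill cannot fill a gap at the very start, and backward fill cannot fill one at the very end, because there is no neighbor to copy from.

```python
import numpy as np
import pandas as pd

# Hypothetical prices with gaps at the start, middle, and end.
prices = pd.Series([np.nan, 10.0, np.nan, np.nan, 13.0, np.nan])

forward = prices.ffill()   # copy the last known value forward
backward = prices.bfill()  # copy the next known value backward

# Show the original next to both fills for comparison.
print(pd.DataFrame({"raw": prices, "ffill": forward, "bfill": backward}))
```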

The right fill strategy depends on your data. Filling with the mean is reasonable when values are missing at random, though it shrinks the column's variance. Forward fill suits time series where a missing value means "same as before."
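To see why the choice matters, here is a rough sketch that applies both strategies to the same time series. The dates and prices are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical daily prices: the value is unknown on Jan 2 and Jan 3.
dates = pd.date_range("2024-01-01", periods=5, freq="D")
price = pd.Series([100.0, np.nan, np.nan, 130.0, 130.0], index=dates)

# Mean fill: both gaps become (100 + 130 + 130) / 3 = 120,
# a value informed by prices that came later.
mean_filled = price.fillna(price.mean())

# Forward fill: both gaps repeat the last observed price, 100.
ffilled = price.ffill()

print(pd.DataFrame({"raw": price, "mean": mean_filled, "ffill": ffilled}))
```

For a series like this, forward fill keeps each filled value consistent with what was known at that point in time, while the mean mixes in later observations.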

I explain fill strategies in The Ultimate Pandas Bootcamp.