Groupby - Applying a Function

2. Applying a function
Three types of operations can be performed with the groupby function:
a. Aggregation
b. Transformation
c. Filtration
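As a rough preview of how the three operations differ, here is a minimal sketch. The small frame below is made up for illustration (it is not the weather data used later in this post), and the variable names are hypothetical.

```python
import pandas as pd

# Small made-up frame, used only to contrast the three operations.
df = pd.DataFrame({
    'Year': [2014, 2014, 2015, 2015, 2015],
    'Humidity': [3.0, 5.0, 2.0, 4.0, 6.0],
})
grouped = df.groupby('Year')

# a. Aggregation: one value per group.
totals = grouped['Humidity'].sum()

# b. Transformation: one value per original row, aligned with df's index.
centered = grouped['Humidity'].transform(lambda s: s - s.mean())

# c. Filtration: keep or drop whole groups based on a condition.
big_groups = grouped.filter(lambda g: len(g) >= 3)

print(totals)
print(centered)
print(big_groups)
```

Aggregation shrinks the result to one row per year, transformation keeps all five rows, and filtration returns only the rows of the groups that pass the test (here, the year with at least three rows).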
a. Aggregation
An aggregation function returns a single aggregated value for each group.
Example:-
import pandas as pd
import numpy as np
weather_data = {'Weather': ['Rainy', 'Stormy', 'Sunny', 'Cloudy', 'Rainy',
'Sunny', 'Cloudy', 'Rainy', 'Stormy', 'Cloudy', 'Sunny', 'Sunny'],
'State': ['CG', 'AP', 'HP', 'MP', 'HY','DH' ,'CG' ,'HP','AP' , 'MP','CG','AP'],
'Year': [2014,2015,2014,2015,2014,2015,2016,2017,2016,2014,2015,2017],
'Humidity':[3.4,2.3,3.2,4.7,5.8,8.1,3.2,3.5,7.3,1.1,1.2,2.3]}
df = pd.DataFrame(weather_data)
grouped = df.groupby('Year')
print(grouped['Humidity'].agg(np.sum))
Output:-
Year
2014    13.5
2015    16.3
2016    10.5
2017     5.8
Explanation:-
  • Import the pandas and numpy libraries to create the DataFrame and to use the groupby, agg and sum functions.
  • First, a DataFrame is created from the weather data using the pandas DataFrame function.
  • To observe the data in groups, group it by a categorical column such as 'Year'.
  • To find the total humidity for each year, pass np.sum to agg.
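The same totals can also be obtained without numpy. The sketch below, which keeps only the 'Year' and 'Humidity' columns from the example, compares the built-in sum() method with the string form agg('sum'). Some recent pandas versions may emit a warning when a NumPy function such as np.sum is passed to agg, so these spellings are a reasonable alternative; the variable names are only illustrative.

```python
import pandas as pd

# Same Year and Humidity values as in the example above.
df = pd.DataFrame({
    'Year': [2014, 2015, 2014, 2015, 2014, 2015, 2016, 2017, 2016, 2014, 2015, 2017],
    'Humidity': [3.4, 2.3, 3.2, 4.7, 5.8, 8.1, 3.2, 3.5, 7.3, 1.1, 1.2, 2.3],
})
grouped = df.groupby('Year')

# Built-in groupby method.
by_method = grouped['Humidity'].sum()

# Aggregation named by string.
by_name = grouped['Humidity'].agg('sum')

print(by_method)
print(by_method.equals(by_name))
```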
Example:-
import pandas as pd
import numpy as np
weather_data = {'Weather': ['Rainy', 'Stormy', 'Sunny', 'Cloudy', 'Rainy','Sunny', 'Cloudy', 'Rainy',
'Stormy', 'Cloudy', 'Sunny', 'Sunny'],
'State': ['CG', 'AP', 'HP', 'MP', 'HY','DH' ,'CG' ,'HP','AP' , 'MP','CG','AP'],
'Year': [2014,2015,2014,2015,2014,2015,2016,2017,2016,2014,2015,2017],
'Humidity':[3.4,2.3,3.2,4.7,5.8,8.1,3.2,3.5,7.3,1.1,1.2,2.3]}
df = pd.DataFrame(weather_data)
# numeric_only=True skips the text columns (Weather, State), which have no mean
print(df.groupby(['Year']).mean(numeric_only=True))
Output:-
      Humidity
Year
2014     3.375
2015     4.075
2016     5.250
2017     2.900
Explanation:-
  • The groupby function groups the data by year.
  • The pandas mean() function is used this time to find the mean humidity for each year.
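To see where these means come from, the sketch below recomputes them by hand as the per-year sum divided by the per-year count. For 2014, that is 13.5 / 4 = 3.375. Only the 'Year' and 'Humidity' columns are kept, and the variable names are illustrative.

```python
import pandas as pd

# Same Year and Humidity values as in the example above.
df = pd.DataFrame({
    'Year': [2014, 2015, 2014, 2015, 2014, 2015, 2016, 2017, 2016, 2014, 2015, 2017],
    'Humidity': [3.4, 2.3, 3.2, 4.7, 5.8, 8.1, 3.2, 3.5, 7.3, 1.1, 1.2, 2.3],
})
humidity_by_year = df.groupby('Year')['Humidity']

# Mean computed by hand: total humidity divided by number of readings.
manual_mean = humidity_by_year.sum() / humidity_by_year.count()

# Mean computed by pandas.
pandas_mean = humidity_by_year.mean()

print(manual_mean)
print(pandas_mean)
```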
Quiz:-
1.
import pandas as pd
import numpy as np
weather_data = {'Weather': ['Rainy', 'Stormy', 'Sunny', 'Cloudy', 'Rainy',
'Sunny', 'Cloudy', 'Rainy', 'Stormy', 'Cloudy', 'Sunny', 'Sunny'],
'State': ['CG', 'AP', 'HP', 'MP', 'HY','DH' ,'CG' ,'HP','AP' , 'MP','CG','AP'],
'Year': [2014,2015,2014,2015,2014,2015,2016,2017,2016,2014,2015,2017],
'Humidity':[3.4,2.3,3.2,4.7,5.8,8.1,3.2,3.5,7.3,1.1,1.2,2.3]}
df = pd.DataFrame(weather_data)
print(grouped.agg(np.size))
# Post your answer in the comments
2.
import pandas as pd
import numpy as np
weather_data = {'Weather': ['Rainy', 'Stormy', 'Sunny', 'Cloudy', 'Rainy',
'Sunny', 'Cloudy', 'Rainy', 'Stormy', 'Cloudy', 'Sunny', 'Sunny'],
'State': ['CG', 'AP', 'HP', 'MP', 'HY','DH' ,'CG' ,'HP','AP' , 'MP','CG','AP'],
'Year': [2014,2015,2014,2015,2014,2015,2016,2017,2016,2014,2015,2017],
'Humidity':[3.4,2.3,3.2,4.7,5.8,8.1,3.2,3.5,7.3,1.1,1.2,2.3]}
df = pd.DataFrame(weather_data)

grouped = df.groupby('Year')
print(grouped.agg(np.size))
# Post your answer in the comments
3.
import pandas as pd
import numpy as np
weather_data = {'Weather': ['Rainy', 'Stormy', 'Sunny', 'Cloudy', 'Rainy',
'Sunny', 'Cloudy', 'Rainy', 'Stormy', 'Cloudy', 'Sunny', 'Sunny'],
'State': ['CG', 'AP', 'HP', 'MP', 'HY','DH' ,'CG' ,'HP','AP' , 'MP','CG','AP'],
'Year': [2014,2015,2014,2015,2014,2015,2016,2017,2016,2014,2015,2017],
'Humidity':[3.4,2.3,3.2,4.7,5.8,8.1,3.2,3.5,7.3,1.1,1.2,2.3]}
df = pd.DataFrame(weather_data)
grouped = df.groupby('Year')
print(grouped['Year'].agg([np.sum, np.mean, np.std]))
