Groupby: Applying a Function
2. Applying a function
Three types of operations can be performed on a groupby object:
a. Aggregation
b. Transformation
c. Filtration
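As a rough illustration of how the three operations differ, here is a minimal sketch. The small dataset and the variable names (totals, centered, large_groups) are made up for this illustration and are not part of the weather data used in this post.

```python
import pandas as pd

# Small illustrative dataset (hypothetical values, not the weather data used below)
df = pd.DataFrame({
    'Year': [2014, 2014, 2015, 2015, 2015],
    'Humidity': [3.0, 5.0, 2.0, 4.0, 6.0],
})
grouped = df.groupby('Year')

# Aggregation: returns one value per group
totals = grouped['Humidity'].sum()

# Transformation: returns one value per original row (same length as df)
centered = grouped['Humidity'].transform(lambda s: s - s.mean())

# Filtration: keeps only the rows of groups that satisfy a condition
large_groups = grouped.filter(lambda g: len(g) >= 3)

print(totals)
print(centered)
print(large_groups)
```

Aggregation shrinks the data to one row per group, transformation keeps the original shape, and filtration drops whole groups.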
a. Aggregation
An aggregation function returns a single aggregated value for each group.
Example:-
import pandas as pd
import numpy as np
weather_data = {'Weather': ['Rainy', 'Stormy', 'Sunny', 'Cloudy', 'Rainy',
'Sunny', 'Cloudy', 'Rainy', 'Stormy', 'Cloudy', 'Sunny', 'Sunny'],
'State': ['CG', 'AP', 'HP', 'MP', 'HY','DH' ,'CG' ,'HP','AP' , 'MP','CG','AP'],
'Year': [2014,2015,2014,2015,2014,2015,2016,2017,2016,2014,2015,2017],
'Humidity':[3.4,2.3,3.2,4.7,5.8,8.1,3.2,3.5,7.3,1.1,1.2,2.3]}
df = pd.DataFrame(weather_data)
grouped = df.groupby('Year')
print(grouped['Humidity'].agg(np.sum))
Output:-
Year
2014 13.5
2015 16.3
2016 10.5
2017 5.8
Explanation:-
- Import the pandas and numpy libraries to create the DataFrame object and to use groupby and agg.
- First, a DataFrame is created from the weather data using the pandas DataFrame function.
- To observe the data in groups, group it by a categorical column such as 'Year'.
- To find the total humidity in each year, use np.sum.
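agg can also take a list of aggregations, which produces one column per aggregation. The sketch below uses only the Year and Humidity columns from the example above and passes the aggregations by their string names ('sum', 'min', 'max'). The variable name summary is chosen just for this illustration.

```python
import pandas as pd

# Year and Humidity columns from the weather data above
weather_data = {
    'Year': [2014, 2015, 2014, 2015, 2014, 2015, 2016, 2017, 2016, 2014, 2015, 2017],
    'Humidity': [3.4, 2.3, 3.2, 4.7, 5.8, 8.1, 3.2, 3.5, 7.3, 1.1, 1.2, 2.3],
}
df = pd.DataFrame(weather_data)
grouped = df.groupby('Year')

# A list of names applies several built-in aggregations at once;
# the result has one row per year and one column per aggregation
summary = grouped['Humidity'].agg(['sum', 'min', 'max'])
print(summary)
```

String names such as 'sum' are the aggregation spelling that recent pandas versions prefer, so this form is likely the safer habit.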
import pandas as pd
import numpy as np
weather_data = {'Weather': ['Rainy', 'Stormy', 'Sunny', 'Cloudy', 'Rainy','Sunny', 'Cloudy', 'Rainy',
'Stormy', 'Cloudy', 'Sunny', 'Sunny'],
'State': ['CG', 'AP', 'HP', 'MP', 'HY','DH' ,'CG' ,'HP','AP' , 'MP','CG','AP'],
'Year': [2014,2015,2014,2015,2014,2015,2016,2017,2016,2014,2015,2017],
'Humidity':[3.4,2.3,3.2,4.7,5.8,8.1,3.2,3.5,7.3,1.1,1.2,2.3]}
df = pd.DataFrame(weather_data)
# numeric_only=True skips the text columns (Weather, State), which have no mean
print(df.groupby(['Year']).mean(numeric_only=True))
Output:-
Year Humidity
2014 3.375
2015 4.075
2016 5.250
2017 2.900
Explanation:-
- The groupby function is used to group the data by year.
- This time, the pandas mean() function is used to find the mean humidity for each year.
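Depending on the pandas version, taking the mean of a grouped frame that still contains text columns such as Weather and State may raise an error. One way to sidestep this, sketched here under that assumption, is to select the numeric column before aggregating. The variable name mean_humidity is chosen just for this illustration.

```python
import pandas as pd

weather_data = {
    'Weather': ['Rainy', 'Stormy', 'Sunny', 'Cloudy', 'Rainy', 'Sunny',
                'Cloudy', 'Rainy', 'Stormy', 'Cloudy', 'Sunny', 'Sunny'],
    'State': ['CG', 'AP', 'HP', 'MP', 'HY', 'DH', 'CG', 'HP', 'AP', 'MP', 'CG', 'AP'],
    'Year': [2014, 2015, 2014, 2015, 2014, 2015, 2016, 2017, 2016, 2014, 2015, 2017],
    'Humidity': [3.4, 2.3, 3.2, 4.7, 5.8, 8.1, 3.2, 3.5, 7.3, 1.1, 1.2, 2.3],
}
df = pd.DataFrame(weather_data)

# Select the numeric column first, then aggregate; the result is a Series
# indexed by Year, so the text columns never reach mean()
mean_humidity = df.groupby('Year')['Humidity'].mean()
print(mean_humidity)
```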
1. import pandas as pd
import numpy as np
weather_data = {'Weather': ['Rainy', 'Stormy', 'Sunny', 'Cloudy', 'Rainy',
'Sunny', 'Cloudy', 'Rainy', 'Stormy', 'Cloudy', 'Sunny', 'Sunny'],
'State': ['CG', 'AP', 'HP', 'MP', 'HY','DH' ,'CG' ,'HP','AP' , 'MP','CG','AP'],
'Year': [2014,2015,2014,2015,2014,2015,2016,2017,2016,2014,2015,2017],
'Humidity':[3.4,2.3,3.2,4.7,5.8,8.1,3.2,3.5,7.3,1.1,1.2,2.3]}
df = pd.DataFrame(weather_data)
grouped = df.groupby('Year')
print(grouped.agg(np.size))
#Post your answer in comment
2. import pandas as pd
import numpy as np
weather_data = {'Weather': ['Rainy', 'Stormy', 'Sunny', 'Cloudy', 'Rainy',
'Sunny', 'Cloudy', 'Rainy', 'Stormy', 'Cloudy', 'Sunny', 'Sunny'],
'State': ['CG', 'AP', 'HP', 'MP', 'HY','DH' ,'CG' ,'HP','AP' , 'MP','CG','AP'],
'Year': [2014,2015,2014,2015,2014,2015,2016,2017,2016,2014,2015,2017],
'Humidity':[3.4,2.3,3.2,4.7,5.8,8.1,3.2,3.5,7.3,1.1,1.2,2.3]}
df = pd.DataFrame(weather_data)
grouped = df.groupby('Year')
print(grouped.agg(np.size))
#Post your answer in comment
3. import pandas as pd
import numpy as np
weather_data = {'Weather': ['Rainy', 'Stormy', 'Sunny', 'Cloudy', 'Rainy',
'Sunny', 'Cloudy', 'Rainy', 'Stormy', 'Cloudy', 'Sunny', 'Sunny'],
'State': ['CG', 'AP', 'HP', 'MP', 'HY','DH' ,'CG' ,'HP','AP' , 'MP','CG','AP'],
'Year': [2014,2015,2014,2015,2014,2015,2016,2017,2016,2014,2015,2017],
'Humidity':[3.4,2.3,3.2,4.7,5.8,8.1,3.2,3.5,7.3,1.1,1.2,2.3]}
df = pd.DataFrame(weather_data)
grouped = df.groupby('Year')
print(grouped['Year'].agg([np.sum, np.mean, np.std]))