Groupby - Transformations & Filtrations

b. Transformations -  This function lets you change the data elements into some other value.

Example:-
import pandas as pd
import numpy as np
weather_data = {'Weather': ['Rainy', 'Stormy', 'Sunny', 'Cloudy', 'Rainy',
'Sunny', 'Cloudy', 'Rainy', 'Stormy', 'Cloudy', 'Sunny', 'Sunny'],
'State': ['CG', 'AP', 'HP', 'MP', 'HY','DH' ,'CG' ,'HP','AP' , 'MP','CG','AP'],
'Year': [2014,2015,2014,2015,2014,2015,2016,2017,2016,2014,2015,2017],
'Humidity':[3.4,2.3,3.2,4.7,5.8,8.1,3.2,3.5,7.3,1.1,1.2,2.3]}
df = pd.DataFrame(weather_data)
gp = df.groupby('Year')
print(gp['Humidity'].transform(lambda x: x*100))
Output:-
0 340.0 1 230.0 2 320.0 3 470.0 4 580.0 5 810.0 6 320.0 7 350.0 8 730.0 9 110.0 10 120.0 11 230.0


c.Filtrations
Filtration filters out the data that you want based on some criteria. The filter() function is used to filter the data.
1. import pandas as pd
import numpy as np
weather_data = {'Weather': ['Rainy', 'Stormy', 'Sunny', 'Cloudy', 'Rainy',
'Sunny', 'Cloudy', 'Rainy', 'Stormy', 'Cloudy', 'Sunny', 'Sunny'],
'State': ['CG', 'AP', 'HP', 'MP', 'HY','DH' ,'CG' ,'HP','AP' , 'MP','CG','AP'],
'Year': [2014,2015,2014,2015,2014,2015,2016,2017,2016,2014,2015,2017],
'Humidity':[3.4,2.3,3.2,4.7,5.8,8.1,3.2,3.5,7.3,1.1,1.2,2.3]}
df = pd.DataFrame(weather_data)
print(df.groupby('State').filter(lambda x:len(x)>=3))
Output:-
   Weather State  Year  Humidity
0    Rainy    CG  2014        3.4
1   Stormy    AP  2015       2.3
6   Cloudy    CG  2016       3.2
8   Stormy    AP  2016       7.3
10   Sunny    CG  2015      1.2
11   Sunny    AP  2017       2.3
Explanation:-
The above filter function gives data about states that has more than 3 records for
weather and humidity.
Quiz:-
1. import pandas as pd
import numpy as np
weather_data = {'Weather': ['Rainy', 'Stormy', 'Sunny', 'Cloudy', 'Rainy',
'Sunny', 'Cloudy', 'Rainy', 'Stormy', 'Cloudy', 'Sunny', 'Sunny'],
'State': ['CG', 'AP', 'HP', 'MP', 'HY','DH' ,'CG' ,'HP','AP' , 'MP','CG','AP'],
'Year': [2014,2015,2014,2015,2014,2015,2016,2017,2016,2014,2015,2017],
'Humidity':[3.4,2.3,3.2,4.7,5.8,8.1,3.2,3.5,7.3,1.1,1.2,2.3]}
df = pd.DataFrame(weather_data)
gp = df.groupby('Year')
print(gp.transform(lambda x: x*10))
#Post your answer in comments
2. import numpy as np
weather_data = {'Weather': ['Rainy', 'Stormy', 'Sunny', 'Cloudy', 'Rainy',
'Sunny', 'Cloudy', 'Rainy', 'Stormy', 'Cloudy', 'Sunny', 'Sunny'],
'State': ['CG', 'AP', 'HP', 'MP', 'HY','DH' ,'CG' ,'HP','AP' , 'MP','CG','AP'],
'Year': [2014,2015,2014,2015,2014,2015,2016,2017,2016,2014,2015,2017],
'Humidity':[3.4,2.3,3.2,4.7,5.8,8.1,3.2,3.5,7.3,1.1,1.2,2.3]}
df = pd.DataFrame(weather_data)
gp = df.groupby('Year')
print(df['Humidity'].transform([np.sqrt, np.exp]))
#Post your answer in comments
3.import pandas as pd
import numpy as np
weather_data = {'Weather': ['Rainy', 'Stormy', 'Sunny', 'Cloudy', 'Rainy',
'Sunny', 'Cloudy', 'Rainy', 'Stormy', 'Cloudy', 'Sunny', 'Sunny'],
'State': ['CG', 'AP', 'HP', 'MP', 'HY','DH' ,'CG' ,'HP','AP' , 'MP','CG','AP'],
'Year': [2014,2015,2014,2015,2014,2015,2016,2017,2016,2014,2015,2017],
'Humidity':[3.4,2.3,3.2,4.7,5.8,8.1,3.2,3.5,7.3,1.1,1.2,2.3]}
df = pd.DataFrame(weather_data)
print((df.groupby('State'))['Weather'].filter(lambda x: len(x)>=3))
#Post your answer in comments
Source:- https://www.tutorialspoint.com/python_pandas/python_pandas_groupby.htm

Comments

Popular posts from this blog

Descriptive statistics - mode(), mean() and median()

Python Tokens

Python Tokens - Operators