Groupby - Transformations & Filtrations
b. Transformations - The transform() function applies a function to each group and returns a result with the same size and index as the original data.
Example:-
import pandas as pd
import numpy as np
weather_data = {'Weather': ['Rainy', 'Stormy', 'Sunny', 'Cloudy', 'Rainy',
'Sunny', 'Cloudy', 'Rainy', 'Stormy', 'Cloudy', 'Sunny', 'Sunny'],
'State': ['CG', 'AP', 'HP', 'MP', 'HY','DH' ,'CG' ,'HP','AP' , 'MP','CG','AP'],
'Year': [2014,2015,2014,2015,2014,2015,2016,2017,2016,2014,2015,2017],
'Humidity':[3.4,2.3,3.2,4.7,5.8,8.1,3.2,3.5,7.3,1.1,1.2,2.3]}
df = pd.DataFrame(weather_data)
gp = df.groupby('Year')
print(gp['Humidity'].transform(lambda x: x*100))
Output:-
0 340.0
1 230.0
2 320.0
3 470.0
4 580.0
5 810.0
6 320.0
7 350.0
8 730.0
9 110.0
10 120.0
11 230.0
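Explanation:- The above transform function multiplies every humidity value by 100 and returns a result with the same index as the original data. Because the lambda works on each value independently, grouping by year does not change the values here.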
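The grouping only becomes visible when the function passed to transform() depends on the whole group. The following is a minimal sketch, not part of the original example: it reuses the same Year and Humidity values, and the names yearly_mean and Deviation are chosen here purely for illustration.

```python
import pandas as pd

# Same Year and Humidity values as in the example above
df = pd.DataFrame({
    'Year': [2014, 2015, 2014, 2015, 2014, 2015, 2016, 2017, 2016, 2014, 2015, 2017],
    'Humidity': [3.4, 2.3, 3.2, 4.7, 5.8, 8.1, 3.2, 3.5, 7.3, 1.1, 1.2, 2.3],
})

# The mean is computed once per year, then repeated on every row of that year,
# so the result has one value per original row (not one per group).
yearly_mean = df.groupby('Year')['Humidity'].transform('mean')

# Illustrative column: how far each reading is from its own year's mean
df['Deviation'] = df['Humidity'] - yearly_mean

print(df)
```

An aggregation such as gp['Humidity'].mean() would return one row per year instead, which is the main difference between aggregating and transforming.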
c. Filtrations
Filtration keeps only the groups that satisfy a given condition and drops the rest. The filter() function is used for this; the function passed to it must return True or False for each group.
Example:-
import pandas as pd
import numpy as np
weather_data = {'Weather': ['Rainy', 'Stormy', 'Sunny', 'Cloudy', 'Rainy',
'Sunny', 'Cloudy', 'Rainy', 'Stormy', 'Cloudy', 'Sunny', 'Sunny'],
'State': ['CG', 'AP', 'HP', 'MP', 'HY','DH' ,'CG' ,'HP','AP' , 'MP','CG','AP'],
'Year': [2014,2015,2014,2015,2014,2015,2016,2017,2016,2014,2015,2017],
'Humidity':[3.4,2.3,3.2,4.7,5.8,8.1,3.2,3.5,7.3,1.1,1.2,2.3]}
df = pd.DataFrame(weather_data)
print(df.groupby('State').filter(lambda x:len(x)>=3))
Output:-
Weather State Year Humidity
0 Rainy CG 2014 3.4
1 Stormy AP 2015 2.3
6 Cloudy CG 2016 3.2
8 Stormy AP 2016 7.3
10 Sunny CG 2015 1.2
11 Sunny AP 2017 2.3
Explanation:- The above filter function returns the rows of the states that have 3 or more records (here CG and AP).
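To check which groups pass the condition, it can help to look at the group sizes directly. The sketch below is an illustration only: it reuses the State and Humidity values from the example, and the second condition (a mean humidity above 3.0) is a threshold picked arbitrarily here to show that the test can use any property of the group.

```python
import pandas as pd

# Same State and Humidity values as in the example above
df = pd.DataFrame({
    'State': ['CG', 'AP', 'HP', 'MP', 'HY', 'DH', 'CG', 'HP', 'AP', 'MP', 'CG', 'AP'],
    'Humidity': [3.4, 2.3, 3.2, 4.7, 5.8, 8.1, 3.2, 3.5, 7.3, 1.1, 1.2, 2.3],
})

# Number of rows per state: this is what len(x) >= 3 is compared against
sizes = df.groupby('State').size()
print(sizes)

# Keep every row of the states whose mean humidity is above 3.0
# (3.0 is an arbitrary threshold used only for this illustration)
humid = df.groupby('State').filter(lambda g: g['Humidity'].mean() > 3.0)
print(humid)
```

Note that filter() keeps or drops whole groups: a state is either returned with all of its rows or not at all.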
Quiz:-
1. import pandas as pd
import numpy as np
weather_data = {'Weather': ['Rainy', 'Stormy', 'Sunny', 'Cloudy', 'Rainy',
'Sunny', 'Cloudy', 'Rainy', 'Stormy', 'Cloudy', 'Sunny', 'Sunny'],
'State': ['CG', 'AP', 'HP', 'MP', 'HY','DH' ,'CG' ,'HP','AP' , 'MP','CG','AP'],
'Year': [2014,2015,2014,2015,2014,2015,2016,2017,2016,2014,2015,2017],
'Humidity':[3.4,2.3,3.2,4.7,5.8,8.1,3.2,3.5,7.3,1.1,1.2,2.3]}
df = pd.DataFrame(weather_data)
gp = df.groupby('Year')
print(gp.transform(lambda x: x*10))
#Post your answer in comments
2. import pandas as pd
import numpy as np
weather_data = {'Weather': ['Rainy', 'Stormy', 'Sunny', 'Cloudy', 'Rainy',
'Sunny', 'Cloudy', 'Rainy', 'Stormy', 'Cloudy', 'Sunny', 'Sunny'],
'State': ['CG', 'AP', 'HP', 'MP', 'HY','DH' ,'CG' ,'HP','AP' , 'MP','CG','AP'],
'Year': [2014,2015,2014,2015,2014,2015,2016,2017,2016,2014,2015,2017],
'Humidity':[3.4,2.3,3.2,4.7,5.8,8.1,3.2,3.5,7.3,1.1,1.2,2.3]}
df = pd.DataFrame(weather_data)
gp = df.groupby('Year')
print(df['Humidity'].transform([np.sqrt, np.exp]))
#Post your answer in comments
Source:- https://www.tutorialspoint.com/python_pandas/python_pandas_groupby.htm
3. import pandas as pd
import numpy as np
weather_data = {'Weather': ['Rainy', 'Stormy', 'Sunny', 'Cloudy', 'Rainy',
'Sunny', 'Cloudy', 'Rainy', 'Stormy', 'Cloudy', 'Sunny', 'Sunny'],
'State': ['CG', 'AP', 'HP', 'MP', 'HY','DH' ,'CG' ,'HP','AP' , 'MP','CG','AP'],
'Year': [2014,2015,2014,2015,2014,2015,2016,2017,2016,2014,2015,2017],
'Humidity':[3.4,2.3,3.2,4.7,5.8,8.1,3.2,3.5,7.3,1.1,1.2,2.3]}
df = pd.DataFrame(weather_data)
print((df.groupby('State'))['Weather'].filter(lambda x: len(x)>=3))
#Post your answer in comments