Descriptive statistics - mode(), mean() and median()

C. mode() - returns the value(s) that appear most often in a set of values.
Example:-
1. import pandas as pd
df2 = pd.DataFrame({2016:{'q1':500,'q2':500,'q3':47000,'q4':49000},2017:{'q1':'A','q2':'A','q3':'A','q4':'D'},2018:{'q1':54500,'q2':51000}})
df2.mode()
Output:-
     2016 2017     2018
0   500.0    A  51000.0
1     NaN  NaN  54500.0

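Explanation:- By default axis=0, so the mode is calculated down the rows (index), i.e. once for each column. In column 2018 both values appear exactly once, so both are returned as modes; the other columns have a single mode and are padded with NaN.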

2. df2.mode(axis=1)
Output:-
        0  1        2
q1    500  A  54500.0
q2    500  A  51000.0
q3  47000  A      NaN
q4  49000  D      NaN

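Explanation:- With axis=1 the mode is calculated across the columns, i.e. once for each row. Every value in a row appears only once here, so all non-NaN values of that row are returned as modes.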

3. df2.mode(numeric_only=True)
Output:-
     2016     2018
0   500.0  51000.0
1     NaN  54500.0
Explanation:-
With numeric_only=True only numeric columns are included in the mode calculation; string/text columns such as 2017 are skipped. The default value is False.
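
The way mode() picks its result can be checked by counting the values yourself. The sketch below rebuilds columns 2016 and 2018 as standalone Series (the names s2016, s2018, counts_2016, modes_2016 and modes_2018 are made up for this illustration) and compares value_counts() with mode():

```python
import pandas as pd

# Column 2016 from the example: 500 appears twice, the other values once.
s2016 = pd.Series({'q1': 500, 'q2': 500, 'q3': 47000, 'q4': 49000})
# Column 2018 from the example: two values, each appearing once.
s2018 = pd.Series({'q1': 54500, 'q2': 51000})

counts_2016 = s2016.value_counts()   # how often each value occurs
modes_2016 = s2016.mode()            # value(s) with the highest count
modes_2018 = s2018.mode()            # a tie, so both values are modes

print(counts_2016)
print(modes_2016.tolist())   # [500]
print(modes_2018.tolist())   # [51000, 54500]
```

When several values tie for the highest count, mode() returns all of them in sorted order, which is why the DataFrame output has a second row for 2018.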

D. mean() - returns the average of a set of values.
Example:-
import pandas as pd
df2 = pd.DataFrame({2016:{'q1':500,'q2':500,'q3':47000,'q4':49000},2017:{'q1':'A','q2':'A','q3':'A','q4':'D'},2018:{'q1':54500,'q2':51000},2019:{'q1':True,'q2':'False'}})
print(df2)
Output:-
     2016 2017     2018   2019
q1    500    A  54500.0   True
q2    500    A  51000.0  False
q3  47000    A      NaN    NaN
q4  49000    D      NaN    NaN
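print(df2.mean(numeric_only=True))
Output:-
2016    24250.0
2018    52750.0
Explanation:-
mean() calculates the average down the index, i.e. once for each column. Columns 2017 and 2019 hold non-numeric (object) values, so they are left out of the result. Older pandas versions dropped such columns automatically; from pandas 2.0 onwards mean() raises a TypeError on them unless numeric_only=True is passed.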

print(df2.mean(skipna=False, numeric_only=True))
Output:-
2016    24250.0
2018        NaN
Explanation:-
With skipna=False the NaN values are not skipped, so any column that contains a NaN (here 2018) gives NaN as its mean. The default is skipna=True.
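
The effect of skipna can be reproduced on a single column. This is a small sketch that rebuilds column 2018 as a Series; the variable names (s, manual_mean, mean_skip, mean_keep) are only for illustration:

```python
import pandas as pd

# Column 2018 from the example: two real values and two missing ones.
nan = float('nan')
s = pd.Series([54500, 51000, nan, nan], index=['q1', 'q2', 'q3', 'q4'])

manual_mean = s.sum() / s.count()   # sum() and count() both ignore NaN
mean_skip = s.mean()                # skipna=True is the default
mean_keep = s.mean(skipna=False)    # NaN is kept, so the result is NaN

print(manual_mean)   # 52750.0
print(mean_skip)     # 52750.0
print(mean_keep)     # nan
```

So with the default setting the mean is the sum of the non-missing values divided by how many of them there are, not by the number of rows.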
E. median() - returns the median value for the requested axis.
Example:-
import pandas as pd
df2 = pd.DataFrame({2016:{'q1':500,'q2':500,'q3':47000,'q4':49000},2017:{'q1':'A','q2':'A','q3':'A','q4':'D'},2018:{'q1':54500,'q2':51000},2019:{'q1':True,'q2':'False'}})
print(df2)
print(df2.median(numeric_only=True))  # numeric_only=True is needed from pandas 2.0 because of the text columns
Output:-
2016    23750.0
2018    52750.0
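Explanation:-
The value 23750.0 for 2016 can be worked out by hand. The sketch below assumes an even number of values, as in column 2016, where the median is the average of the two middle values after sorting; the names ordered, mid and manual_median are made up for this illustration:

```python
import pandas as pd

# Column 2016 from the example.
s2016 = pd.Series({'q1': 500, 'q2': 500, 'q3': 47000, 'q4': 49000})

ordered = sorted(s2016.tolist())    # [500, 500, 47000, 49000]
mid = len(ordered) // 2             # index of the upper middle value
# Even count: average the two middle values (500 and 47000).
manual_median = (ordered[mid - 1] + ordered[mid]) / 2

print(manual_median)     # 23750.0
print(s2016.median())    # 23750.0
```

For column 2018 there are only two non-NaN values, so its median is simply their average, 52750.0.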
Do it yourself:-
print(df2.mean(axis=1))
print(df2.mean(axis=1,skipna=False,numeric_only=True))
print(df2.median(axis=1))
print(df2.median(skipna=False))
print(df2.median(skipna=False,numeric_only=False))
#Post your answers in comments
