Posts

Showing posts from July, 2019

Descriptive Statistics - count & sum

F. count() - counts the non-NA entries in each column (axis=0, the default) or in each row (axis=1). The values None, NaT and NaN are treated as NA in pandas.

Example:-

1.
    import pandas as pd
    df2 = pd.DataFrame({2016: {'q1': 500, 'q2': 500, 'q3': 47000, 'q4': 49000},
                        2017: {'q1': 'A', 'q2': 'A', 'q3': 'A', 'q4': 'D'},
                        2018: {'q1': 54500, 'q2': 51000},
                        2019: {'q1': True, 'q2': 'False'}})
    print(df2.count())

Output:-
    2016    4
    2017    4
    2018    2
    2019    2

2.
    # same df2 as in example 1
    print(df2.count(numeric_only=True))

Output:-
    2016    4
    2018    2

G. sum() - Returns the sum of the values for
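The NA handling of count() described above can be checked with a small sketch. The DataFrame below is hypothetical data (not from the post), built so that each column holds one kind of missing value: NaN, None and NaT.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the middle row is missing in every column.
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],                                      # NaN
    "b": ["x", None, "z"],                                        # None
    "c": pd.to_datetime(["2019-07-01", None, "2019-07-03"]),      # NaT
})

col_counts = df.count()        # non-NA cells per column (axis=0, the default)
row_counts = df.count(axis=1)  # non-NA cells per row

print(col_counts)
print(row_counts)
```

Each column should report 2 non-NA entries, and the middle row should report 0, since all three kinds of missing value are skipped.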

Descriptive statistics - mode(), mean() and median()

C. mode() - returns the value that appears most often in a set of values.

Example:-

1.
    import pandas as pd
    df2 = pd.DataFrame({2016: {'q1': 500, 'q2': 500, 'q3': 47000, 'q4': 49000},
                        2017: {'q1': 'A', 'q2': 'A', 'q3': 'A', 'q4': 'D'},
                        2018: {'q1': 54500, 'q2': 51000}})
    df2.mode()

Output:-
        2016 2017     2018
    0  500.0    A  51000.0
    1    NaN  NaN  54500.0

Explanation:- Since axis=0 by default, the mode is calculated down the rows (index), i.e. for each column.

2.
    df2.mode(axis=1)

Output:-
            0  1        2
    q1    500  A  54500.0
    q2    500  A  51000.0
    q3  47000  A      NaN
    q4  49000  D      NaN

Explanation:- Since axis=1, the mode is calculated across the columns, i.e. for each row.

3.
    df2.mode(numeric_only=True)

Output:-
        2016     2018
    0  500.0  51000.0
    1    NaN  54500.0

Explanation:- With numeric_only=True, only numeric columns are included in the mode calculation; string/text columns are not considered. By
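The NaN entries in the mode() outputs above come from ties: a column can have more than one mode, and shorter columns are padded with NaN. A minimal sketch of this, reusing the numeric columns of df2 and calling mode() on each column separately:

```python
import pandas as pd

df2 = pd.DataFrame({2016: {'q1': 500, 'q2': 500, 'q3': 47000, 'q4': 49000},
                    2018: {'q1': 54500, 'q2': 51000}})

# 500 occurs twice in 2016, so it is the only mode of that column.
modes_2016 = df2[2016].mode().tolist()

# In 2018 both values occur once, so both are returned (in sorted order);
# the missing q3/q4 entries are ignored.
modes_2018 = df2[2018].mode().tolist()

print(modes_2016)
print(modes_2018)
```

Because 2018 has two modes and 2016 has one, DataFrame.mode() needs two rows and fills the second row of 2016 with NaN.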

Descriptive Statistics - min(), max()

A. min() - Finds the minimum of a given set of data; max() finds the maximum in the same way.

Example:-

    import pandas as pd
    df = pd.DataFrame({2016: {'q1': 34500, 'q2': 56000, 'q3': 47000, 'q4': 49000},
                       2017: {'q1': 44900, 'q2': 46100, 'q3': 57000, 'q4': 59000},
                       2018: {'q1': 54500, 'q2': 51000}})
    print(df)

Output:-
         2016   2017     2018
    q1  34500  44900  54500.0
    q2  56000  46100  51000.0
    q3  47000  57000      NaN
    q4  49000  59000      NaN

1.
    print(df.min())

Output:-
    2016    34500.0
    2017    44900.0
    2018    51000.0

Explanation:- min() finds the minimum down the index (rows) for each column, since axis=0 by default.

2.
    print(df.min(axis=1))

Output:-
    q1    34500.0
    q2    46100.0
    q3    47000.0
    q4    49000.0

Explanation:- min(axis=1) finds the minimum across the columns for each index (row).

import pandas as pd df2 = pd.DataFrame({2016:{'q1':34500,'q2':56000,'q3':47000,'q4':49000},2017:{'q1':'A','q2':'B
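The min() examples above have a direct counterpart in max(). The sketch below applies max() to the same quarterly DataFrame; the variable names col_max and row_max are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({2016: {'q1': 34500, 'q2': 56000, 'q3': 47000, 'q4': 49000},
                   2017: {'q1': 44900, 'q2': 46100, 'q3': 57000, 'q4': 59000},
                   2018: {'q1': 54500, 'q2': 51000}})

col_max = df.max()        # maximum of each column (axis=0, the default)
row_max = df.max(axis=1)  # maximum of each row; NaN cells are skipped

print(col_max)
print(row_max)
```

As with min(), the missing q3 and q4 values in 2018 are ignored rather than propagated, because skipna is True by default.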

Python Tokens - Operators

Operators - Tokens that trigger computation when applied to variables and other objects in an expression. The variables and expressions to which the operators are applied are called operands.

Example:- In 2+3, 2 and 3 are the operands and + is the operator.

1. Unary Operators - A unary operator operates on only one operand.

Example:-

    Operator   Name                 Example
    +          Unary plus           +10, +23E+2
    -          Unary minus          -4.3, -2.3E-1
    ~          Bitwise complement   ~3
    not        Logical negation     not 2

2. Binary Operators - A binary operator requires two operands to operate upon.

    Operator   Name             Example
    +          Addition         2+3 = 5
    -          Subtraction      2-3 = -1
    *          Multiplication   2*3 = 6
    /          Division         2/3 = 0.6666666666666666
    %          Remainder        2%3 = 2
    **         Exponent         2**3 = 8
    //         Floor division   55//2 = 27
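The operator tables above can be verified directly in the interpreter. This short sketch evaluates each example and adds one extra case (floor division of a negative number), which is not in the original table but shows that // rounds toward negative infinity.

```python
# Unary operators: each takes a single operand.
unary_results = [+10, -4.3, ~3, not 2]
print(unary_results)

# Binary operators: each takes two operands.
binary_results = [2 + 3, 2 - 3, 2 * 3, 2 / 3, 2 % 3, 2 ** 3, 55 // 2]
print(binary_results)

# Extra case (not in the table): floor division rounds down, not toward zero.
negative_floor = -55 // 2
print(negative_floor)
```

Note that ~3 evaluates to -4 (bitwise complement of an integer n is -n-1) and not 2 evaluates to False, because any non-zero number is truthy.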