how to get variance value in pyspark dataframe
# here data is a pyspark.sql DataFrame and balance is one of its columns
# get the mean value of a column
data.agg({'balance': 'mean'}).show()
# or, equivalently
# get the mean value of a column ('avg' gives the same result as 'mean')
data.agg({'balance': 'avg'}).show()
------ other related functions ------
a few of the possible function names are
['max', 'min', 'stddev', 'variance', 'count', 'skewness', 'kurtosis', 'sum']
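One caveat with the dict form, sketched here in plain Python: dict keys are unique, so a single dict can carry only one function name per column. The names spec, functions and specs are made up for illustration.

```python
# A dict literal keeps only the last value for a repeated key,
# so this asks for a single aggregate, not two.
spec = {'balance': 'mean', 'balance': 'variance'}
print(spec)  # {'balance': 'variance'}

# One possible workaround (assuming one agg call per statistic is acceptable):
# build a separate single-entry dict for each function name.
functions = ['mean', 'variance']
specs = [{'balance': name} for name in functions]
print(specs)
```

Each entry of specs could then be passed to data.agg(...) in turn.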
# get the max value of a column
data.agg({'balance': 'max'}).show()
# get the min value of a column
data.agg({'balance': 'min'}).show()
# get the standard deviation of a column
data.agg({'balance': 'stddev'}).show()
# get the variance of a column
data.agg({'balance': 'variance'}).show()
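Note: as far as I know, 'variance' in Spark is the sample variance (same as var_samp), which divides by n - 1 rather than n; var_pop is the population version. The plain-Python sketch below uses made-up balances to show the difference, so the number printed by .show() can be sanity-checked on a small sample.

```python
import statistics

# Hypothetical balances, not taken from any real dataset.
balances = [100.0, 200.0, 300.0, 400.0]

n = len(balances)
mean = sum(balances) / n
squared_dev = sum((x - mean) ** 2 for x in balances)

# Sample variance divides by n - 1; population variance divides by n.
sample_variance = squared_dev / (n - 1)
population_variance = squared_dev / n

print(sample_variance)      # should match statistics.variance(balances)
print(population_variance)  # should match statistics.pvariance(balances)
```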