eland.groupby.DataFrameGroupBy.mean

DataFrameGroupBy.mean(numeric_only: bool = True) → pd.DataFrame

Compute the mean value for each group.

Parameters
numeric_only: {True, False, None}, default True

Controls the dtype of the returned values:
- True: Returns all values as float64; NaN/NaT values are removed
- None: Returns all values as the same dtype where possible; NaN/NaT values are removed
- False: Returns all values as the same dtype where possible; NaN/NaT values are preserved

Returns
pandas.DataFrame

Mean value of each numeric column within each group.
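
As a rough illustration of what "mean value for each group" means, the sketch below groups a few in-memory rows by a key and averages the selected columns. It is plain Python and not how eland computes the result (eland works against an Elasticsearch index); the helper name `group_means` and the sample rows are hypothetical.

```python
from collections import defaultdict
from statistics import fmean


def group_means(rows, key, columns):
    """Hypothetical helper: average `columns` for each distinct value of `key`."""
    # Collect the values of every requested column, bucketed by group key.
    groups = defaultdict(lambda: defaultdict(list))
    for row in rows:
        for col in columns:
            groups[row[key]][col].append(row[col])
    # Reduce each bucket to its arithmetic mean.
    return {
        group: {col: fmean(values) for col, values in cols.items()}
        for group, cols in groups.items()
    }


# Made-up sample rows, only loosely shaped like the flights data.
rows = [
    {"DestCountry": "AE", "AvgTicketPrice": 600.0, "dayOfWeek": 2},
    {"DestCountry": "AE", "AvgTicketPrice": 610.0, "dayOfWeek": 4},
    {"DestCountry": "US", "AvgTicketPrice": 500.0, "dayOfWeek": 1},
]

means = group_means(rows, "DestCountry", ["AvgTicketPrice", "dayOfWeek"])
for country, values in means.items():
    print(country, values)
```

Each group key becomes one row of the result and each averaged column becomes one column, which mirrors the shape of the DataFrame returned by this method.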

Examples

>>> df = ed.DataFrame(
...   "localhost", "flights",
...   columns=["AvgTicketPrice", "Cancelled", "dayOfWeek", "timestamp", "DestCountry"]
... )
>>> df.groupby("DestCountry").mean(numeric_only=False) # doctest: +SKIP
             AvgTicketPrice  Cancelled  dayOfWeek                     timestamp
DestCountry
AE               605.132970   0.152174   2.695652 2018-01-21 16:58:07.891304443
AR               674.827252   0.147541   2.744262 2018-01-21 22:18:06.593442627
AT               646.650530   0.175066   2.872679 2018-01-21 15:54:42.469496094
AU               669.558832   0.129808   2.843750 2018-01-22 02:28:39.199519287
CA               648.747109   0.134534   2.951271 2018-01-22 14:40:47.165254150
...                     ...        ...        ...                           ...
RU               662.994963   0.131258   2.832206 2018-01-21 07:11:16.534506104
SE               660.612988   0.149020   2.682353 2018-01-22 07:48:23.447058838
TR               485.253247   0.100000   1.900000 2018-01-16 16:02:33.000000000
US               595.774391   0.125315   2.753900 2018-01-21 16:55:04.456970215
ZA               643.053057   0.148410   2.766784 2018-01-22 15:17:56.141342773
<BLANKLINE>
[32 rows x 4 columns]
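
The `timestamp` column in the output above holds a mean of datetime values. One way to think about such a mean, sketched here in plain Python, is to average the timestamps as seconds since the epoch and convert the result back to a datetime. This is an assumption about the general idea, not a description of eland's internals; `mean_timestamp` and the sample dates are hypothetical.

```python
from datetime import datetime, timezone


def mean_timestamp(stamps):
    """Hypothetical helper: average timezone-aware datetimes via epoch seconds."""
    epoch_seconds = [s.timestamp() for s in stamps]
    average = sum(epoch_seconds) / len(epoch_seconds)
    return datetime.fromtimestamp(average, tz=timezone.utc)


# Made-up sample values, two days apart.
stamps = [
    datetime(2018, 1, 1, tzinfo=timezone.utc),
    datetime(2018, 1, 3, tzinfo=timezone.utc),
]

midpoint = mean_timestamp(stamps)
print(midpoint.isoformat())
```

Because the average is taken over numeric epoch values, the result generally falls between the input timestamps rather than on one of them, which is why the example output shows sub-second precision.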