DataFrame

Constructor

DataFrame([es_client, es_index_pattern, …])

Two-dimensional size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns) referencing data stored in Elasticsearch indices.
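As a minimal sketch of the constructor, the helper below builds a DataFrame from the two parameters named in the signature, es_client and es_index_pattern. The helper name open_frame, the cluster URL, and the index name "flights" are assumptions for illustration only, and the code needs the eland and elasticsearch packages plus a reachable cluster.

```python
def open_frame(es_url, index_pattern):
    """Return an eland DataFrame that references the given index pattern.

    Hypothetical helper: the imports are done lazily so that defining the
    function does not require eland to be installed.
    """
    import eland as ed
    from elasticsearch import Elasticsearch

    # Build an explicit client and hand it to the DataFrame constructor.
    es = Elasticsearch(es_url)
    return ed.DataFrame(es_client=es, es_index_pattern=index_pattern)


# Example call (assumed URL and index name, requires a running cluster):
#   df = open_frame("http://localhost:9200", "flights")
#   print(df.shape)
```

The data stays in Elasticsearch; the returned object only references the index pattern.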

Attributes and underlying data

Axes

DataFrame.index

Return the eland index that references the Elasticsearch field used to index a DataFrame or Series.

DataFrame.columns

The column labels of the DataFrame.

DataFrame.dtypes

Return the pandas dtypes in the DataFrame.

DataFrame.select_dtypes(self[, include, exclude])

Return a subset of the DataFrame’s columns based on the column dtypes.

DataFrame.values

Not implemented.

DataFrame.empty

Determines if the DataFrame is empty.

DataFrame.shape

Return a tuple representing the dimensionality of the DataFrame.

Indexing, iteration

DataFrame.head(self, n)

Return the first n rows.

DataFrame.keys(self)

Return the column labels.

DataFrame.tail(self, n)

Return the last n rows.

DataFrame.get(self, key[, default])

Get item from object for given key (e.g. a DataFrame column).

DataFrame.query(self, expr)

Query the columns of a DataFrame with a boolean expression.

DataFrame.sample(self, n, frac, random_state)

Return n randomly sampled rows or the specified fraction of rows.

Function application, GroupBy & window

DataFrame.agg(self, func[, axis])

Aggregate using one or more operations over the specified axis.

DataFrame.aggregate(self, func[, axis])

Aggregate using one or more operations over the specified axis.

Computations / descriptive stats

DataFrame.count(self)

Count non-NA cells for each column.

DataFrame.describe(self)

Generate descriptive statistics that summarize the central tendency, dispersion and shape of a dataset’s distribution, excluding NaN values.

DataFrame.info(self[, verbose, buf, …])

Print a concise summary of a DataFrame.

DataFrame.max(self[, numeric_only])

Return the maximum value for each numeric column.

DataFrame.mean(self[, numeric_only])

Return the mean value for each numeric column.

DataFrame.min(self[, numeric_only])

Return the minimum value for each numeric column.

DataFrame.median(self[, numeric_only])

Return the median value for each numeric column.

DataFrame.mad(self[, numeric_only])

Return the median absolute deviation for each numeric column.

DataFrame.std(self[, numeric_only])

Return the standard deviation for each numeric column.

DataFrame.var(self[, numeric_only])

Return the variance for each numeric column.

DataFrame.sum(self[, numeric_only])

Return the sum for each numeric column.

DataFrame.nunique(self)

Return cardinality of each field.
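To make the DataFrame.mad entry concrete, the sketch below computes a median absolute deviation in plain Python. It assumes that mad corresponds to the median absolute deviation (the statistic provided by Elasticsearch's median_absolute_deviation aggregation); the function name and sample values are illustrative. Elasticsearch computes this statistic approximately, so results on real indices may differ slightly from an exact calculation like this one.

```python
from statistics import median


def median_absolute_deviation(values):
    """Return the median of the absolute deviations from the median."""
    values = list(values)
    center = median(values)
    return median(abs(v - center) for v in values)


# Illustrative sample: one large outlier barely moves the statistic.
delays = [1, 2, 3, 4, 100]
mad = median_absolute_deviation(delays)
print(mad)
```

Unlike the standard deviation, this measure is robust to outliers, which is why the two are listed as separate methods.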

Reindexing / selection / label manipulation

DataFrame.drop(self[, labels, axis, index, …])

Return new object with labels in requested axis removed.

DataFrame.filter(self, items, …)

Subset the DataFrame rows or columns according to the specified index labels.

Plotting

DataFrame.hist(data[, column, by, grid, …])

Make a histogram of the DataFrame’s columns.

Elasticsearch Functions

DataFrame.es_info(self)

A debug summary of an eland DataFrame’s internals.

DataFrame.es_query(self, query)

Applies an Elasticsearch DSL query to the current DataFrame.
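As a rough illustration of the kind of argument DataFrame.es_query takes, the sketch below builds an Elasticsearch DSL bool query as a plain Python dict. The helper name build_filter_query and the field names Carrier and FlightDelayMin are assumptions for illustration. Whether the dict must be wrapped in an outer "query" key may depend on the eland version, so check it against the installed release.

```python
import json


def build_filter_query(carrier, min_delay):
    """Build a bool/filter DSL query (hypothetical field names)."""
    return {
        "bool": {
            "filter": [
                # Exact match on a keyword field.
                {"term": {"Carrier": carrier}},
                # Numeric lower bound, inclusive.
                {"range": {"FlightDelayMin": {"gte": min_delay}}},
            ]
        }
    }


query = build_filter_query("Kibana Airlines", 60)
print(json.dumps(query, indent=2))

# With an existing eland DataFrame `df`, the dict would be applied as:
#   filtered = df.es_query(query)
```

Building the query as a dict keeps it easy to inspect and reuse before it is sent to the cluster.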

Serialization / IO / conversion

DataFrame.info(self[, verbose, buf, …])

Print a concise summary of a DataFrame.

DataFrame.to_numpy(self)

Not implemented.

DataFrame.to_csv(self[, path_or_buf, sep, …])

Write Elasticsearch data to a comma-separated values (csv) file.

DataFrame.to_html(self[, buf, columns, …])

Render Elasticsearch data as an HTML table.

DataFrame.to_string(self[, buf, columns, …])

Render a DataFrame to a console-friendly tabular output.

DataFrame.to_pandas(self, show_progress)

Utility method to convert an eland.DataFrame to a pandas.DataFrame.