eland.pandas_to_eland
Append a pandas DataFrame to an Elasticsearch index. Mainly used in testing. Modifies the Elasticsearch destination index.
Parameters
es_client: elasticsearch-py parameters or an elasticsearch-py instance
es_dest_index: Name of the Elasticsearch index to append to
es_if_exists: How to behave if the index already exists.
fail: Raise a ValueError.
replace: Delete the index before inserting new values.
append: Insert new values into the existing index. Create the index if it does not exist.
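The three modes can be sketched against an in-memory stand-in for a cluster. This is only an illustration of the semantics listed above, not eland's implementation: `write_rows` and the `indices` dict are hypothetical, the default of "fail" is an assumption, and no Elasticsearch calls are made.

```python
from typing import Dict, List


def write_rows(indices: Dict[str, List[dict]], name: str,
               rows: List[dict], if_exists: str = "fail") -> None:
    """Mimic the fail/replace/append semantics on a plain dict of lists."""
    if if_exists not in ("fail", "replace", "append"):
        raise ValueError(f"unknown if_exists mode: {if_exists!r}")
    if name in indices:
        if if_exists == "fail":
            raise ValueError(f"index {name!r} already exists")
        if if_exists == "replace":
            del indices[name]  # drop the old index before inserting
    # "append", or any mode on a missing index, falls through to here
    indices.setdefault(name, []).extend(rows)


indices: Dict[str, List[dict]] = {}
write_rows(indices, "demo", [{"A": 1}])              # index is created
write_rows(indices, "demo", [{"A": 2}], "append")    # index now holds 2 rows
write_rows(indices, "demo", [{"A": 3}], "replace")   # index now holds 1 row
```

Calling `write_rows(indices, "demo", rows)` again with the default mode raises a ValueError, because the index already exists.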
es_refresh: Refresh es_dest_index after the bulk index operation, so the new documents are readable on return
es_dropna: How to handle missing values.
True: Remove missing values (see pandas.Series.dropna)
False: Include missing values - may cause the bulk operation to fail
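As a rough picture of what removing missing values means for a single row, the sketch below drops None and NaN entries from a dict before it would be sent. It imitates the effect of pandas.Series.dropna in plain Python; `drop_missing` is a hypothetical helper and not part of eland or pandas.

```python
import math
from typing import Any, Dict


def drop_missing(row: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the row without None or NaN values."""
    def is_missing(value: Any) -> bool:
        return value is None or (isinstance(value, float) and math.isnan(value))

    return {field: value for field, value in row.items() if not is_missing(value)}


row = {"A": 3.141, "B": float("nan"), "C": None, "D": "foo"}
clean = drop_missing(row)  # only fields "A" and "D" remain
```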
es_type_overrides: Dict of field_name: es_data_type that overrides the default Elasticsearch data types
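One way to picture the override is as a per-field lookup that falls back to a default mapping. The sketch below is illustrative only: `resolve_es_types` is a hypothetical helper, and the default table is an assumption. The example on this page confirms only that object columns default to keyword.

```python
from typing import Dict, Optional

# Assumed pandas dtype -> Elasticsearch type defaults (illustrative only).
ASSUMED_DEFAULTS = {
    "float64": "double",
    "int64": "long",
    "bool": "boolean",
    "datetime64[ns]": "date",
    "object": "keyword",
}


def resolve_es_types(dtypes: Dict[str, str],
                     overrides: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Pick the override for a field if given, else the assumed default."""
    overrides = overrides or {}
    return {
        field: overrides.get(field, ASSUMED_DEFAULTS[dtype])
        for field, dtype in dtypes.items()
    }


dtypes = {"A": "float64", "C": "object", "H": "object"}
es_types = resolve_es_types(dtypes, {"H": "text"})  # 'H' becomes text, 'C' stays keyword
```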
chunksize: Number of pandas.DataFrame rows to read before each bulk index request to Elasticsearch
use_pandas_index_for_es_ids: Whether to use the pandas index as document IDs.
True: pandas.DataFrame.index values will be used to populate the Elasticsearch '_id' fields.
False: Ignore pandas.DataFrame.index when indexing into Elasticsearch
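The row batching and the '_id' handling described above can be sketched together in plain Python. This is a hedged illustration rather than eland's code: `build_actions` and `chunked` are hypothetical helpers, rows are modelled as (index label, values) pairs, and the action dicts follow the shape accepted by the elasticsearch-py bulk helpers.

```python
from itertools import islice
from typing import Any, Dict, Iterable, Iterator, List, Tuple

Row = Tuple[Any, Dict[str, Any]]  # (pandas index label, column values)


def build_actions(rows: Iterable[Row], index_name: str,
                  use_index_for_ids: bool = True) -> Iterator[Dict[str, Any]]:
    """Yield one bulk action per row, optionally carrying the label as _id."""
    for label, values in rows:
        action: Dict[str, Any] = {"_index": index_name, "_source": values}
        if use_index_for_ids:
            action["_id"] = str(label)
        yield action


def chunked(actions: Iterable[Dict[str, Any]],
            chunksize: int) -> Iterator[List[Dict[str, Any]]]:
    """Group actions into lists of at most chunksize items."""
    iterator = iter(actions)
    while True:
        batch = list(islice(iterator, chunksize))
        if not batch:
            return
        yield batch


rows = [("0", {"G": 1}), ("1", {"G": 2}), ("2", {"G": 3})]
batches = list(chunked(build_actions(rows, "pandas_to_eland"), chunksize=2))
without_ids = list(build_actions(rows, "pandas_to_eland", use_index_for_ids=False))
```

With three rows and a chunk size of 2, the sketch produces one batch of two actions and one batch of one. When IDs are not supplied, Elasticsearch generates them itself.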
Returns
eland.DataFrame referencing data in es_dest_index
See also
eland.eland_to_pandas
Create a pandas.DataFrame from an eland.DataFrame
Examples
>>> pd_df = pd.DataFrame(data={'A': 3.141,
...                            'B': 1,
...                            'C': 'foo',
...                            'D': pd.Timestamp('20190102'),
...                            'E': [1.0, 2.0, 3.0],
...                            'F': False,
...                            'G': [1, 2, 3],
...                            'H': 'Long text - to be indexed as es type text'},
...                      index=['0', '1', '2'])
>>> type(pd_df)
<class 'pandas.core.frame.DataFrame'>
>>> pd_df
       A  B  ...  G                                          H
0  3.141  1  ...  1  Long text - to be indexed as es type text
1  3.141  1  ...  2  Long text - to be indexed as es type text
2  3.141  1  ...  3  Long text - to be indexed as es type text
<BLANKLINE>
[3 rows x 8 columns]
>>> pd_df.dtypes
A           float64
B             int64
C            object
D    datetime64[ns]
E           float64
F              bool
G             int64
H            object
dtype: object
Convert the pandas.DataFrame to an eland.DataFrame. This creates an Elasticsearch index called pandas_to_eland. Any existing index with that name is overwritten (es_if_exists="replace"), and the index is refreshed so it is readable on return (es_refresh=True).
>>> ed_df = ed.pandas_to_eland(pd_df,
...                            'localhost',
...                            'pandas_to_eland',
...                            es_if_exists="replace",
...                            es_refresh=True,
...                            es_type_overrides={'H':'text'}) # index field 'H' as text not keyword
>>> type(ed_df)
<class 'eland.dataframe.DataFrame'>
>>> ed_df
       A  B  ...  G                                          H
0  3.141  1  ...  1  Long text - to be indexed as es type text
1  3.141  1  ...  2  Long text - to be indexed as es type text
2  3.141  1  ...  3  Long text - to be indexed as es type text
<BLANKLINE>
[3 rows x 8 columns]
>>> ed_df.dtypes
A           float64
B             int64
C            object
D    datetime64[ns]
E           float64
F              bool
G             int64
H            object
dtype: object