Collection of simple helper functions that abstract some specifics of the raw API.
Streaming bulk consumes actions from the iterable passed in and yields results per action. For non-streaming use cases, use bulk(), a wrapper around streaming bulk that returns summary information about the bulk operation once the entire input has been consumed and sent.
This function expects each action to be in the format returned by search(), for example:
    {
        '_index': 'index-name',
        '_type': 'document',
        '_id': 42,
        '_parent': 5,
        '_ttl': '1d',
        '_source': {
            ...
        }
    }
Alternatively, if _source is not present, it will pop all metadata fields from the doc and use the rest as the document data.
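The fallback described above can be pictured with a small sketch. This is an illustration only, not the library's implementation: the helper name split_action and the METADATA_FIELDS tuple are hypothetical, and the field list is assumed to be just the metadata keys shown in the example above.

```python
# Illustrative sketch only; the library's real implementation may differ.
# Assumption: the metadata fields are the ones shown in the example above.
METADATA_FIELDS = ('_index', '_type', '_id', '_parent', '_ttl')


def split_action(action):
    """Split an action dict into (metadata, document body)."""
    doc = dict(action)  # copy so the caller's dict is left untouched
    # Pop every metadata field that is present.
    metadata = {key: doc.pop(key) for key in METADATA_FIELDS if key in doc}
    # Prefer an explicit _source; otherwise the leftover fields are the document.
    body = doc.pop('_source', doc)
    return metadata, body


with_source = {'_index': 'index-name', '_id': 42, '_source': {'title': 'x'}}
without_source = {'_index': 'index-name', '_id': 42, 'title': 'x'}
```

Under these assumptions both dicts above split into the same metadata and the same document body.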
If you wish to perform other operations, such as delete or update, use the _op_type field in your actions (_op_type defaults to index):
    {
        '_op_type': 'delete',
        '_index': 'index-name',
        '_type': 'document',
        '_id': 42,
    }
    {
        '_op_type': 'update',
        '_index': 'index-name',
        '_type': 'document',
        '_id': 42,
        'doc': {'question': 'The life, universe and everything.'}
    }
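Because the actions are consumed from an iterable, they can be produced lazily by a generator. A minimal sketch, assuming a hypothetical helper named delete_actions and illustrative index and type names:

```python
def delete_actions(index, doc_type, ids):
    """Lazily yield one delete action per document id."""
    for doc_id in ids:
        yield {
            '_op_type': 'delete',
            '_index': index,
            '_type': doc_type,
            '_id': doc_id,
        }


# Nothing is built until the consumer iterates over the generator.
actions = delete_actions('index-name', 'document', [42, 43])
```

A generator like this can be passed as the iterable of actions, so a large batch never has to be held in memory at once.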
Helper for the bulk() API that provides a more human-friendly interface: it consumes an iterator of actions and sends them to Elasticsearch in chunks. It returns a tuple with summary information: the number of successfully executed actions and either a list of errors or, if stats_only is set to True, the number of errors.
See streaming_bulk() for more information and accepted formats.
Any additional keyword arguments will be passed to streaming_bulk() which is used to execute the operation.
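The relationship between the per-action results and the summary tuple can be sketched as follows. This is an illustration rather than the library's code: the function name summarize is hypothetical, and each result is assumed to be an (ok, info) pair.

```python
def summarize(results, stats_only=False):
    """Reduce per-action results to (success_count, errors).

    Assumption: each result is an (ok, info) pair. With stats_only=True
    the second element is an error count instead of a list of errors.
    """
    success = 0
    errors = 0 if stats_only else []
    for ok, info in results:
        if ok:
            success += 1
        elif stats_only:
            errors += 1
        else:
            errors.append(info)
    return success, errors


# Hypothetical results: two successes and one failure.
sample = [(True, {'_id': 1}), (False, {'_id': 2, 'error': 'conflict'}), (True, {'_id': 3})]
```

Counting instead of collecting keeps memory use flat when a large run produces many failures, which is the practical reason to set stats_only.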
Simple abstraction on top of the scroll() API: a simple iterator that yields all hits as returned by the underlying scroll requests.
By default scan does not return results in any pre-determined order. To have a standard order in the returned documents (either by score or explicit sort definition) when scrolling, use preserve_order=True. This may be an expensive operation and will negate the performance benefits of using scan.
Any additional keyword arguments will be passed to the initial search() call:
    scan(es,
        query={"match": {"title": "python"}},
        index="orders-*",
        doc_type="books"
    )
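The idea of flattening a series of scroll requests into one iterator can be sketched without a cluster. Everything here is an assumption made for illustration: the function iterate_hits, the fetch_next_page callable and the simplified page shape ({'scroll_id': ..., 'hits': [...]}) are hypothetical and do not mirror the real response format.

```python
def iterate_hits(first_page, fetch_next_page):
    """Yield every hit, page by page, until a page comes back empty."""
    page = first_page
    while page['hits']:
        for hit in page['hits']:
            yield hit
        # Ask for the next page using the id carried by the current one.
        page = fetch_next_page(page['scroll_id'])


# Fake pages keyed by scroll id, standing in for the scroll requests.
pages = {
    'a': {'scroll_id': 'b', 'hits': [{'_id': 3}]},
    'b': {'scroll_id': 'c', 'hits': []},
}
first = {'scroll_id': 'a', 'hits': [{'_id': 1}, {'_id': 2}]}
hits = list(iterate_hits(first, pages.__getitem__))
```

The caller sees a single stream of hits and never handles scroll ids directly.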
Reindex all documents that satisfy a given query from one index to another, potentially (if target_client is specified) on a different cluster. If you don’t specify a query, all documents are reindexed.
Note
This helper doesn’t transfer mappings, just the data.
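Since bulk actions use the same format that search() returns, one way to picture reindexing is as rewriting the _index of each hit before sending it on. The sketch below shows only that rewriting step; the function name retarget and the index names are hypothetical, and this is not necessarily how the helper is implemented.

```python
def retarget(hits, target_index):
    """Yield a copy of each hit with _index pointing at the target index."""
    for hit in hits:
        action = dict(hit)  # shallow copy; the original hit is not modified
        action['_index'] = target_index
        yield action


# Hypothetical hit in the format shown earlier.
source_hits = [{'_index': 'old-index', '_type': 'document', '_id': 42, '_source': {'title': 'x'}}]
actions = list(retarget(source_hits, 'new-index'))
```

Only document data and ids travel this way, which is consistent with the note above: mappings have to be created on the target index separately.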