Skip to content

Latest commit

 

History

History
142 lines (107 loc) · 3.75 KB

indexing-operations.asciidoc

File metadata and controls

142 lines (107 loc) · 3.75 KB

Indexing Operations

Indexing is very easy in the client. Since associative arrays can easily be converted into JSON documents, indexing documents is simply a matter of providing the correctly structured associative array and calling a method.

Single document indexing

When indexing a document, you can either provide an ID or let elasticsearch generate one for you.


Providing an ID value
$params = array();
$params['body']  = array('testField' => 'abc');

$params['index'] = 'my_index';
$params['type']  = 'my_type';
$params['id']    = 'my_id';

// Document will be indexed to my_index/my_type/my_id
$ret = $client->index($params);


Omitting an ID value
$params = array();
$params['body']  = array('testField' => 'abc');

$params['index'] = 'my_index';
$params['type']  = 'my_type';

// Document will be indexed to my_index/my_type/<autogenerated_id>
$ret = $client->index($params);


Like most of the other APIs, there are a number of other parameters that can be specified. They are specified in the parameter array just like index or type. For example, let’s set the routing and timestamp of this new document:

Additional parameters
$params = array();
$params['body']  = array('testField' => 'xyz');

$params['index']     = 'my_index';
$params['type']      = 'my_type';
$params['routing']   = 'company_xyz';
$params['timestamp'] = strtotime("-1d");

$ret = $client->index($params);


Bulk Indexing

Elasticsearch also supports bulk indexing of documents. The client provides an interface to bulk index too, but it is less user-friendly. In the future we will be adding "helper" methods that simplify this process.

The bulk API method expects a bulk body identical to the kind elasticsearch expects: JSON action/metadata pairs separated by new lines. A common bulk-creation pattern is as follows:

Bulk indexing with PHP arrays
for($i = 0; $i < 100; $i++) {
    $params['body'][] = array(
        'index' => array(
            '_id' => $i
        )
    );

    $params['body'][] = array(
        'my_field' => 'my_value',
        'second_field' => 'some more values'
    );
}

$responses = $client->bulk($params);

You can of course use any of the available bulk methods. Here is an example of using upserts:

Bulk upserting with PHP arrays
for($i = 0; $i < 100; $i++) {
    $params['body'][] = array(
        'update' => array(
            '_id' => $i
        )
    );

    $params['body'][] = array(
        'doc_as_upsert' => 'true',
        'doc' => array(
            'my_field' => 'my_value',
            'second_field' => 'some more values'
        )
    );
}

$responses = $client->bulk($params);

Bulk updating with Nowdocs

If you are specifying bulks manually or extracting them from an existing JSON file, Nowdocs are probably the best method. Otherwise, when you construct them algorithmically, take care to ensure newlines ("\n") separates all lines…​including the last!

Bulk indexing
$params = array();
$params['body']  = <<<'EOT'
{ "index" : { "_index" : "my_index", "_type" : "my_type", "_id" : "1" } }
{ "field1" : "value1" }

EOT;

$ret = $client->bulk($params);


Like the Bulk API, if you specify the index/type in the parameters, you can omit it from the bulk request itself (which often saves a lot of space and redundant data transfer):

Bulk indexing w/ explicit index/type
$params = array();
$params['body']  = <<<'EOT'
{ "index" : { "_id" : "1" } }
{ "field1" : "value1" }

EOT;

$params['index'] = 'my_index';
$params['type']  = 'my_type';

$ret = $client->bulk($params);