How can I copy Elasticsearch data to a new server?

I installed Elasticsearch on a new server. I have an index called metrics where custom metrics from my apps get pushed. What I want to do is add the metrics from my old server to this index.

I've tried to use elasticdump on the old server like so:

elasticdump --input=http://oldserver:9200/metrics --output=metrics_dump.json --type=data

but after around 1,000,000 entries have been written to the file, I get an error (metrics has around 10,000,000 entries on the old server). So I thought I would use scroll and save the entries in batches:

elasticdump --input=http://oldserver:9200/metrics --output=metrics_dump_1.json --type=data --searchBody='{ "sort": ["_doc"], "query": { "match_all": {} }, "size": 100000 }'

but this doesn't work: entries keep getting written to the file after the 100,000 mark. Also, on inspecting the first lines of the output file, I don't see a scroll_id, so I suspect the searchBody argument is ignored.

Is there any other way I can move this data? I must not lose the newer entries that are already in the metrics index on the new server.



Solution 1:[1]

In addition to the answer posted by @julodnik, one of the easiest ways to copy the data is to make the new ES node join the old/existing ES cluster. To do this, set the properties in elasticsearch.yml on the new server to match the existing cluster, in particular the master node list (discovery.zen.ping.unicast.hosts, or discovery.seed_hosts on Elasticsearch 7 and later) and the cluster name (cluster.name). The new node will then join the existing cluster and the shards will be balanced across the nodes. You can then exclude the old server from shard allocation using

PUT _cluster/settings
{
  "transient" : {
    "cluster.routing.allocation.exclude._ip" : "IP_of_old_cluster"
  }
}

and shut it down once its shards have relocated to the new node.
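Before powering off the old server it is worth confirming that nothing is still allocated to it. The following is a minimal Python sketch, assuming the cluster answers GET _cat/shards?format=json without authentication and that each returned row carries an ip field. The function names, host and IP are hypothetical and only serve as illustration.

```python
import json
from urllib.request import urlopen


def fetch_cat_shards(base_url):
    """Fetch shard allocation rows, e.g. base_url="http://your_new_es_host:9200"."""
    with urlopen(f"{base_url}/_cat/shards?format=json") as resp:
        return json.load(resp)


def shards_on_ip(cat_shards_rows, ip):
    """Return the rows of _cat/shards output that are still allocated to `ip`."""
    return [row for row in cat_shards_rows if row.get("ip") == ip]


def safe_to_shut_down(cat_shards_rows, old_ip):
    """True once no shard is reported on the old node's IP."""
    return not shards_on_ip(cat_shards_rows, old_ip)
```

For example, safe_to_shut_down(fetch_cat_shards("http://your_new_es_host:9200"), "IP_of_old_cluster") should only return True after the relocation triggered by the exclude setting has finished.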

Another option is to set up Logstash to read the metrics index from the old server and write it to the new ES. Configure logstash.conf so that it reads the entire metrics index, something like:

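## SOURCE
input {
  elasticsearch {
    hosts => ["http://your_old_es_host:9200"]
    user => "elastic"
    password => "foopass"
    index => "metrics"
    scroll => "5m"
    size => 5000
    # docinfo must be enabled, otherwise docinfo_fields has no effect
    docinfo => true
    docinfo_target => "[@metadata][doc]"
    docinfo_fields => ["_id"]
    query => '{ "query": { "match_all": {} } }'
  }
}

## TARGET
output {
  elasticsearch {
    hosts => ["http://your_new_es_host:9200"]
    index => "metrics-new"
    user => "elastic"
    password => "foopass"
    # reuse the original _id so that a re-run overwrites instead of duplicating
    document_id => "%{[@metadata][doc][_id]}"
  }
}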
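
Once Logstash has finished, the copy can be sanity-checked by comparing document counts on both sides. This is a minimal Python sketch, assuming the _count endpoint is reachable without authentication (with the user/password shown above you would need to add an Authorization header). The helper names and the injectable fetch parameter are hypothetical, added only so the logic can be exercised without a live cluster.

```python
import json
from urllib.request import urlopen


def _http_get_json(url):
    """Plain unauthenticated GET that decodes a JSON response body."""
    with urlopen(url) as resp:
        return json.load(resp)


def count_docs(base_url, index, fetch=_http_get_json):
    """Return the document count reported by GET /<index>/_count."""
    return fetch(f"{base_url}/{index}/_count")["count"]


def missing_docs(old_count, new_count):
    """Number of documents the target is still short of (never negative)."""
    return max(old_count - new_count, 0)
```

For example, missing_docs(count_docs("http://your_old_es_host:9200", "metrics"), count_docs("http://your_new_es_host:9200", "metrics-new")) should reach 0 when every document has been copied.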

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Sandeep Kanabar