Elasticsearch duplicate index function in Python fails and creates a RED index

I wrote a script that uses bulk requests to upload a large amount of data to an Elasticsearch database (a single node with no replicas, running Elasticsearch 8.2). Every time I run the script, I want it to first copy the existing data into a separate index, so that I have a backup in case something is wrong with the new data. The import (populate) function works fine, but during the backup the duplicate sometimes fails with a timeout and leaves behind an index with status RED. The script uses the elasticsearch pip package, and the duplicate function is the following:

    def duplicate_index(self, index_name: str):
        # True if the source index was already read-only before this call
        was_readonly = self.get_readonly_info()

        # if it wasn't readonly put it in readonly mode
        if not was_readonly:
            self.readonly_index()

        try:
            self.connection.indices.clone(index=self.config.INDEX_NAME, target=index_name)
        except Exception:
            logging.error("==> UNABLE TO DUPLICATE THE INDEX, AN INDEX WITH THE SAME NAME MIGHT EXIST <==")
            raise
        finally:
            # if it was not readonly before, remove the readonly again
            if not was_readonly:
                self.readonly_index(False)

The get_readonly_info function is the following:

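    def get_readonly_info(self):
        settings = self.connection.indices.get_settings(index=self.config.INDEX_NAME)
        index_settings = settings[self.config.INDEX_NAME]['settings']['index']
        # 'blocks' is absent if no block setting is stored on the index, so fall back to "false"
        readonly = index_settings.get('blocks', {}).get('read_only', 'false')
        return readonly == "true"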

and the readonly_index function is the following:

    def readonly_index(self, readonly: bool = True):
        if readonly:
            logging.warning("==> SETTING THE INDEX TO READONLY <==")
        else:
            logging.warning("==> ENABLING POSSIBILITY TO WRITE ON THE INDEX <==")
        read_only_setting = {"index.blocks.read_only": readonly}
        self.connection.indices.put_settings(settings=read_only_setting, index=self.config.INDEX_NAME)

Most of the time, after this error the whole index is corrupted and I have to reset Elasticsearch completely, because it no longer lets me create new indices.

The biggest problem is that the duplicate sometimes also fails from the Kibana console, using the standard clone API. I can't understand why this error occurs only sometimes and not every time.



Solution 1:[1]

I'd suggest a different approach here: use an alias, e.g. indexname.

That way the indices that hold the data live under separate names, e.g. indexname-$date. When you want to reindex, you create a new one of those, switch the indexname alias to point to the new index, and leave the old one completely untouched.
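The alias switch described above could look roughly like the sketch below. It is only an illustration under some assumptions: `client` is expected to be an `elasticsearch.Elasticsearch` instance from the 8.x Python package, and the helper names `dated_index_name`, `build_swap_actions` and `switch_alias` are made up for this example. The remove and add actions are sent in a single `update_aliases` call, so readers of the alias never see a moment where it points at nothing.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


def dated_index_name(alias: str, when: Optional[datetime] = None) -> str:
    """Build a name such as 'indexname-20240131-120000' for a new data index."""
    when = when or datetime.now(timezone.utc)
    return f"{alias}-{when:%Y%m%d-%H%M%S}"


def build_swap_actions(alias: str, new_index: str,
                       old_indices: List[str]) -> List[Dict[str, Any]]:
    """Actions that detach the alias from the old indices and attach it to the new one."""
    actions: List[Dict[str, Any]] = [
        {"remove": {"index": old, "alias": alias}} for old in old_indices
    ]
    actions.append({"add": {"index": new_index, "alias": alias}})
    return actions


def switch_alias(client: Any, alias: str, new_index: str) -> List[str]:
    """Point `alias` at `new_index` and return the indices it pointed at before.

    `client` is assumed to be an elasticsearch.Elasticsearch (8.x) instance.
    The old indices are not deleted, so they remain available as backups.
    """
    old_indices: List[str] = []
    if client.indices.exists_alias(name=alias):
        current = client.indices.get_alias(name=alias)  # mapping keyed by index name
        old_indices = [name for name in current.keys() if name != new_index]
    client.indices.update_aliases(
        actions=build_swap_actions(alias, new_index, old_indices)
    )
    return old_indices
```

A run of the import script would then create `dated_index_name("indexname")`, bulk-load into that index, and call `switch_alias(client, "indexname", new_index)` only after the load has succeeded. Note that an alias cannot share its name with an existing index, so an index that is currently called indexname would have to be replaced by a dated one before the alias can be created.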

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Mark Walkom