Is it possible to use SQLite on AWS EFS reliably?

Is it possible to use SQLite on AWS EFS safely? My reading suggests it should be doable, since EFS has supported NFSv4 since 2017, but in practice I am having no luck getting consistent behavior out of it.

Quick Points:

  1. "Just use AWS RDS": Due to issues with other AWS architecture another team has implemented, we are trying to work around resource starvation caused by the API (DynamoDB isn't an option).
  2. "This goes against SQLite's primary use case (being a locally accessed DB)": Yes, but given the circumstances it seems like the best approach.
  3. I have verified that we are mounting EFS with NFSv4 on our EC2 instance.
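
On that last point, for reference, here is a quick way to check the mount from Python rather than eyeballing `mount` output (a sketch; the parsing helper is my own, and reading `/proc/mounts` assumes a Linux host):

```python
def nfs_mounts(mounts_text):
    """Return (mountpoint, fstype) pairs for NFS mounts from /proc/mounts-style text."""
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts format: device mountpoint fstype options dump pass
        if len(fields) >= 3 and fields[2].startswith('nfs'):
            result.append((fields[1], fields[2]))
    return result


# On the EC2 instance (Linux):
# with open('/proc/mounts') as f:
#     print(nfs_mounts(f.read()))  # the EFS mount should report fstype 'nfs4'
```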

Current results are very inconsistent: I hit three exceptions, irrespective of the approach I use:

  1. "file is encrypted or is not a database"
  2. "disk I/O error (potentially related to EFS open file limits)"
  3. "database disk image is malformed" (The database actually isn't corrupted after this)

Database code:

import fcntl
import os
import time
from pathlib import Path

from peewee import CharField, Model, chunked
from playhouse.sqlite_ext import SqliteExtDatabase

SQLITE_VAR_LIMIT = 999
dgm_db_file_name = ''
db = SqliteExtDatabase(None)
lock_file = f'{os.getenv("efs_path", "tmp")}/db_lock_file.lock'


_lock_handle = None


def lock_db_file():
    # The handle must stay open while the lock is held: closing the file
    # descriptor releases an flock, so the handle is kept at module level
    # until unlock_db_file() is called.
    global _lock_handle
    _lock_handle = open(lock_file, 'w')
    limit = 900
    while limit:
        try:
            fcntl.flock(_lock_handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
            print('db locked')
            break
        except OSError as e:
            print(f'Exception: {e}')
            limit -= 1
            time.sleep(1)

    if not limit:
        _lock_handle.close()
        _lock_handle = None
        raise ValueError('Timed out after 900 seconds while waiting for database lock.')


def unlock_db_file():
    global _lock_handle
    if _lock_handle is not None:
        fcntl.flock(_lock_handle, fcntl.LOCK_UN)
        _lock_handle.close()
        _lock_handle = None
        print('db unlocked')


def initialize_db(db_file_path=dgm_db_file_name):
    print('Initializing db')
    global db
    db.init(db_file_path, pragmas={
        'journal_mode': 'wal',
        'cache_size': -1 * 64000,  # 64MB
        'foreign_keys': 1})
    print('db initialized')


class Thing(Model):
    name = CharField(primary_key=True)
    etag = CharField()
    last_modified = CharField()

    class Meta:
        database = db

    @staticmethod
    def insert_many_chunked(stuff):
        # Named to avoid shadowing peewee's Model.insert_many(), which the
        # body needs to call.
        data = [(k, v['ETag'], v['Last-Modified']) for k, v in stuff.items()]
        fields = [Thing.name, Thing.etag, Thing.last_modified]
        limit = 900
        while limit:
            try:
                with db.atomic():
                    for key_batch in chunked(data, SQLITE_VAR_LIMIT // len(fields)):
                        Thing.insert_many(key_batch, fields=fields) \
                            .on_conflict_replace().execute()
                break
            except Exception as e:
                print(f'Exception: {e}')
                print(f'Will try for {limit} more seconds.')
                limit -= 1
                time.sleep(1)

        if not limit:
            raise ValueError('Failed to execute query after 900 seconds.')

Example Call:

        print('Critical section start')
        # lock_db_file() # I have tried with a secondary lock file as well
        if not Path(self.db_file_path).exists():
            initialize_db(self.db_file_path)
            print('creating tables')
            db.create_tables([Thing], safe=True)
        else:
            initialize_db(self.db_file_path)

        Thing.insert_many_chunked(stuff)
        # db.close()
        # unlock_db_file()
        print('Critical section end')
        print(f'len after update: {len(stuff)}')

Additional peculiarities:

  • If a Lambda gets stuck catching the "database disk image is malformed" exception and a new Lambda execution is triggered, the error resolves in the other Lambda.


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
