MonetDB Crashes When Connecting. Will Not Restart. Large Logs

I recently migrated an instance of MonetDB to a machine with more memory and a larger hard drive by transferring a machine image from one location to another. The database worked briefly, but now when attempting to connect with mclient I get the following error:

$ mclient warehouse
user(monetdb):monetdb
password:
monetdbd: internal error while starting mserver 'database 'warehouse' has crashed after starting, manual intervention needed, check monetdbd's logfile (merovingian.log) for details'

Upon checking merovingian.log I can see that the database started, but crashed when mclient attempted to connect:

2022-02-09 14:11:29 MSG merovingian[4368]: database 'warehouse' has crashed after start on 2022-02-09 14:03:23, attempting restart, up min/avg/max: 2m/6w/22w, crash average: 1.00 1.00 0.97 (135-6=129)

The database then attempts to restart and appears normal until it starts reading through something referred to as the write-ahead log.

2022-02-09 14:11:29 MSG warehouse[4403]: arguments:
2022-02-09 14:11:29 MSG warehouse[4403]: /usr/bin/mserver5 --dbpath=/home/db_user/monetDBDatabase/warehouse/warehouse --set merovingian_uri=mapi:monetdb://dev:50000/warehouse --set mapi_listenaddr=none --set mapi_usock=/home/db_user/monetDBDatabase/warehouse/warehouse/.mapi.sock --set monet_vault_key=/home/db_user/monetDBDatabase/warehouse/warehouse/.vaultkey --set
2022-02-09 14:11:29 MSG warehouse[4403]: gdk_nr_threads=36 --set max_clients=64 --set sql_optimizer=default_pipe
2022-02-09 14:11:40 MSG warehouse[4403]: # MonetDB 5 server v11.39.17 (Oct2020-SP5)
2022-02-09 14:11:40 MSG warehouse[4403]: # Serving database 'warehouse', using 36 threads
2022-02-09 14:11:40 MSG warehouse[4403]: # Compiled for x86_64-pc-linux-gnu/64bit with 128bit integers
2022-02-09 14:11:40 MSG warehouse[4403]: # Found 125.609 GiB available main-memory of which we use 102.371 GiB
2022-02-09 14:11:40 MSG warehouse[4403]: # Copyright (c) 1993 - July 2008 CWI.
2022-02-09 14:11:40 MSG warehouse[4403]: # Copyright (c) August 2008 - 2021 MonetDB B.V., all rights reserved
2022-02-09 14:11:40 MSG warehouse[4403]: # Visit https://www.monetdb.org/ for further information
2022-02-09 14:11:40 MSG warehouse[4403]: # Listening for UNIX domain connection requests on mapi:monetdb:///home/db_user/monetDBDatabase/warehouse/warehouse/.mapi.sock
2022-02-09 14:11:40 MSG warehouse[4403]: # still reading write-ahead log "/home/db_user/monetDBDatabase/warehouse/warehouse/sql_logs/sql/log.95638" (22% done)
2022-02-09 14:11:51 MSG warehouse[4403]: # still reading write-ahead log "/home/db_user/monetDBDatabase/warehouse/warehouse/sql_logs/sql/log.95638" (44% done)
2022-02-09 14:12:02 MSG warehouse[4403]: # still reading write-ahead log "/home/db_user/monetDBDatabase/warehouse/warehouse/sql_logs/sql/log.95638" (67% done)

This process continues until the system appears to run out of virtual memory as seen in the log extract below:

2022-02-09 14:18:52 MSG warehouse[4403]: # still reading write-ahead log "/home/db_user/monetDBDatabase/warehouse/warehouse/sql_logs/sql/log.95650" (92% done)
2022-02-09 14:18:56 ERR warehouse[4403]: #main thread: GDKmmap: !ERROR: requested too much virtual memory; memory requested: 43317329920, memory in use: 208390400, virtual memory in use: 4354888091904
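For scale, the byte counts in that error line can be converted to binary units. The following is a minimal Python sketch; the function names `parse_gdkmmap_error` and `to_binary_units` are illustrative helpers, not part of MonetDB, and the regular expression only assumes the message layout shown in the line above.

```python
import re

# The ERR line copied verbatim from the merovingian.log extract above.
LINE = (
    "2022-02-09 14:18:56 ERR warehouse[4403]: #main thread: GDKmmap: !ERROR: "
    "requested too much virtual memory; memory requested: 43317329920, "
    "memory in use: 208390400, virtual memory in use: 4354888091904"
)


def parse_gdkmmap_error(line):
    """Return {label: byte_count} for each 'label: <integer>' pair in the message."""
    pairs = re.findall(
        r"((?:virtual )?memory (?:requested|in use)): (\d+)", line
    )
    return {label: int(value) for label, value in pairs}


def to_binary_units(n_bytes):
    """Format a byte count using 1024-based units (KiB, MiB, GiB, ...)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    value = float(n_bytes)
    for unit in units:
        if value < 1024 or unit == units[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024


if __name__ == "__main__":
    for label, n_bytes in parse_gdkmmap_error(LINE).items():
        print(f"{label}: {to_binary_units(n_bytes)}")
```

Run against the line above, this gives roughly 40 GiB requested, about 199 MiB of memory in use, and just under 4 TiB of virtual memory in use.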

I've monitored the system throughout this, and memory usage stays well within acceptable limits, peaking at about 1 GB out of 126 GB. What is strange is the size of the files being produced by MonetDB.

The original database was about 5 TB in size, but I can also see that the write-ahead log(s) found in warehouse/sql_logs/sql total 65 GB, and another file, warehouse/mdbtrace.log, takes up a further 5 TB.

If I try to remove the write-ahead log(s), the database does not start, citing that the files are missing; the same happens with mdbtrace.log (I can recreate and post the exact messages if required). Other than this, I've tried rebooting the machine. I get the impression that the large size of mdbtrace.log is using up space that is needed for virtual memory, which prevents the write-ahead log(s) from being read.

Any assistance in solving these errors so that I can start and connect to the database with mclient would be most appreciated.

Regards,

James



Solution 1:[1]

First of all, the file mdbtrace.log is not part of the database, so it can be (re)moved. What is part of the database is the write-ahead log, the files in the sql_logs directory. Those are the ones that are being processed during startup.

Solution 2:[2]

As Sjoerd pointed out above, the mdbtrace.log file can be safely removed to make room on the disk. I did this and got the following error in merovingian.log:

2022-02-09 15:57:08 ERR control[8803]: !monetdbd: an internal error has occurred 'unknown or impossible state: 4'
2022-02-09 15:57:13 ERR merovingian[8803]: client error: unknown or impossible state: 4

This caused a shutdown and was solved by rebooting the machine. From this point onwards, attempting to connect with mclient produced the same write-ahead log errors as in the original post.

On investigation I noticed a file called log in warehouse/sql_logs/sql that had the following structure:

052205

95662

Previously, the log numbers found in this directory ranged from 95638 to 95662. Modifying the second number to take a value closer to 95662 caused these write-ahead log(s) to be skipped when mclient attempted to connect to a recently started instance of monetdbd. As a result, the virtual memory was not filled, and the client could connect and query the database as normal.

NOTE THAT SOME DATA LOSS WAS OBSERVED AFTER TAKING THIS ACTION

However, considering that the alternative would have been deleting 5 TB of data and reloading everything from the original CSVs, a process of several weeks, this solution was welcome. Use at your own risk.
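Before touching the file named log, it may help to see what it currently records next to the numbered log.N files that are actually on disk. The following is a read-only Python sketch under stated assumptions: the function name `inspect_wal_dir` is hypothetical, the directory is assumed to be warehouse/sql_logs/sql as in the post, and the only thing assumed about the file format is what is shown above (two non-empty lines, the second being a log number). It changes nothing on disk.

```python
import re
from pathlib import Path


def inspect_wal_dir(wal_dir):
    """Report the contents of the 'log' file and the range of log.N files.

    Read-only: nothing in the directory is modified.
    """
    wal_dir = Path(wal_dir)

    # Non-empty lines of the file named "log"; in the post these were
    # "052205" and "95662". Their exact meaning is not documented here.
    fields = [
        line.strip()
        for line in (wal_dir / "log").read_text().splitlines()
        if line.strip()
    ]

    # Numeric suffixes of files named log.<number>, e.g. log.95638.
    numbers = sorted(
        int(match.group(1))
        for path in wal_dir.iterdir()
        if (match := re.fullmatch(r"log\.(\d+)", path.name))
    )

    return {
        "log_file_fields": fields,
        "first_numbered_log": numbers[0] if numbers else None,
        "last_numbered_log": numbers[-1] if numbers else None,
        "numbered_log_count": len(numbers),
    }


if __name__ == "__main__":
    # Path is an assumption based on the layout described in the post.
    print(inspect_wal_dir("warehouse/sql_logs/sql"))
```

Given the data loss reported above, copying the whole sql_logs directory somewhere safe before editing anything in it would be a sensible precaution.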

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Sjoerd Mullender
Solution 2 James Scott