(Solved) How to read from a MongoDB collection with pymongo while densely writing into it?

Update:

It is solved. Please check my own answer below if you are interested. Thanks to everyone all the same!

My original post:

MongoDB server version: 3.6.8 (WSL Ubuntu 20.04)

pymongo 4.1.0

I am learning machine learning. Because I find TensorBoard hard to use, I am trying to implement a simple "traceable and visible training system" ("tvts") that provides some of TensorBoard's features, using MongoDB and pymongo. I chose MongoDB because it is a document-based NoSQL database, which makes it well suited to recording arbitrary properties of model training.

Below is how I use it to record training conditions:

import tvts

# before training the model
ts = tvts.tvts(NAME, '172.26.41.157', init_params={
    'ver': VER,
    'batch_size': N_BATCH_SIZE,
    'lr': LR,
    'n_epoch': N_EPOCH,
}, save_dir=SAVE_DIR, save_freq=SAVE_FREQ)

# after an epoch is done
ts.save(epoch, {
    'cost': cost_avg,
    'acc': metrics_avg[0][0],
    'cost_val': cost_avg_val,
    'acc_val': metrics_avg_val[0][0],
}, save_path)
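
Under the hood, each ts.save() essentially boils down to a single pymongo insert. Below is a simplified sketch of what such a write amounts to (the field names and the save_epoch helper are illustrative assumptions; the database/collection tvts.train_log and the host are the ones shown in the statistics header below):

from pymongo import MongoClient

client = MongoClient('172.26.41.157', 27017)
coll = client['tvts']['train_log']

def save_epoch(train_id, epoch, metrics, save_path=None):
    # One document per epoch; arbitrary metric keys are merged in,
    # which is why a schemaless document store fits well here.
    doc = {'train_id': train_id, 'epoch': epoch, 'save_path': save_path}
    doc.update(metrics)
    coll.insert_one(doc)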

I write all such data into a collection of my MongoDB, and then I can get statistics and charts like the ones below:

Name: mnist_multi_clf_by_numpynet_mlp_softmax_ok from 172.26.41.157:27017 tvts.train_log
+----------+-----------+-----------------+-----------------+-------------+-------------------+----------------------------+----------------------------+----------------+
| train_id | parent    | cost(min:last)  | LR(b-e:max-min) | epoch_count | existed save/save | from                       | to                         | duration       |
+----------+-----------+-----------------+-----------------+-------------+-------------------+----------------------------+----------------------------+----------------+
| 1        | None-None | 1.01055:1.01055 | 0.1:0.1         | 100         | 10/10             | 2022-04-14 11:56:17.618000 | 2022-04-14 11:56:21.273000 | 0:00:03.655000 |
+----------+-----------+-----------------+-----------------+-------------+-------------------+----------------------------+----------------------------+----------------+
| 2        | 1-100     | 0.56357:0.56357 | 0.1:0.1         | 100         | 10/10             | 2022-04-14 12:00:53.170000 | 2022-04-14 12:00:56.705000 | 0:00:03.535000 |
+----------+-----------+-----------------+-----------------+-------------+-------------------+----------------------------+----------------------------+----------------+
| 3        | 2-100     | 0.15667:0.15667 | 0.1:0.1         | 300         | 15/15             | 2022-04-14 12:01:35.233000 | 2022-04-14 12:01:45.795000 | 0:00:10.562000 |
+----------+-----------+-----------------+-----------------+-------------+-------------------+----------------------------+----------------------------+----------------+
| 4        | 3-300     | 0.06820:0.06839 | 0.1:0.1         | 300         | 15/15             | 2022-04-14 18:16:08.720000 | 2022-04-14 18:16:19.606000 | 0:00:10.886000 |
+----------+-----------+-----------------+-----------------+-------------+-------------------+----------------------------+----------------------------+----------------+
| 5        | 2-100     | 0.03418:0.03418 | 0.5:0.5         | 200         | 10/10             | 2022-04-14 18:18:27.665000 | 2022-04-14 18:18:34.644000 | 0:00:06.979000 |
+----------+-----------+-----------------+-----------------+-------------+-------------------+----------------------------+----------------------------+----------------+
| 6        | None-None | 1.68796:1.68858 | 0.001:0.001     | 3000        | 30/30             | 2022-04-16 09:15:56.085000 | 2022-04-16 09:18:01.608000 | 0:02:05.523000 |
+----------+-----------+-----------------+-----------------+-------------+-------------------+----------------------------+----------------------------+----------------+

[Chart image omitted.]

I found that it gets stuck if I try to fetch the list of statistics while I am densely writing into the collection at the same time, i.e. when I try to read the statistics on the fly during training and each epoch is very short (about 0.03 seconds).

However, I found that I can still read the records with Studio 3T (a MongoDB GUI) while I am densely writing into the collection.

I googled a lot, but I still cannot solve it. Someone said the write lock is exclusive (see, for example: mongodb write is occuring then a read must wait or not wait?), but then why can Studio 3T manage it?

Actually, I am new to MongoDB; I can use it only because I have a little experience with MySQL, and this "tvts" does nothing but insertion and querying, i.e. it is a rather simple usage of MongoDB. Is there an equivalent in MongoDB of MySQL's "concurrent inserts"? (See, for example: concurrent read and write in MySQL.) I guess reading from a collection while writing into it is not a very hard task for MongoDB.
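
For reference, the reads that get stuck for me are nothing special. A minimal sketch of this kind of query, run from a second process while training keeps inserting (the field names train_id, epoch, and cost are illustrative assumptions):

from pymongo import MongoClient

client = MongoClient('172.26.41.157', 27017)
coll = client['tvts']['train_log']

# A plain query like this should not need any special "concurrent" mode:
# inserts from the training process can proceed while the cursor is iterated.
for doc in coll.find({'train_id': 1}).sort('epoch', -1).limit(10):
    print(doc.get('epoch'), doc.get('cost'))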

Although it is only a simple simulation of some of TensorBoard's features, I have already written almost 600 lines of code, so I am sorry to say that changing the database is not preferred.

Please help me. Thanks a lot!



Solution 1:[1]

Unbelievable! I accidentally solved it just a few minutes after I posted this question. It seems that a MongoDB collection can be read even under dense insertions, and I guess that is its normal behavior; I could not google an answer because it is not a real issue. My problem appears to be caused by the PyCharm IDE I am using: the script gets stuck when I run it inside PyCharm, but it works fine when I run it in a system shell window.
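
For anyone who wants to verify this behavior themselves, here is a minimal self-contained check (not part of tvts; localhost and the throwaway collection name are assumptions). It inserts densely in a background thread while reading in the main thread; when run from a plain shell window, both sides keep making progress:

import threading
import time
from pymongo import MongoClient

client = MongoClient('localhost', 27017)
coll = client['tvts']['concurrency_check']  # throwaway collection for this check
coll.drop()

stop = threading.Event()

def writer():
    epoch = 0
    while not stop.is_set():
        # Dense insertions, comparable to logging a very short epoch.
        coll.insert_one({'epoch': epoch, 'cost': 1.0 / (epoch + 1)})
        epoch += 1

t = threading.Thread(target=writer, daemon=True)
t.start()

# Read while the writer is inserting as fast as it can.
for _ in range(5):
    time.sleep(0.5)
    print('documents so far:', coll.count_documents({}))

stop.set()
t.join()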

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution source
[1] Solution 1: cmpltrtok