(Solved) How to read from a MongoDB collection with pymongo while densely writing into it?
Update:
It is solved. Please see my own answer below if you are interested. Thanks to everyone all the same!
My original post:
MongoDB server version: 3.6.8 (WSL Ubuntu 20.04)
pymongo 4.1.0
I am learning machine learning. Because I find TensorBoard hard to use, I am trying to implement a simple "traceable and visible training system" ("tvts") that offers some of TensorBoard's features, using MongoDB and pymongo. I chose MongoDB because it is a document-based NoSQL database, which makes it more suitable for recording arbitrary properties of model training.
Below is how I use it to record training conditions:
import tvts

# before training the model
ts = tvts.tvts(NAME, '172.26.41.157', init_params={
    'ver': VER,
    'batch_size': N_BATCH_SIZE,
    'lr': LR,
    'n_epoch': N_EPOCH,
}, save_dir=SAVE_DIR, save_freq=SAVE_FREQ)

# after an epoch is done
ts.save(epoch, {
    'cost': cost_avg,
    'acc': metrics_avg[0][0],
    'cost_val': cost_avg_val,
    'acc_val': metrics_avg_val[0][0],
}, save_path)
I write all such data into a collection in my MongoDB instance, and then I can get statistics and charts like the ones below:
Name: mnist_multi_clf_by_numpynet_mlp_softmax_ok from 172.26.41.157:27017 tvts.train_log
+----------+-----------+-----------------+-----------------+-------------+-------------------+----------------------------+----------------------------+----------------+
| train_id | parent | cost(min:last) | LR(b-e:max-min) | epoch_count | existed save/save | from | to | duration |
+----------+-----------+-----------------+-----------------+-------------+-------------------+----------------------------+----------------------------+----------------+
| 1 | None-None | 1.01055:1.01055 | 0.1:0.1 | 100 | 10/10 | 2022-04-14 11:56:17.618000 | 2022-04-14 11:56:21.273000 | 0:00:03.655000 |
+----------+-----------+-----------------+-----------------+-------------+-------------------+----------------------------+----------------------------+----------------+
| 2 | 1-100 | 0.56357:0.56357 | 0.1:0.1 | 100 | 10/10 | 2022-04-14 12:00:53.170000 | 2022-04-14 12:00:56.705000 | 0:00:03.535000 |
+----------+-----------+-----------------+-----------------+-------------+-------------------+----------------------------+----------------------------+----------------+
| 3 | 2-100 | 0.15667:0.15667 | 0.1:0.1 | 300 | 15/15 | 2022-04-14 12:01:35.233000 | 2022-04-14 12:01:45.795000 | 0:00:10.562000 |
+----------+-----------+-----------------+-----------------+-------------+-------------------+----------------------------+----------------------------+----------------+
| 4 | 3-300 | 0.06820:0.06839 | 0.1:0.1 | 300 | 15/15 | 2022-04-14 18:16:08.720000 | 2022-04-14 18:16:19.606000 | 0:00:10.886000 |
+----------+-----------+-----------------+-----------------+-------------+-------------------+----------------------------+----------------------------+----------------+
| 5 | 2-100 | 0.03418:0.03418 | 0.5:0.5 | 200 | 10/10 | 2022-04-14 18:18:27.665000 | 2022-04-14 18:18:34.644000 | 0:00:06.979000 |
+----------+-----------+-----------------+-----------------+-------------+-------------------+----------------------------+----------------------------+----------------+
| 6 | None-None | 1.68796:1.68858 | 0.001:0.001 | 3000 | 30/30 | 2022-04-16 09:15:56.085000 | 2022-04-16 09:18:01.608000 | 0:02:05.523000 |
+----------+-----------+-----------------+-----------------+-------------+-------------------+----------------------------+----------------------------+----------------+
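For illustration, a per-training summary like the table above could be computed on the server with an aggregation pipeline. This is only a minimal sketch, not the actual tvts code: the field names `name`, `train_id`, `epoch`, `cost` and `datetime` are assumptions about how the records might be stored, and `build_summary_pipeline` and `summarize` are hypothetical helper names.

```python
def build_summary_pipeline(name):
    """Build an aggregation pipeline summarizing each training run.

    All field names used here are assumed, not taken from tvts.
    """
    return [
        # keep only the records of one named experiment
        {"$match": {"name": name}},
        # order by epoch so that $last below means "latest epoch"
        {"$sort": {"train_id": 1, "epoch": 1}},
        {"$group": {
            "_id": "$train_id",
            "cost_min": {"$min": "$cost"},
            "cost_last": {"$last": "$cost"},
            "epoch_count": {"$sum": 1},
            "from": {"$min": "$datetime"},
            "to": {"$max": "$datetime"},
        }},
        # list the training runs in ascending train_id order
        {"$sort": {"_id": 1}},
    ]


def summarize(collection, name):
    """Run the pipeline on a pymongo Collection and return the rows."""
    return list(collection.aggregate(build_summary_pipeline(name)))
```

With a real connection, `summarize(client['tvts']['train_log'], NAME)` would return one dictionary per `train_id`, assuming the records really carry those fields.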
I found that my script gets stuck if I try to get the list of statistics while I am densely writing into the collection at the same time. That is, I try to get the statistics on the fly during training, and each training epoch is very short (about 0.03 seconds).
However, I can still read the records with Studio 3T (a MongoDB GUI) while densely writing into the collection.
I googled a lot but still cannot solve it. Someone said the write lock is exclusive (for example: "mongodb write is occuring then a read must wait or not wait?"), but then why can Studio 3T do it?
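One way to check whether the server itself blocks reads during dense inserts is a small probe that writes from one thread and times reads from another. The sketch below is only an assumption about how such a test might look: `write_epochs`, `poll_counts`, `run_probe` and `probe_mongodb` are hypothetical names, and `probe_log` is a made-up scratch collection, not part of tvts. It relies only on the pymongo calls `insert_one` and `count_documents`.

```python
import threading
import time


def write_epochs(collection, n_epochs):
    """Insert one small document per simulated epoch, as fast as possible."""
    for epoch in range(1, n_epochs + 1):
        collection.insert_one({"epoch": epoch, "cost": 1.0 / epoch})


def poll_counts(collection, writer, interval=0.01):
    """Time a read repeatedly for as long as the writer thread is alive.

    Returns a list of (seconds_taken_by_read, document_count) tuples.
    """
    samples = []
    while writer.is_alive():
        started = time.perf_counter()
        count = collection.count_documents({})
        samples.append((time.perf_counter() - started, count))
        time.sleep(interval)
    return samples


def run_probe(collection, n_epochs=1000):
    """Write in a background thread while reading in the calling thread."""
    writer = threading.Thread(target=write_epochs, args=(collection, n_epochs))
    writer.start()
    samples = poll_counts(collection, writer)
    writer.join()
    return samples, collection.count_documents({})


def probe_mongodb(host, port=27017):
    """Run the probe against a real server (needs pymongo and a running MongoDB)."""
    from pymongo import MongoClient  # imported lazily; only needed for a real run

    # fail after 5 s instead of waiting the default 30 s if the server is unreachable
    client = MongoClient(host, port, serverSelectionTimeoutMS=5000)
    collection = client["tvts"]["probe_log"]  # hypothetical scratch collection
    return run_probe(collection)
```

If the timed reads returned by `probe_mongodb('172.26.41.157')` stay short while the count keeps growing, the server is serving reads during the inserts, and the cause of the hang is more likely on the client side.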
Actually, I am new to MongoDB. I can use it because I have a little experience with MySQL, and "tvts" only performs insertions and queries, i.e. it is a rather simple use of MongoDB. Is there an equivalent of MySQL's "concurrent inserts" concept? (for example: "concurrent read and write in MySQL") I would guess that reading from a collection while writing into it is not a hard task for MongoDB.
Although it is only a simple imitation of some of TensorBoard's features, I have already written almost 600 lines of code, so I am sorry to say that changing the database is not preferred.
Please help me. Thanks a lot!
Solution 1:[1]
Unbelievable! I accidentally solved it just a few minutes after posting this question. It seems that a MongoDB collection can be read even while there are dense insertions, and I guess that is its normal behavior. I guess I could not find an answer by googling because it is not a real issue. My problem appears to be caused by the IDE I am using, PyCharm: I have the issue when I run my script inside PyCharm, but it works fine when I run it in a system shell window.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | cmpltrtok |

