Unknown data transfer from postgres server, possible sources?

We have two AWS EC2 servers: one running a Python script that sends data to another server running a postgres database. They appear to be working fine. However, we recently noticed that the postgres server, as well as receiving data, was also sending almost 15 times as much data back out (edit: 435M bytes out vs. 29M bytes in, from EC2 instance monitoring on the EC2 console; a similar relationship is also observed in packets in vs. packets out. AWS doesn't specify what time period that's over, but judging from the update period it would be 5 minutes). Looking at the monitoring for the Python server, it appears to be the destination. (The security groups on the instances should prevent the traffic going to/coming from anywhere else.)
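For reference, those figures come straight from the console, but a boto3 sketch along these lines (instance ID and region are placeholders) should pull the same datapoints, which basic monitoring reports at 5-minute periods:

    import datetime
    import boto3

    cw = boto3.client("cloudwatch", region_name="eu-west-1")  # placeholder region

    end = datetime.datetime.utcnow()
    start = end - datetime.timedelta(hours=1)

    resp = cw.get_metric_statistics(
        Namespace="AWS/EC2",
        MetricName="NetworkOut",  # swap for "NetworkIn" to compare the two
        Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],  # placeholder
        StartTime=start,
        EndTime=end,
        Period=300,  # basic monitoring datapoints are 5 minutes apart
        Statistics=["Sum"],
    )

    for point in sorted(resp["Datapoints"], key=lambda p: p["Timestamp"]):
        print(point["Timestamp"], point["Sum"], "bytes")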

However, looking through the postgres logs I can't work out what data is being sent back. The primary operations being conducted were inserts of real-time time-series data for later analysis; few if any queries that should return data were being issued, and their results would be orders of magnitude smaller than the data being sent by the Python server. As far as I'm aware, the only data that should come back in response to the inserts, and a following call to a server function, would be acknowledgements that those have executed. RETURNING is not used on the insert. The 'bytes out' follows the 'bytes in' exactly, just scaled by the quoted factor, so it must be directly related to the data going in. I've tried looking at the monitoring dashboard of the postgres server itself in pgAdmin, but this didn't reveal much beyond the fact that tuples were being read out.
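One thing I can try next, to see exactly what comes back at the SQLAlchemy level, is turning on result logging; a minimal sketch (connection URL is a placeholder):

    from sqlalchemy import create_engine

    # echo="debug" logs every statement *and* the rows returned, so it
    # should show whether the server sends back anything more than plain
    # command-complete acknowledgements. URL is a placeholder.
    engine = create_engine(
        "postgresql+psycopg2://user:pass@db-host/dbname",
        echo="debug",
    )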

Where might this data be coming from? We use SQLAlchemy with psycopg2; could the management of the connection pool be causing this?
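If pooling were the culprit, disabling it should change the traffic pattern; a minimal sketch to test that (URL again a placeholder):

    from sqlalchemy import create_engine
    from sqlalchemy.pool import NullPool

    # NullPool opens a fresh connection per checkout and closes it on
    # release, so if the extra outbound traffic disappears with this,
    # pool management is implicated. URL is a placeholder.
    engine = create_engine(
        "postgresql+psycopg2://user:pass@db-host/dbname",
        poolclass=NullPool,
    )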

I am quite clueless as to what could be causing this, so I don't know where to look or what information I need to provide; any suggestions of what to check would be more than welcome. I am more than happy to edit the question and provide any information I can, though I may need to sanitize it to protect IP etc.

The insert, in case it is the syntax that is the cause, is conducted as:

            #Parse data into string for execution
            Insert_Statement = "INSERT INTO data_table(time_stamp, data1, data2, data3... etc, source_id) VALUES"
            
            for inc in range(0, DatabaseBuffer.buffer_size - 2):
                Insert_Statement = Insert_Statement + "( {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, '{}', '{}'),".format(timestamp[inc], data[1], data[2], data[3], data[4]... etc)

            #Final data added separately to append the terminating semicolon
            Insert_Statement = Insert_Statement + "( {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, '{}', '{}');".format(timestamp[inc], data[1], data[2], data[3], data[4]... etc)

            #text is imported at module level: from sqlalchemy import text
            Insert_Statement = text(Insert_Statement)

            #Execute

            with self.Engine.connect() as conn:
                logger.debug('connecting to database and issuing insert')
                conn.execute(Insert_Statement)
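
For comparison, a parameterized equivalent of the above (a sketch only; the column names, key names, and `source_id` variable are placeholders for the real schema) would hand the rows to the driver as an executemany rather than one long string:

    from sqlalchemy import text

    insert_stmt = text(
        "INSERT INTO data_table (time_stamp, data1, data2, source_id) "
        "VALUES (:time_stamp, :data1, :data2, :source_id)"
    )

    # One dict per row; passing a list makes execute() use executemany.
    rows = [
        {"time_stamp": timestamp[inc], "data1": data[1], "data2": data[2],
         "source_id": source_id}
        for inc in range(DatabaseBuffer.buffer_size - 1)
    ]

    with self.Engine.connect() as conn:
        conn.execute(insert_stmt, rows)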

EDIT 1: Added explicit source for where I found this data relationship per request.

EDIT 2: Improved clarity of where the edit occurred.


