'Why is the data in my fits file increased when I create a new fits file using the same data?

I have a FITS file whose primary HDU data array is 3000x3000. For context, each pixel measures the flux of a star recorded by a CCD in ADUs (analog-to-digital units), so it can have a value up to 2^16.

The mean value of the array is about 420, and after calibrating it it's 17. After plate solving it, I update the header of the file and overwrite the original data with the calibrated one, saving it to a directory in Jupyter. When I reload the file and print its mean, it is exactly 17 + 2^15, instead of just 17.

from astropy.io import fits
import os
import numpy as np


with fits.open('example.fits') as hdu:
    image_header = hdu[0].header
    image_data = hdu[0].data

# Calibrate and plate solve original data.
# Code is quite lengthy so I omit, I don't think this is the root cause
# because it comes before creating the new file.
calibrated_data = calibrate(image_data)
wcs_header = plate_solve(image_data)

print(np.mean(calibrated_data))
>>>17.45463174982062

image_header.update(wcs_header)  # add in the solved WCS info

# Create a new fits object and add the header and data
new_filename = os.path.join('my_directory', 'example.fits')
hdu = fits.PrimaryHDU()          # create a FITS HDU object
hdu.header.update(image_header)  # add in the header including the plate solved info
hdu.data = calibrated_data       # add in the image data
hdu.writeto(new_filename, overwrite=True)

# Open the new file and print data mean
with fits.open(new_filename) as hdu:
    reloaded_data = hdu[0].data

print(np.mean(reloaded_data))
>>>32785.45463174982

When I do hdu.data = calibrated_data, I add to my new FITS file the exact same data I had when printing the mean, so in theory it shouldn't change. Why does this happen? I cannot think of a reason for 2^15 ADUs being uniformly added to my array.



Solution 1:[1]

I can't know for sure without your complete code, but my guess is that it has to do with the fact that certain variables are mutable.

So, what happens inside a function can at times inadvertently change the values of a variable outside of a function.

For a simple example of this see the following:

def mess_up_array(arr):
    arr[0] += 1
    arr[1] *= 2
    arr[2] /= 2
    arr[3] = 0
    arr[4] = arr[0] * 2

    return arr

values = [1, 2, 3, 4, 0]

print(mess_up_array(values))
print(mess_up_array(values))
print(mess_up_array(values))
print(mess_up_array(values))

At first glance we'd see that values is [1, 2, 3, 4, 0] and that the function should take this array and do some calculations on each value in it.

  • Add 1 to 1
  • Multiply 2 by 2
  • Divide 3 by 2
  • Set 4 to 0
  • And set 0 to (2 * 2 =) 4

The return is only printed, not assigned back to values, so arr and values should remain different right?

Looking at the output, this is clearly NOT the case:

[2, 4, 1.5, 0, 4]
[3, 8, 0.75, 0, 6]
[4, 16, 0.375, 0, 8]
[5, 32, 0.1875, 0, 10]

So the question arises, why did [1, 2, 3, 4, 0] give 4 different answers for the same function?

Or in your case, where in your functions do you add 2^15 to image_data, or something that points to image_data?

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 BeRT2me