Convert 'seconds since October 14, 1582' to Python datetime
I'm trying to convert SPSS timestamps to human-readable timestamps with
data['Completion_date'] = pd.to_datetime(
data['Completion_date']/86400, unit='D',
origin=pd.Timestamp("1582-10-14"))
but get
Traceback (most recent call last):
File "<input>", line 1, in <module>
File "pandas/_libs/tslibs/timestamps.pyx", line 644, in pandas._libs.tslibs.timestamps.Timestamp.__new__
File "pandas/_libs/tslibs/conversion.pyx", line 275, in pandas._libs.tslibs.conversion.convert_to_tsobject
File "pandas/_libs/tslibs/conversion.pyx", line 470, in pandas._libs.tslibs.conversion.convert_str_to_tsobject
File "pandas/_libs/tslibs/conversion.pyx", line 439, in pandas._libs.tslibs.conversion.convert_str_to_tsobject
File "pandas/_libs/tslibs/np_datetime.pyx", line 121, in pandas._libs.tslibs.np_datetime.check_dts_bounds
pandas._libs.tslibs.np_datetime.OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1582-10-14 00:00:00
Example: 13725072000 should convert to 2017-09-18
Dates in SPSS are recorded as seconds since October 14, 1582, the day before the Gregorian calendar took effect.
How else would I do it?
Solution 1:[1]
/* SPSS SYNTAX CODE WITH EXAMPLES TO UNDERSTAND AND USE. */.
DATA LIST LIST (",")/ MIS_FECHAS(DATETIME).
BEGIN DATA
15/OCT/1582 0:00:00 , /* ODD, BUT DATES EARLIER THAN THIS ONE ARE NOT VALID. THE MINIMUM DATE VALUE IN SPSS IS 86400 SECONDS. */.
01/JAN/1970 0:00:00 ,
01/JAN/2017 0:00:00 ,
03/NOV/2021 1:35:00 ,
END DATA.
COMPUTE NUMERIC_SPSS =MIS_FECHAS*1.
EXECUTE.
COMPUTE NUMERIC_UNIX_SEC = NUMERIC_SPSS - 12219379200.
EXECUTE.
COMPUTE NUMERIC_UNIX_MILISEC = (NUMERIC_SPSS - 12219379200) * 1000.
EXECUTE.
COMPUTE NUMERIC_SPSS_FROM_UNIX_SECS = NUMERIC_UNIX_SEC + 12219379200.
EXECUTE.
COMPUTE NUMERIC_SPSS_FROM_UNIX_MILISECS = (NUMERIC_UNIX_MILISEC/1000) + 12219379200.
EXECUTE.
/**********/.
If you want to use an SPSS date in Python, you first have to convert it to Unix format.
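The size of that offset can be checked with the standard library alone. This is only a minimal sketch; the names SPSS_EPOCH, UNIX_EPOCH and SPSS_TO_UNIX_OFFSET are made up for illustration and are not part of any library:

```python
from datetime import datetime, timedelta

# Python's datetime uses the proleptic Gregorian calendar,
# so 1582-10-14 is representable here (unlike in a nanosecond pandas Timestamp).
SPSS_EPOCH = datetime(1582, 10, 14)
UNIX_EPOCH = datetime(1970, 1, 1)

# Seconds between the SPSS origin and the Unix epoch.
SPSS_TO_UNIX_OFFSET = int((UNIX_EPOCH - SPSS_EPOCH).total_seconds())
print(SPSS_TO_UNIX_OFFSET)  # 12219379200

# The example value from the question, converted without pandas.
spss_seconds = 13725072000
converted = SPSS_EPOCH + timedelta(seconds=spss_seconds)
print(converted)  # 2017-09-18 00:00:00
```

This reproduces the constant 12219379200 used in the SPSS syntax above and confirms the expected date for the example value.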
I think this change to your Python code should work (not tested):
data['Completion_date'] = pd.to_datetime(
    data['Completion_date'] - 12219379200, unit='s')
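I'm not an expert in Python, but the idea is simply to subtract 12219379200 from the SPSS numeric date. That is the number of seconds from 1582-10-14 (the SPSS origin) to 1970-01-01 (the Unix epoch, which pandas uses as its default origin). The result is a standard Unix timestamp in seconds, so no custom origin is needed and the out-of-bounds 1582 date is never constructed.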
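Put together, the subtraction could look like the sketch below. The helper name spss_to_datetime and the sample DataFrame are assumptions made for illustration; only pd.to_datetime with unit='s' is an actual pandas call:

```python
import pandas as pd

# Seconds from 1582-10-14 (SPSS origin) to 1970-01-01 (Unix epoch).
SPSS_TO_UNIX_OFFSET = 12219379200


def spss_to_datetime(seconds: pd.Series) -> pd.Series:
    """Convert SPSS date values (seconds since 1582-10-14) to pandas datetimes.

    Hypothetical helper: shifts the values to Unix seconds first, so pandas
    never has to represent the year 1582.
    """
    return pd.to_datetime(seconds - SPSS_TO_UNIX_OFFSET, unit="s")


# Sample data: the example value from the question.
data = pd.DataFrame({"Completion_date": [13725072000]})
data["Completion_date"] = spss_to_datetime(data["Completion_date"])
print(data)
```

Values that fall before roughly 1677 after the shift would still be out of range for nanosecond timestamps, so this only helps for dates pandas can represent.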
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | marc_s |
