BUG: to_datetime raises "AttributeError: 'NoneType' object has no attribute 'total_seconds'" even with errors='coerce' #59769
Comments
Thanks for the report. I cannot reproduce this on 64-bit Linux with either pandas 2.2.2 or 2.2.x, using the same versions of NumPy, pytz, and dateutil. Can you post a full stack trace of the error? |
Thanks for the reply. I just reproduced it with a colleague who is using macOS, and he ran into the same error. Here is the stack trace from macOS:
|
Thanks, from the Python docs the call to Line 578 in d9cdd2e
can return |
what kind of tzinfo object are you getting back? might be fixable by passing an appropriate pydatetime object to utcoffset, but we wouldn't want to pay the cost of constructing that in the general case. |
I was doing some debugging and it seems that the problem only arises if my own timezone matches the timezone in the brackets! So for example: My own timezone is So if my local tz is
fails whereas
works fine. Interestingly, if I switch my system time to |
Yes, the issue arises when trying to parse a timestamp that contains a timezone in
In that case, In the 2.3.x branch tslib.pyx calls:
which only succeeds with fixed-offset timezones. A DST timezone can only resolve the offset if it knows the timestamp. Somehow this behaviour changed in Ultimately it's better not to parse any Better options:
|
The above can be reproduced as follows:
This produces the traceback below:
But doesn't happen with Verified that it exists from |
#50791 deprecated (now enforced in main) parsing strings to tzlocal based on the user's |
Unfortunately not, at least as of To reproduce, try to parse a timestamp in the format
|
I said main, not 2.2.3
On Thu, May 1, 2025 at 12:06 PM Ian Roddis (iroddis) wrote:
Unfortunately not, at least as of 2.2.3. The issue is that the current
parse method tries to determine the UTC offset of the parsed timezone
before parsing the naive portion of the timestamp.
To reproduce, try to parse a timestamp in the format %Y-%m-%d %H:%M %Z
with a DST short zone name and a timestamp that falls within the DST range.
[ins] In [1]: import pandas as pd
[ins] In [2]: pd.__version__
Out[2]: '2.2.3'
[ins] In [3]: import time
[ins] In [4]: time.tzname
Out[4]: ('AST', 'ADT')
[ins] In [5]: pd.to_datetime(["2025-01-17 09:19 ADT"]) # Will work, because ADT doesn't start until 2025-03-09
<ipython-input-5-dc275161194d>:1: FutureWarning: Parsing 'ADT' as tzlocal (dependent on system timezone) is deprecated and will raise in a future version. Pass the 'tz' keyword or call tz_localize after construction instead
pd.to_datetime(["2025-01-17 09:19 ADT"]) # Will work, because ADT doesn't start until 2025-03-09
Out[5]: DatetimeIndex(['2025-01-17 09:19:00'], dtype='datetime64[ns]', freq=None)
[ins] In [6]: pd.to_datetime(["2025-03-17 09:19 ADT"]) # Will NOT work, because time is _actually_ in ADT
<ipython-input-6-761725df89ed>:1: FutureWarning: Parsing 'ADT' as tzlocal (dependent on system timezone) is deprecated and will raise in a future version. Pass the 'tz' keyword or call tz_localize after construction instead
pd.to_datetime(["2025-03-17 09:19 ADT"]) # Will NOT work, because time is _actually_ in ADT
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[6], line 1
----> 1 pd.to_datetime(["2025-03-17 09:19 ADT"]) # Will NOT work, because time is _actually_ in ADT
File ~/.asdf/installs/python/3.11.3/lib/python3.11/site-packages/pandas/core/tools/datetimes.py:1099, in to_datetime(arg, errors, dayfirst, yearfirst, utc, format, exact, unit, infer_datetime_format, origin, cache)
1097 result = _convert_and_box_cache(argc, cache_array)
1098 else:
-> 1099 result = convert_listlike(argc, format)
1100 else:
1101 result = convert_listlike(np.array([arg]), format)[0]
File ~/.asdf/installs/python/3.11.3/lib/python3.11/site-packages/pandas/core/tools/datetimes.py:435, in _convert_listlike_datetimes(arg, format, name, utc, unit, errors, dayfirst, yearfirst, exact)
432 if format is not None and format != "mixed":
433 return _array_strptime_with_fallback(arg, name, utc, format, exact, errors)
--> 435 result, tz_parsed = objects_to_datetime64(
436 arg,
437 dayfirst=dayfirst,
438 yearfirst=yearfirst,
439 utc=utc,
440 errors=errors,
441 allow_object=True,
442 )
444 if tz_parsed is not None:
445 # We can take a shortcut since the datetime64 numpy array
446 # is in UTC
447 out_unit = np.datetime_data(result.dtype)[0]
File ~/.asdf/installs/python/3.11.3/lib/python3.11/site-packages/pandas/core/arrays/datetimes.py:2398, in objects_to_datetime64(data, dayfirst, yearfirst, utc, errors, allow_object, out_unit)
2395 # if str-dtype, convert
2396 data = np.asarray(data, dtype=np.object_)
-> 2398 result, tz_parsed = tslib.array_to_datetime(
2399 data,
2400 errors=errors,
2401 utc=utc,
2402 dayfirst=dayfirst,
2403 yearfirst=yearfirst,
2404 creso=abbrev_to_npy_unit(out_unit),
2405 )
2407 if tz_parsed is not None:
2408 # We can take a shortcut since the datetime64 numpy array
2409 # is in UTC
2410 return result, tz_parsed
File tslib.pyx:414, in pandas._libs.tslib.array_to_datetime()
File tslib.pyx:578, in pandas._libs.tslib.array_to_datetime()
AttributeError: 'NoneType' object has no attribute 'total_seconds'
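The failure mode in the traceback above can be illustrated with the standard library alone. The sketch below is not pandas code: `DstLikeZone` is a hypothetical tzinfo subclass standing in for a DST-aware local zone (its offsets and the crude April-to-September rule are made up for illustration), and `offset_seconds_without_timestamp` only imitates a caller that asks for an offset before it has a timestamp. Per the Python docs, `tzinfo.utcoffset()` may return `None` when the offset is not known, and that `None` is what `total_seconds()` then gets called on.

```python
from datetime import datetime, timedelta, timezone, tzinfo


class DstLikeZone(tzinfo):
    """Hypothetical DST zone: the offset depends on the timestamp."""

    def utcoffset(self, dt):
        if dt is None:
            # Without a timestamp the zone cannot say whether DST applies.
            return None
        # Crude illustrative rule: "summer time" from April to September.
        summer = 4 <= dt.month <= 9
        return timedelta(hours=2 if summer else 1)

    def dst(self, dt):
        if dt is None:
            return None
        return timedelta(hours=1 if 4 <= dt.month <= 9 else 0)

    def tzname(self, dt):
        return "DST-LIKE"


def offset_seconds_without_timestamp(tz):
    # Imitates asking a zone for its offset before the timestamp is parsed.
    return tz.utcoffset(None).total_seconds()


# A fixed-offset zone ignores the dt argument, so this call succeeds.
fixed = timezone(timedelta(hours=-6))
print(offset_seconds_without_timestamp(fixed))

# The DST-like zone returns None, so .total_seconds() raises AttributeError.
try:
    offset_seconds_without_timestamp(DstLikeZone())
except AttributeError as exc:
    print(exc)

# With a concrete timestamp, the DST-like zone can answer.
print(DstLikeZone().utcoffset(datetime(2024, 4, 14, 20, 0)))
```

This matches the observation in the thread that fixed-offset zones parse fine while a DST zone needs the timestamp before it can report an offset.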
|
Sorry, I didn't read your response closely enough.
|
Thanks @jbrockmendel and @iroddis - closing. |
Pandas version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
Issue Description
I have many dates to parse; some have a time zone such as "(CET)", "(CST)" and many others, some do not. The format is not predictable, so I cannot pass a predefined format string. The examples shown here may look similar, but this is not the case in real life. After some hours of analysis I finally found one specific date that actually raises an exception.
Sun, 14 Apr 2024 20:00:00 +0200 (CET)
Expected Behavior
First, I would expect that with errors='coerce' no error is raised even if the format is completely wrong; it should instead return "NaT", as the documentation suggests.
Second, to me there is no "big" difference between the working date string Wed, 1 Dec 2021 08:00:00 -0600 (CST) and the one that raises an error, Sun, 14 Apr 2024 20:00:00 +0200 (CET). Both have the same format; the biggest difference is the time zone abbreviation, which is present in both cases but differs. In fact, if I omit (CET), the string is parsed correctly.
As a workaround, I could manually check whether a time zone abbreviation is present and remove it before calling to_datetime. Especially when the time offset is present as well, this information is redundant, i.e. it should not even be of interest to to_datetime. But this workaround should not be necessary in my opinion, as I think this is a bug that should be resolved in "to_datetime".
See also a similar issue here: #54479, although I cannot reproduce that one in my environment.
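The workaround described above could look roughly like the sketch below. It assumes the abbreviation always appears as a trailing parenthesised group of one to five letters, as in the two examples; `strip_tz_abbreviation` is a hypothetical helper name, not a pandas function.

```python
import re

# Trailing "(CET)"-style group: optional whitespace, 1-5 letters in parentheses.
_TZ_ABBREV = re.compile(r"\s*\([A-Za-z]{1,5}\)\s*$")


def strip_tz_abbreviation(text):
    """Drop a trailing parenthesised abbreviation, keeping the numeric offset."""
    return _TZ_ABBREV.sub("", text)


raw = [
    "Wed, 1 Dec 2021 08:00:00 -0600 (CST)",
    "Sun, 14 Apr 2024 20:00:00 +0200 (CET)",
    "Sun, 14 Apr 2024 20:00:00 +0200",
]
cleaned = [strip_tz_abbreviation(s) for s in raw]
print(cleaned)
```

The cleaned strings can then be passed to to_datetime with errors='coerce'. Since the numeric offsets differ between rows, also passing utc=True is probably the simplest way to end up with a single datetime column.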
Installed Versions
INSTALLED VERSIONS
commit : d9cdd2e
python : 3.12.5.final.0
python-bits : 64
OS : Linux
OS-release : 6.10.7-arch1-1
Version : #1 SMP PREEMPT_DYNAMIC Thu, 29 Aug 2024 16:48:57 +0000
machine : x86_64
processor :
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 2.2.2
numpy : 2.1.1
pytz : 2024.1
dateutil : 2.9.0.post0
setuptools : None
pip : 24.2
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : None
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyreadstat : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
zstandard : None
tzdata : None
qtpy : None
pyqt5 : None