Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Datetime serialization and deserialization fixed #1515

Merged
merged 14 commits into from
May 31, 2022

Conversation

dobosevych
Copy link
Contributor

The purpose of this PR is to fix datetime serialization issue.
expires, as an argument of apply_async is converted to datetime that is then converted to str in serializer, but on deserialization it isn't rollback to datetime (what this PR) actually do.
Related issue: celery/celery#7091

@dobosevych
Copy link
Contributor Author

Fixed tests, but this can lead to backward compatibility issues

@auvipy auvipy requested review from auvipy and matusvalo April 11, 2022 11:20
@auvipy
Copy link
Member

auvipy commented Apr 11, 2022

we need to be careful about backward compatibility

@dobosevych
Copy link
Contributor Author

I fixed lint failures

Copy link
Member

@auvipy auvipy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rebase please

@auvipy
Copy link
Member

auvipy commented Apr 12, 2022

lint is failing

@auvipy
Copy link
Member

auvipy commented Apr 13, 2022

Fixed tests, but this can lead to backward compatibility issues

I want to know more on this

@dobosevych
Copy link
Contributor Author

lint is failing

There was an issue in sphinx-build. I fixed it

@dobosevych
Copy link
Contributor Author

Fixed tests, but this can lead to backward compatibility issues

I want to know more on this

before: datetime -> serialization -> str -> deserialization -> str
after: datetime -> serialization -> str -> deserialization -> datetime

@auvipy
Copy link
Member

auvipy commented Apr 14, 2022

i think the previous behaviour was buggy/ incomplete. this pr make the serialization proper. btw we can mention that on docs/release notes

Copy link
Member

@matusvalo matusvalo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why ISO 8601 parsing is implemented manually? Cannot be library function used instead:

We need to parse return value of isoformat function so stdlib fromisoformat function should be fine right?

kombu/utils/json.py Outdated Show resolved Hide resolved
kombu/utils/json.py Show resolved Hide resolved
kombu/utils/json.py Outdated Show resolved Hide resolved
kombu/utils/iso8601.py Outdated Show resolved Hide resolved
docs/reference/kombu.utils.iso8601.rst Outdated Show resolved Hide resolved
auvipy and others added 2 commits April 25, 2022 15:50
Co-authored-by: Omer Katz <omer.katz@omerkatz.com>
@varneyo
Copy link

varneyo commented May 12, 2022

Is there any update on this?

@auvipy auvipy added this to the 5.3 milestone May 13, 2022
…etime-serialization

# Conflicts:
#	kombu/utils/iso8601.py
#	kombu/utils/json.py
@auvipy auvipy merged commit c482975 into celery:master May 31, 2022
mvaled added a commit to mvaled/kombu that referenced this pull request Sep 20, 2022
Otherwise Celery jobs start to get `datetime` in place of `date` and that
could lead to errors.

See celery/celery#7754, related PR celery#1515.
mvaled added a commit to mvaled/kombu that referenced this pull request Sep 20, 2022
Otherwise Celery jobs start to get `datetime` in place of `date` and that
could lead to errors.

See celery/celery#7754, related PR celery#1515.
auvipy pushed a commit that referenced this pull request Sep 21, 2022
Otherwise Celery jobs start to get `datetime` in place of `date` and that
could lead to errors.

See celery/celery#7754, related PR #1515.
@kinoute
Copy link

kinoute commented Oct 20, 2022

Are we sure this really fix the deserialisation issue? Using Celery v5.3.0b1 and kombu 5.3.0b2:

app.send_task(...., expires=30)

I added logging around the expires_s variable on Celery file celery/app/base.py

        if expires is not None:
            print('expires', expires)
            print('expires type', type(expires))
            if isinstance(expires, datetime):
                print('datetime TYPE, converting')
                expires_s = (maybe_make_aware(
                    expires) - self.now()).total_seconds()
                print("expires_s", expires_s)
            else:
                expires_s = expires

            if expires_s < 0:

It results in:

[2022-10-20 13:23:43,836: WARNING/MainProcess] expires
[2022-10-20 13:23:43,836: WARNING/MainProcess]
[2022-10-20 13:23:43,837: WARNING/MainProcess] 2022-10-20T15:24:08.278757+02:00
[2022-10-20 13:23:43,837: WARNING/MainProcess] expires type
[2022-10-20 13:23:43,837: WARNING/MainProcess]
[2022-10-20 13:23:43,838: WARNING/MainProcess] <class 'str'>

Which leads, because of the if expires_s < 0 logic, to:

TypeError: '<' not supported between instances of 'str' and 'int'

@kinoute
Copy link

kinoute commented Oct 20, 2022

Same issue while passing a datetime object directly:

app.send_task(..., expires=datetime.now() + timedelta(seconds=30))

@varneyo
Copy link

varneyo commented Oct 27, 2022

Still getting this also, has this been fixed or not?

@auvipy
Copy link
Member

auvipy commented Oct 27, 2022

can anyone work for a fix?

@varneyo
Copy link

varneyo commented Oct 28, 2022

Hi sorry I dont know really anything about the codebase other then looking now quickly. Probably obvious, but the error message is suggesting that the expires value is being passed in as an str (most likely isoformat), therefore it skips the if statements which parse a datetime to an int. Im guessing this is due to a serialisation effect to json for storage/backend?

So something like the below should work:

        if expires is not None:
            if isinstance(expires, datetime):
                expires_s = (maybe_make_aware(
                    expires) - self.now()).total_seconds()
            elif isinstance(expires, (int,float)):
                expires_s = expires
            elif isinstance(expires, str):
                # first try parsing isoformat
                try:
                    expires_s = (maybe_make_aware(
                        datetime.fromisoformat(expires)) - self.now()).total_seconds()
                except ValueError as e:
                    raise ValueError(
                        "Could not parse the expires parameter. Expires must be of type datetime, float, int or datetime str isoformat")
            else:
                raise Exception("Expires parameter must be of type datetime, float, int or datetime str in isoformat")


            if expires_s < 0:
                logger.warning(
                    f"{task_id} has an expiration date in the past ({-expires_s}s ago).\n"
                    "We assume this is intended and so we have set the "
                    "expiration date to 0 instead.\n"
                    "According to RabbitMQ's documentation:\n"
                    "\"Setting the TTL to 0 causes messages to be expired upon "
                    "reaching a queue unless they can be delivered to a "
                    "consumer immediately.\"\n"
                    "If this was unintended, please check the code which "
                    "published this task."
                )
                expires_s = 0

            options["expiration"] = expires_s

@varneyo
Copy link

varneyo commented Jan 16, 2023

Did anyone pick this up at all? Has there been any release that has fixed this yet?

@thedrow
Copy link
Member

thedrow commented Jan 29, 2023

This is not released yet.

@varneyo
Copy link

varneyo commented Apr 19, 2023

Did this make its way into 5.2.7?

@simon-weber
Copy link

We have a bunch of celery tasks that receive dates/datetimes and depend on the old behavior, so this fix was an unwelcome surprise.

Rather than update our tasks or sending code we added a serialization hook to preserve the old behavior. Here it is for anybody else who wants to avoid code changes:

import datetime
from kombu.utils.json import register_type

def legacy_celery_serialize_datelike(datetime_or_date):
    if not isinstance(datetime_or_date, datetime.datetime):
        datetime_or_date = datetime.datetime(
            datetime_or_date.year,
            datetime_or_date.month,
            datetime_or_date.day,
            0,
            0,
            0,
            0,
        )
    r = datetime_or_date.isoformat()
    if r.endswith("+00:00"):
        r = r[:-6] + "Z"
    return r


register_type(
    datetime.datetime, "datetime", legacy_celery_serialize_datelike, lambda d: d
)
register_type(
    datetime.date, "date", legacy_celery_serialize_datelike, lambda d: d,
)

samdoran added a commit to project-koku/koku that referenced this pull request Jul 6, 2023
Kombu 5.3 fixed bugs that incorrectly cast UUID and datetime objects to
strings. We are expected those to be strings in many places, so going to
constrain this for now.

celery/kombu#1575
celery/kombu#1515
https://github.com/celery/kombu/releases/tag/v5.3.0
samdoran added a commit to project-koku/koku that referenced this pull request Jul 6, 2023
Kombu 5.3 fixed bugs that incorrectly cast UUID and datetime objects to
strings. We expect those to be strings in many places.

celery/kombu#1575
celery/kombu#1515
https://github.com/celery/kombu/releases/tag/v5.3.0
@youtux
Copy link

youtux commented Jul 10, 2023

On top of the issue about decoding an object that is a datetime now instead of a str, this change introduces another backwards incompatible change, in fact it even break workers running on an old celery version (5.2.4) when a message is sent by a worker running 5.3.1; the old workers cannot decode the arguments for the new tasks if there was a datetime object!

This should have at least been mentioned in the backwards incompatible changes.

@auvipy
Copy link
Member

auvipy commented Jul 11, 2023

it actually fixed the old buggy implementation.

@LincolnPuzey
Copy link

LincolnPuzey commented Jul 21, 2023

it actually fixed the old buggy implementation.

The 'old buggy implementation' was behavior that projects using celery with json serialization relied on.
As I understand it this change means that when a celery task is published with a datetime argument, the worker receives a datetime object rather than an ISO string. e.g.

@app.task()
def some_task(foo_datetime):
    print(type(foo_datetime), repr(foo_datetime))

>>> some_task.delay(foo_datetime=datetime(2023, 1, 1, 0, 0, 0))

# kombu<5.3.0 output
<class 'str'>
'2023-01-01T00:00:00'

# kombu>=5.3.0 output
<class 'datetime.datetime'>
datetime(2023, 1, 1, 0, 0, 0)

This change in behavior has plenty of potential to break projects, so should've been called out as a breaking change.

I'm not against making breaking changes to fix bugs or make improvements, but they need to be called out as such.

Edit: I can see there is further discussion here: #1749

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

10 participants