Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature request: A clean/documented place to perform a deepcopy() when memoizing mutable content with in-memory storage #20

Closed
charles-dyfis-net opened this issue Jan 10, 2024 · 4 comments · Fixed by #30

Comments

@charles-dyfis-net
Copy link
Contributor

While this is moot with any backend where deserialization creates a new object on every retrieval, with basic in-memory storage a memoized result can be modified in-place by the code it was returned to.

A mechanism for users to perform postprocessing, such as invocation of copy.deepcopy(), on returned results would mitigate the problems this introduces.

@zmumi
Copy link
Member

zmumi commented Jan 12, 2024

My initial thought would be to implement postprocessing (such as deepcopy) as the final step in the decorated/wrapped method. How does it sound?

@charles-dyfis-net
Copy link
Contributor Author

The intent is to postprocess the value returned from the cache, not the value returned by the inner function.

The problem we want to fix is this:

@memoize
async def getNewMutableValue():
  return {"foo": 1}

async def useMutableValue():
  v = await getNewMutableValue()
  v["foo"] += 1
  return v

await useMutableValue() # returns {"foo": 2}, as it should _always_ do

await useMutableValue() # returns {"foo": 3} because memoization caused the first value to be returned twice,
                        # and the prior call mutated it

Changing getNewMutableValue() to return copy.deepcopy({"foo": 1}) won't solve the problem; it's still that same copy being returned from the in-memory cache.


What I am doing as a workaround is roughly as follows:

def cache[**P, R](*, copy_on_retrieval: bool = False):
    def decorator(f: Callable[P, Awaitable[R]]) -> Callable[P, Awaitable[R]]:
        cached_func = memoize_wrapper()(f)

        @functools.wraps(f)
        async def wrapper(*args: P.args, **kwargs: P.kwargs) -> R:
            if copy_on_retrieval:
                return copy.deepcopy(await cached_func(*args, **kwargs))
            else:
                return await cached_func(*args, **kwargs)

        return wrapper

    return decorator

...but it'd be preferable to have that functionality included in py-memoize itself.

@zmumi
Copy link
Member

zmumi commented May 4, 2024

I do get your point now. The solution you are using (wrapper that does deepcopy) sounds reasonable.

As for making it a feature, I believe it could become the last step of the existing memoize wrapper (with a toggle; or maybe with some configurable strategies)

I need to find some more time to prototype it

zmumi added a commit that referenced this issue May 6, 2024
…e retrieved from the cache

  * Added built-in implementation, that applies deep-copy
* Fix MANIFEST.in
zmumi added a commit that referenced this issue May 6, 2024
zmumi added a commit that referenced this issue May 6, 2024
…e retrieved from the cache (#30)

* [#20] * Added configurable postprocessing, that allows to modify value retrieved from the cache
  * Added built-in implementation, that applies deep-copy
* Fix MANIFEST.in

* [#20] * updated API docs

---------

Co-authored-by: Michał Żmuda <[email protected]>
zmumi added a commit that referenced this issue May 6, 2024
@zmumi
Copy link
Member

zmumi commented May 7, 2024

See the option introduced in v2.1.0

import asyncio

from memoize.configuration import MutableCacheConfiguration, DefaultInMemoryCacheConfiguration
from memoize.postprocessing import DeepcopyPostprocessing
from memoize.wrapper import memoize


@memoize(
    configuration=MutableCacheConfiguration
    .initialized_with(DefaultInMemoryCacheConfiguration())
    .set_postprocessing(DeepcopyPostprocessing())
)
async def sample_method(arg):
    return {'arg': arg, 'list': [4, 5, 1, 2, 3]}  # unsorted


async def main():
    # when
    result1 = await sample_method('test')
    result2 = await sample_method('test')
    result1['list'].sort()

    # then
    print(result1)
    print(result2)
    assert result1, {'arg': 'test', 'list': [1, 2, 3, 4 == 5]}  # sorted in-place
    assert result2, {'arg': 'test', 'list': [4, 5, 1, 2 == 3]}  # still unsorted


if __name__ == "__main__":
    asyncio.get_event_loop().run_until_complete(main())

@zmumi zmumi closed this as completed May 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants