forked from ascoders/weekly
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
Showing
1 changed file
with
308 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,308 @@ | ||
## 1 引言 | ||
|
||
函数缓存是重要概念,本质上就是用空间(缓存存储)换时间(跳过计算过程)。 | ||
|
||
对于无副作用的纯函数,在合适的场景使用函数缓存是非常必要的,让我们跟着 https://whatthefork.is/memoization 这篇文章深入理解一下函数缓存吧! | ||
|
||
## 2 概述 | ||
|
||
假设又一个获取天气的函数 `getChanceOfRain`,每次调用都要花 100ms 计算: | ||
|
||
```jsx | ||
import { getChanceOfRain } from "magic-weather-calculator"; | ||
function showWeatherReport() { | ||
let result = getChanceOfRain(); // Let the magic happen | ||
console.log("The chance of rain tomorrow is:", result); | ||
} | ||
|
||
showWeatherReport(); // (!) Triggers the calculation | ||
showWeatherReport(); // (!) Triggers the calculation | ||
showWeatherReport(); // (!) Triggers the calculation | ||
``` | ||
|
||
很显然这样太浪费计算资源了,当已经计算过一次天气后,就没有必要再算一次了,我们期望的是后续调用可以直接拿上一次结果的缓存,这样可以节省大量计算。因此我们可以做一个 `memoizedGetChanceOfRain` 函数缓存计算结果: | ||
|
||
```jsx | ||
import { getChanceOfRain } from "magic-weather-calculator"; | ||
let isCalculated = false; | ||
let lastResult; | ||
// We added this function! | ||
function memoizedGetChanceOfRain() { | ||
if (isCalculated) { | ||
// No need to calculate it again. | ||
return lastResult; | ||
} | ||
// Gotta calculate it for the first time. | ||
let result = getChanceOfRain(); | ||
// Remember it for the next time. | ||
lastResult = result; | ||
isCalculated = true; | ||
return result; | ||
} | ||
function showWeatherReport() { | ||
// Use the memoized function instead of the original function. | ||
let result = memoizedGetChanceOfRain(); | ||
console.log("The chance of rain tomorrow is:", result); | ||
} | ||
``` | ||
|
||
在每次调用时判断优先用缓存,如果没有缓存则调用原始函数并记录缓存。这样当我们多次调用时,除了第一次之外都会立即从缓存中返回结果: | ||
|
||
```jsx | ||
showWeatherReport(); // (!) Triggers the calculation | ||
showWeatherReport(); // Uses the calculated result | ||
showWeatherReport(); // Uses the calculated result | ||
showWeatherReport(); // Uses the calculated result | ||
``` | ||
|
||
然而对于有参数的场景就不适用了,因为缓存并没有考虑参数: | ||
|
||
```jsx | ||
function showWeatherReport(city) { | ||
let result = getChanceOfRain(city); // Pass the city | ||
console.log("The chance of rain tomorrow is:", result); | ||
} | ||
|
||
showWeatherReport("Tokyo"); // (!) Triggers the calculation | ||
showWeatherReport("London"); // Uses the calculated answer | ||
``` | ||
|
||
由于参数可能性很多,所以有三种解决方案: | ||
|
||
### 1. 仅缓存最后一次结果 | ||
|
||
仅缓存最后一次结果是最节省存储空间的,而且不会有计算错误,但带来的问题就是当参数变化时缓存会立即失效: | ||
|
||
```jsx | ||
import { getChanceOfRain } from "magic-weather-calculator"; | ||
let lastCity; | ||
let lastResult; | ||
function memoizedGetChanceOfRain(city) { | ||
if (city === lastCity) { | ||
// Notice this check! | ||
// Same parameters, so we can reuse the last result. | ||
return lastResult; | ||
} | ||
// Either we're called for the first time, | ||
// or we're called with different parameters. | ||
// We have to perform the calculation. | ||
let result = getChanceOfRain(city); | ||
// Remember both the parameters and the result. | ||
lastCity = city; | ||
lastResult = result; | ||
return result; | ||
} | ||
function showWeatherReport(city) { | ||
// Pass the parameters to the memoized function. | ||
let result = memoizedGetChanceOfRain(city); | ||
console.log("The chance of rain tomorrow is:", result); | ||
} | ||
|
||
showWeatherReport("Tokyo"); // (!) Triggers the calculation | ||
showWeatherReport("Tokyo"); // Uses the calculated result | ||
showWeatherReport("Tokyo"); // Uses the calculated result | ||
showWeatherReport("London"); // (!) Triggers the calculation | ||
showWeatherReport("London"); // Uses the calculated result | ||
``` | ||
|
||
在极端情况下等同于没有缓存: | ||
|
||
```jsx | ||
showWeatherReport("Tokyo"); // (!) Triggers the calculation | ||
showWeatherReport("London"); // (!) Triggers the calculation | ||
showWeatherReport("Tokyo"); // (!) Triggers the calculation | ||
showWeatherReport("London"); // (!) Triggers the calculation | ||
showWeatherReport("Tokyo"); // (!) Triggers the calculation | ||
``` | ||
|
||
### 2. 缓存所有结果 | ||
|
||
第二种方案是缓存所有结果,使用 Map 存储缓存即可: | ||
|
||
```jsx | ||
// Remember the last result *for every city*. | ||
let resultsPerCity = new Map(); | ||
function memoizedGetChanceOfRain(city) { | ||
if (resultsPerCity.has(city)) { | ||
// We already have a result for this city. | ||
return resultsPerCity.get(city); | ||
} | ||
// We're called for the first time for this city. | ||
let result = getChanceOfRain(city); | ||
// Remember the result for this city. | ||
resultsPerCity.set(city, result); | ||
return result; | ||
} | ||
function showWeatherReport(city) { | ||
// Pass the parameters to the memoized function. | ||
let result = memoizedGetChanceOfRain(city); | ||
console.log("The chance of rain tomorrow is:", result); | ||
} | ||
|
||
showWeatherReport("Tokyo"); // (!) Triggers the calculation | ||
showWeatherReport("London"); // (!) Triggers the calculation | ||
showWeatherReport("Tokyo"); // Uses the calculated result | ||
showWeatherReport("London"); // Uses the calculated result | ||
showWeatherReport("Tokyo"); // Uses the calculated result | ||
showWeatherReport("Paris"); // (!) Triggers the calculation | ||
``` | ||
|
||
这么做带来的弊端就是内存溢出,当可能参数过多时会导致内存无限制的上涨,最坏的情况就是触发浏览器限制或者页面崩溃。 | ||
|
||
### 3. 其他缓存策略 | ||
|
||
介于只缓存最后一项与缓存所有项之间还有这其他选择,比如 LRU(least recently used)只保留最小化最近使用的缓存,或者为了方便浏览器回收,使用 WeakMap 替代 Map。 | ||
|
||
最后提到了函数缓存的一个坑,必须是纯函数。比如下面的 CASE: | ||
|
||
```jsx | ||
// Inside the magical npm package | ||
function getChanceOfRain() { | ||
// Show the input box! | ||
let city = prompt("Where do you live?"); | ||
// ... calculation ... | ||
} | ||
// Our code | ||
function showWeatherReport() { | ||
let result = getChanceOfRain(); | ||
console.log("The chance of rain tomorrow is:", result); | ||
} | ||
``` | ||
|
||
`getChanceOfRain` 每次会由用户输入一些数据返回结果,导致缓存错误,原因是 “函数入参一部分由用户输入” 就是副作用,我们不能对有副作用的函数进行缓存。 | ||
|
||
这有时候也是拆分函数的意义,将一个有副作用函数的无副作用部分分解出来,这样就能局部做函数缓存了: | ||
|
||
```jsx | ||
// If this function only calculates things, | ||
// we would call it "pure". | ||
// It is safe to memoize this function. | ||
function getChanceOfRain(city) { | ||
// ... calculation ... | ||
} | ||
// This function is "impure" because | ||
// it shows a prompt to the user. | ||
function showWeatherReport() { | ||
// The prompt is now here | ||
let city = prompt("Where do you live?"); | ||
let result = getChanceOfRain(city); | ||
console.log("The chance of rain tomorrow is:", result); | ||
} | ||
``` | ||
|
||
最后,我们可以将缓存函数抽象为高阶函数: | ||
|
||
```jsx | ||
function memoize(fn) { | ||
let isCalculated = false; | ||
let lastResult; | ||
return function memoizedFn() { | ||
// Return the generated function! | ||
if (isCalculated) { | ||
return lastResult; | ||
} | ||
let result = fn(); | ||
lastResult = result; | ||
isCalculated = true; | ||
return result; | ||
}; | ||
} | ||
``` | ||
|
||
这样生成新的缓存函数就方便啦: | ||
|
||
```jsx | ||
let memoizedGetChanceOfRain = memoize(getChanceOfRain); | ||
let memoizedGetNextEarthquake = memoize(getNextEarthquake); | ||
let memoizedGetCosmicRaysProbability = memoize(getCosmicRaysProbability); | ||
``` | ||
|
||
`isCalculated` 与 `lastResult` 都存储在 `memoize` 函数生成的闭包内,外部无法访问。 | ||
|
||
## 3 精读 | ||
|
||
### 通用高阶函数实现函数缓存 | ||
|
||
原文的例子还是比较简单,没有考虑函数多个参数如何处理,下面我们分析一下 Lodash `memoize` 函数源码: | ||
|
||
```jsx | ||
function memoize(func, resolver) { | ||
if ( | ||
typeof func != "function" || | ||
(resolver != null && typeof resolver != "function") | ||
) { | ||
throw new TypeError(FUNC_ERROR_TEXT); | ||
} | ||
var memoized = function () { | ||
var args = arguments, | ||
key = resolver ? resolver.apply(this, args) : args[0], | ||
cache = memoized.cache; | ||
|
||
if (cache.has(key)) { | ||
return cache.get(key); | ||
} | ||
var result = func.apply(this, args); | ||
memoized.cache = cache.set(key, result) || cache; | ||
return result; | ||
}; | ||
memoized.cache = new (memoize.Cache || MapCache)(); | ||
return memoized; | ||
} | ||
``` | ||
|
||
原文有提到缓存策略多种多样,而 Lodash 将缓存策略简化为 key 交给用户自己管理,看这段代码: | ||
|
||
```jsx | ||
key = resolver ? resolver.apply(this, args) : args[0]; | ||
``` | ||
|
||
也就是缓存的 key 默认是执行函数时第一个参数,也可以通过 `resolver` 拿到参数处理成新的缓存 key。 | ||
|
||
在执行函数时也传入了参数 `func.apply(this, args)`。 | ||
|
||
最后 `cache` 也不再使用默认的 Map,而是允许用户自定义 `lodash.memoize.Cache` 自行设置,比如设置为 WeakMap: | ||
|
||
```jsx | ||
_.memoize.Cache = WeakMap; | ||
``` | ||
|
||
### 什么时候不适合用缓存 | ||
|
||
以下两种情况不适合用缓存: | ||
|
||
1. 不经常执行的函数。 | ||
2. 本身执行速度较快的函数。 | ||
|
||
对于不经常执行的函数,本身就不需要利用缓存提升执行效率,而缓存反而会长期占用内存。对于本身执行速度较快的函数,其实大部分简单计算速度都很快,使用缓存后对速度没有明显的提升,同时如果计算结果比较大,反而会占用存储资源。 | ||
|
||
对于引用的变化尤其重要,比如如下例子: | ||
|
||
```jsx | ||
function addName(obj, name){ | ||
return { | ||
...obj, | ||
name: | ||
} | ||
} | ||
``` | ||
|
||
为 `obj` 添加一个 key,本身执行速度是非常快的,但添加缓存后会带来两个坏处: | ||
|
||
1. 如果 `obj` 非常大,会在闭包存储完整 `obj` 结构,内存占用加倍。 | ||
2. 如果 `obj` 通过 mutable 方式修改了,则普通缓存函数还会返回原先结果(因为对象引用没有变),造成错误。 | ||
|
||
如果要强行进行对象深对比,虽然会避免出现边界问题,但性能反而会大幅下降。 | ||
|
||
## 4 总结 | ||
|
||
函数缓存非常有用,但并不是所有场景都适用,因此千万不要极端的将所有函数都添加缓存,仅限于计算耗时、可能重复利用多次,且是纯函数的。 | ||
|
||
> 讨论地址是:[精读《函数缓存》· Issue #261 · dt-fe/weekly](https://github.com/dt-fe/weekly/issues/261) | ||
**如果你想参与讨论,请 [点击这里](https://github.com/dt-fe/weekly),每周都有新的主题,周末或周一发布。前端精读 - 帮你筛选靠谱的内容。** | ||
|
||
> 关注 **前端精读微信公众号** | ||
<img width=200 src="https://img.alicdn.com/tfs/TB165W0MCzqK1RjSZFLXXcn2XXa-258-258.jpg"> | ||
|
||
> 版权声明:自由转载-非商用-非衍生-保持署名([创意共享 3.0 许可证](https://creativecommons.org/licenses/by-nc-nd/3.0/deed.zh)) |